Saturday, 15 September 2012

python - Memory-efficient way to iterate over part of a large file -




I avoid reading files like this:

with open(file) as f:
    list_of_lines = f.readlines()

and use this type of code instead:

f = open(file)
for line in f:
    # do something with each line
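
Equivalently, the loop can be wrapped in a with block so the file is closed automatically (do_something here is just a placeholder for whatever I do with each line):

with open(file) as f:
    for line in f:
        # only the current line is held in memory
        do_something(line)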

unless I have to iterate over only a few lines in the file (and I know which lines they are), in which case I think it is easier to take slices of list_of_lines. This has now come back to bite me. I have a huge file (reading it into memory is not possible) and I don't need to iterate over all of its lines, only a few of them. I have code completed that finds the first line I need and determines how many lines after it need editing. I just don't have any idea how to write that loop.

n = ...      # grep the number of lines
start = ...  # pattern match the start line
f = open('big_file')
# some loop over f from start to start + n
#     edit lines

Edit: the title may have led to debate rather than an answer.

If I understand the question correctly, the problem you're encountering is that storing all the lines of text in a list and then taking a slice uses too much memory. What you want is to read the file line by line, while ignoring all but a certain set of lines (say, lines [17, 34) for example).

Try using enumerate to keep track of the line number you're on as you iterate through the file. Here is a generator-based approach that uses yield to output the interesting lines one at a time:

def read_only_lines(f, start, finish):
    for ii, line in enumerate(f):
        if ii >= start and ii < finish:
            yield line
        elif ii >= finish:
            return

f = open("big text file.txt", "r")
for line in read_only_lines(f, 17, 34):
    print line

This read_only_lines function essentially reimplements itertools.islice from the standard library, which you could use to make the implementation more compact:

from itertools import islice
for line in islice(f, 17, 34):
    print line
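
Note that islice does not seek within the file; it simply reads and discards the lines before the start index, so memory use stays constant even for a very large file. A small usage sketch, reusing the same file name as above:

from itertools import islice

with open("big text file.txt", "r") as f:
    # lines 0-16 are read and thrown away lazily; lines 17-33 are yielded
    for line in islice(f, 17, 34):
        print(line)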

If you want to capture the lines of interest in a list rather than a generator, just cast them to a list:

from itertools import islice
lines_of_interest = list(islice(f, 17, 34))
do_something_awesome(lines_of_interest)
do_something_else(lines_of_interest)
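
Tying this back to the original question, one rough sketch of finding the start line by a pattern match and then processing the n lines that follow it, in a single pass, could look like the following; the pattern, file name, and process_line helper are assumptions for illustration only:

from itertools import islice

def lines_after_match(path, pattern, n):
    # stream the file once: find the first line containing the pattern,
    # then yield it along with the next n lines
    with open(path) as f:
        for line in f:
            if pattern in line:
                yield line
                for following in islice(f, n):
                    yield following
                return

for line in lines_after_match("big_file", "some pattern", 5):
    process_line(line)  # placeholder for whatever edit is needed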

python iteration large-files
