python - Memory-efficient way to iterate over part of a large file -
I avoid reading files like this:

with open(file) as f:
    list_of_lines = f.readlines()
and use this type of code instead:

f = open(file)
for line in f:
    # do something
unless I have to iterate over only a few lines in the file (and I know which lines they are), in which case I think it is easier to take slices of list_of_lines. This has now come back to bite me: I have a huge file (reading it into memory is not possible) and I don't need to iterate over all of the lines, only a few of them. I have code that finds the first line I need and works out how many lines after it I have to edit. I just have no idea how to write that loop.
n = ...      # grep the number of lines
start = ...  # pattern match the start line
f = open('big_file')
# some loop on f from start to start + n
# edit the lines
Edit: the title may have led to debate rather than an answer.
If I understand your question correctly, the problem you're encountering is that storing all the lines of text in a list and then taking a slice of it uses too much memory. What you want is to read the file line-by-line, while ignoring all but a certain set of lines (say, lines [17, 34) for example).
Try using enumerate to keep track of which line number you're on as you iterate through the file. Here is a generator-based approach that uses yield to output the interesting lines one at a time:
def read_only_lines(f, start, finish):
    for ii, line in enumerate(f):
        if ii >= start and ii < finish:
            yield line
        elif ii >= finish:
            return

f = open("big text file.txt", "r")
for line in read_only_lines(f, 17, 34):
    print line
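In the question the start line is not known in advance but found by a pattern match. Since a file object is itself an iterator, one way to combine the two is to scan for the match first and then hand the rest of the file to read_only_lines with offsets relative to the match. A minimal sketch, where the pattern and n are placeholders for your own logic:

import re

pattern = re.compile(r"some start marker")   # hypothetical pattern for the start line
n = 10                                       # placeholder: number of lines to take

f = open("big_file")
for line in f:
    if pattern.search(line):
        break                                # f is now positioned just past the matching line
for line in read_only_lines(f, 0, n):        # the n lines that follow the match
    print line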
This read_only_lines function reimplements itertools.islice from the standard library, which you can use directly for a more compact implementation:
from itertools import islice

for line in islice(f, 17, 34):
    print line
If you want to capture the lines of interest in a list rather than a generator, just cast them to a list:
from itertools import islice

lines_of_interest = list(islice(f, 17, 34))
do_something_awesome(lines_of_interest)
do_something_else(lines_of_interest)
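Since the original goal is to edit a handful of lines in a file too large to hold in memory, the same islice idea can also stream the edited result to a new file, copying the untouched lines through unchanged. A minimal sketch, where edit_line, start and n are placeholders for your own edit and the values found by your pattern match:

from itertools import islice

def edit_line(line):
    return line.upper()                      # hypothetical edit

start, n = 17, 10                            # placeholders: found by pattern match / grep
with open("big_file") as src, open("big_file.edited", "w") as dst:
    dst.writelines(islice(src, start))                           # lines before the block, unchanged
    dst.writelines(edit_line(line) for line in islice(src, n))   # the n lines to edit
    dst.writelines(src)                                          # the rest, unchanged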
python iteration large-files