I would probably just use itertools.islice. Using islice over an iterable like a file handle means the whole file is never read into memory, and the first 4003 lines are read and discarded as quickly as possible. You could even collect the two lines you need into a list pretty cheaply (assuming the lines themselves aren't very long). Then you can exit the with block, closing the file handle.
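from itertools import islice

with open('afile') as f:
    # islice is zero-based: this keeps the lines at indices 4003 and 4004
    lines = list(islice(f, 4003, 4005))
# the file is closed here; only the two selected lines remain in memory
do_something_with(lines)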
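If you think in 1-based line numbers, the off-by-one between line numbers and islice indices is easy to get wrong. Here is a minimal sketch of a wrapper that does the shift for you; `read_line_range` and its parameters are names I made up for illustration, not part of itertools.

```python
from itertools import islice


def read_line_range(path, first, last):
    """Return lines first..last (1-based, inclusive) as a list of strings."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    with open(path) as f:
        # islice is zero-based with an exclusive stop, so only the start
        # needs shifting: 1-based lines 3..4 are indices 2..3, i.e. islice(f, 2, 4).
        return list(islice(f, first - 1, last))
```

Asking for a range that runs past the end of the file just returns the lines that exist, because islice stops quietly when the file is exhausted.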
Update
But holy cow is linecache faster for multiple accesses. I created a million-line file to compare islice and linecache and linecache blew it away.
>>> timeit("x=islice(open('afile'), 4003, 4005); print next(x) + next(x)", 'from itertools import islice', number=1)
4003
4004
0.00028586387634277344
>>> timeit("print getline('afile', 4003) + getline('afile', 4004)", 'from linecache import getline', number=1)
4002
4003
2.193450927734375e-05
>>> timeit("getline('afile', 4003) + getline('afile', 4004)", 'from linecache import getline', number=10**5)
0.14125394821166992
>>> timeit("''.join(islice(open('afile'), 4003, 4005))", 'from itertools import islice', number=10**5)
14.732316970825195
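Note that in the first two timings islice printed 4003 and 4004 while getline printed 4002 and 4003. That is consistent with islice counting from 0 and linecache.getline counting from 1. A small sketch of fetching the same two lines both ways (the function name `same_lines_both_ways` is hypothetical, just for this example):

```python
import linecache
from itertools import islice


def same_lines_both_ways(path, lineno):
    """Fetch line `lineno` (1-based) and the line after it with both approaches."""
    with open(path) as f:
        # islice counts from 0, so 1-based line n sits at index n - 1.
        via_islice = list(islice(f, lineno - 1, lineno + 1))
    # getline counts from 1 and returns '' for a line that does not exist.
    via_linecache = [
        linecache.getline(path, lineno),
        linecache.getline(path, lineno + 1),
    ]
    # linecache keeps every line of the file cached; release it when done.
    linecache.clearcache()
    return via_islice, via_linecache
```

The clearcache() call is optional. I included it only to show how to give the memory back once you no longer need fast repeated lookups.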
Constantly re-importing and re-reading the file:
This is not a practical test, but even when linecache is re-imported on every iteration it is only about a second slower than islice.
>>> timeit("from linecache import getline; getline('afile', 4003) + getline('afile', 4004)", number=10**5)
15.613967180252075
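The session above is Python 2 (it uses the print statement). If you want to repeat the comparison on Python 3, something like the following should work as a rough harness. The helper names `time_islice` and `time_linecache` are my own, both take a 1-based line number so they fetch the same two lines, and your numbers will differ from mine.

```python
import linecache
import timeit
from itertools import islice


def time_islice(path, lineno, number):
    """Seconds spent reopening the file and slicing two lines, `number` times."""
    def run():
        with open(path) as f:
            return "".join(islice(f, lineno - 1, lineno + 1))
    return timeit.timeit(run, number=number)


def time_linecache(path, lineno, number):
    """Seconds spent on two getline calls, `number` times, starting cold."""
    linecache.clearcache()  # the first call pays for reading the whole file
    def run():
        return linecache.getline(path, lineno) + linecache.getline(path, lineno + 1)
    return timeit.timeit(run, number=number)
```

For example, time_islice('afile', 4003, 10**5) and time_linecache('afile', 4003, 10**5) correspond roughly to the two 10**5 runs above.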
Conclusion
Yes, linecache is faster than islice in every case except constantly re-creating the linecache, but who does that? For the likely scenarios (reading only a few lines, once, and reading many lines, once) linecache is faster and has a terse syntax, but the islice syntax is quite clean and fast as well, and it never reads the whole file into memory, whereas linecache reads and caches every line of the file. In a RAM-tight environment, the islice solution may be the right choice. For very high speed requirements, linecache may be the better choice. Practically, though, in most environments both times are small enough that it almost doesn't matter.