String Format Conversion
Jeff Shannon
jeff at ccvcorp.com
Thu Jan 27 14:34:31 EST 2005
Stephen Thorne wrote:
> On Thu, 27 Jan 2005 00:02:45 -0700, Steven Bethard
> <steven.bethard at gmail.com> wrote:
>
>>By using the iterator instead of readlines, I read only one line from
>>the file into memory at once, instead of all of them. This may or may
>>not matter depending on the size of your files, but using iterators is
>>generally more scalable, though of course it's not always possible.
>
> I just did a teensy test. All three options used exactly the same
> amount of total memory.
I would presume that, for a small file, the entire contents of the
file will be sucked into the read buffer implemented by the underlying
C file library. An iterator will only really reduce memory consumption
when the file is larger than that buffer.
Actually, now that I think of it, there's probably another copy of the
data at Python level. For readlines(), that copy is the list object
itself. For iter and iter.next(), it's in the iterator's read-ahead
buffer. So perhaps memory savings will occur when *that* buffer size
is exceeded. It's also quite possible that both buffers are the same
size...
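
One way to check this guess is to measure it. The following is only a rough sketch, assuming a modern Python 3 interpreter (where the standard tracemalloc module is available) rather than the Python of this thread. The helper names make_file, with_readlines, with_iterator and peak_bytes are made up for illustration, and tracemalloc only sees allocations made through Python's own allocators, so the numbers are an approximation.

```python
import os
import tempfile
import tracemalloc


def make_file(n_lines):
    """Write n_lines short lines to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(text=True)
    with os.fdopen(fd, "w") as f:
        for i in range(n_lines):
            f.write("line %d of sample data\n" % i)
    return path


def with_readlines(path):
    """Build the whole list of lines first, then loop over it."""
    with open(path) as f:
        for line in f.readlines():
            pass


def with_iterator(path):
    """Loop over the file object itself, one line at a time."""
    with open(path) as f:
        for line in f:
            pass


def peak_bytes(func, path):
    """Return the peak number of bytes traced while func(path) runs."""
    tracemalloc.start()
    try:
        func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


if __name__ == "__main__":
    # Compare a file of a few lines with one far larger than any buffer.
    for n_lines in (10, 50000):
        path = make_file(n_lines)
        try:
            print(n_lines, "lines:",
                  "readlines", peak_bytes(with_readlines, path),
                  "iterator", peak_bytes(with_iterator, path))
        finally:
            os.remove(path)
```

If the buffering explanation holds, the two peaks should sit close together for the tiny file and pull apart for the large one, where readlines() has to keep every line alive in its list.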
Anyhow, I'm sure that the fact that all three use the same amount of
memory in your test is a reflection of buffering. The next question is, which
provides the most *conceptual* simplicity? (The answer to that one, I
think, depends on how your brain happens to see things...)
Jeff Shannon
Technician/Programmer
Credit International