Question about file objects...

Terry Reedy tjreedy at udel.edu
Wed Dec 2 18:56:21 EST 2009


J wrote:
> On Wed, Dec 2, 2009 at 09:27, nn <pruebauno at latinmail.com> wrote:
>>> Is there a way to read the file, one item at a time, delimited by
>>> commas WITHOUT having to read all 16,000 items from that one line,
>>> then split them out into a list or dictionary??
> 
>> File iteration is a convenience since it is the most common case. If
>> everything is on one line, you will have to handle record separators
>> manually by using the .read(<number_of_bytes>) method on the file
>> object and searching for the comma. If everything fits in memory the
>> straightforward way would be to read the whole file with .read() and
>> use .split(",") on the returned string. That should give you a nice
>> list of everything.
> 
> Agreed. The confusion came because the guy teaching said that
> iterating the file is delimited by a carriage return character...

If he said exactly that, he is not exactly correct. File iteration splits 
on line-ending character(s), and which ones count depends on the platform 
or the universal newline setting.
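
To make that concrete, here is a small sketch, assuming Python 3, where the
universal newline behaviour is controlled by the newline argument of the
built-in open(). The helper name read_lines and the sample bytes are made up
for illustration only.

```python
import os
import tempfile


def read_lines(path, newline):
    """Return the lines produced by iterating a text file opened with `newline`."""
    with open(path, newline=newline) as f:
        return list(f)


# Hypothetical sample data mixing all three common line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"a\rb\r\nc\n")

# newline=None: universal newlines; every ending is recognised and translated to "\n".
universal = read_lines(path, None)

# newline="": every ending is recognised, but returned untranslated.
untranslated = read_lines(path, "")

# newline="\n": only "\n" terminates a line; "\r" is ordinary data.
lf_only = read_lines(path, "\n")

os.remove(path)
```

In no case does the newline argument accept an arbitrary delimiter such as a
comma, so file iteration cannot be made to split on one.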

> which to me sounds like it's an arbitrary thing that can be changed...
> 
> I was already thinking that I'd have to read it in small chunks and
> search for the delimiter I want...  and reading the whole file into a
> string and then splitting that would be nice, until the file is
> so large that it starts taking up significant amounts of memory.
> 
> Anyway, thanks both of you for the explanations... I appreciate the help!

I would not be surprised if a generic file chunk generator were posted 
somewhere. It would be a good entry for the Python Cookbook, if not 
there already.
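
Something along these lines might do as a starting point. It is only a
sketch, assuming Python 3 and a file opened in text mode; the name
iter_delimited and the default chunk_size are my own choices, not an existing
recipe. It reads fixed-size chunks with .read(n) and yields each item as soon
as its closing delimiter has been seen, so only the current chunk plus one
partial item is held in memory.

```python
import io


def iter_delimited(fileobj, delimiter=",", chunk_size=4096):
    """Yield delimiter-separated items from fileobj, reading chunk_size characters at a time."""
    pending = ""  # text read so far that is not yet known to be a complete item
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:  # empty string means end of file
            break
        pending += chunk
        # Everything before the last delimiter is complete; the tail may continue
        # into the next chunk (this also covers a delimiter split across chunks).
        *complete, pending = pending.split(delimiter)
        for item in complete:
            yield item
    # Whatever is left after the final delimiter is the last item.
    # Unlike str.split, an empty trailing item is not yielded.
    if pending:
        yield pending


# Usage with an in-memory file; a tiny chunk_size forces items to span chunks.
items = list(iter_delimited(io.StringIO("10,20,30,40"), chunk_size=3))
```

With a real file it would be used as
"for item in iter_delimited(open(name)): ...". Note that a trailing newline
stays attached to the last item, so the caller may want to .strip() each one.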

tjr



