File reading using delimiters

Peter Hansen peter at engcorp.com
Mon Jun 9 16:13:24 EDT 2003


Ben S wrote:
> 
> In this case, I'm reading plain ASCII text files ranging in size from
> 10K to maybe 1Mb, where strings are delimited with the tilde character.
> But I'm asking as much about the available functionality as I am about
> my particular problem. I suppose I could use read() then split() and see
> how the performance works out. I'm surprised if there's nothing that
> lets me read more selectively from the file though.

My first approach for a problem like that would probably be (since
you said "arbitrary delimiters") to do a simple .read() to get the entire
file into memory, then use .split() if the delimiter is simple (i.e.
does not occur in the data itself) or maybe re.split() if the delimiter
is more complicated.
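
A minimal sketch of the two cases, for illustration only. The sample string and the "tilde with optional whitespace around it" pattern are assumptions made up for this example, not something taken from the original question.

```python
import re

# Hypothetical sample data; in practice this would come from .read().
data = "alpha~beta ~ gamma~\ndelta"

# Simple delimiter: a plain string split is enough.
simple = data.split("~")

# More complicated delimiter (assumed here: a tilde plus any
# whitespace on either side of it), handled with a regular expression.
flexible = re.split(r"\s*~\s*", data)
```

Note that the plain split keeps the stray whitespace around each piece, while the regular expression treats it as part of the delimiter.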

If that turned out not to be good enough (on performance, say), I'd need
to know more about the nature of the arbitrary delimiters that are expected.

So to make more progress, is there anything wrong with this?

# Read the whole file into memory, then split on the tilde delimiter.
with open('filename') as f:
    data = f.read()
blocks = data.split('~')
for block in blocks:
    # do the algorithm on a block here
    pass
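
If reading the whole file at once did turn out to be a problem, one possible way to read more selectively is to pull the file in fixed-size chunks and hand out blocks as they become complete. This is only a sketch under assumptions: the name read_delimited and the chunk_size parameter are hypothetical, and the delimiter is assumed to be a fixed string such as the tilde.

```python
import io

def read_delimited(f, delimiter="~", chunk_size=8192):
    """Yield delimiter-separated blocks from an open text file, one at a time."""
    buffer = ""
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        parts = buffer.split(delimiter)
        # The last piece may be incomplete, so keep it for the next round.
        buffer = parts.pop()
        for part in parts:
            yield part
    # Whatever is left at end of file is the final block.
    yield buffer

# Small demonstration on an in-memory file, with a tiny chunk size
# so that blocks straddle chunk boundaries.
demo = list(read_delimited(io.StringIO("one~two~three"), chunk_size=4))
```

The generator yields the same pieces that data.split('~') would, but only holds one chunk plus one partial block in memory at a time.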


-Peter



