Parsing/Splitting Line

John Machin sjmachin at lexicon.net
Tue Nov 21 14:23:12 EST 2006


Neil Cerutti wrote:
> On 2006-11-21, acatejr at gmail.com <acatejr at gmail.com> wrote:
> > I have a text file and each line is a list of values.  The
> > values are not delimited, but every four characters is a value.
> > How do I get python to split this kind of data?  Thanks.
>
> Check out _Text Processing in Python_, Chapter 2, "PROBLEM:
> Column statistics for delimited or flat-record files".
> URL:http://gnosis.cx/TPiP/

Hmmmm ... the elementary notion "do line[start:end] in a loop" is well
buried, just behind this:

                  # Adjust offsets to Python zero-based indexing,
                  # and also add final position after the line
                  num_positions = len(self.column_positions)
                  offsets = [(pos-1) for pos in self.column_positions]
                  offsets.append(len(line))
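
For the original question (undelimited values, four characters each), the elementary "line[start:end] in a loop" idea can be sketched as below. This is only a minimal illustration: the name split_fixed and the default width of 4 are assumptions made for the example, not code from the book or from the original poster.

```python
def split_fixed(line, width=4):
    """Return consecutive width-character slices of line.

    Hypothetical helper for illustration; assumes every value occupies
    exactly `width` characters and drops the trailing line ending first.
    """
    line = line.rstrip("\r\n")
    # Step through the line in strides of `width` and slice each field.
    return [line[start:start + width] for start in range(0, len(line), width)]


if __name__ == "__main__":
    print(split_fixed("ab  cdefghij\n"))
```

Padding inside a field is kept as-is; a caller who wants bare values can apply strip() to each slice. A line whose length is not a multiple of the width yields a short final slice.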

Folk who are burdened with real-world flat files (example: several
hundred thousand lines, each 996 bytes wide) might want to consider
moving the set-up of "offsets" out of the once-per-line splitter()
method and into the __init__() method :-)
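
A rough sketch of that rearrangement, under some assumptions: the class name FixedWidthSplitter and its method names are made up for this example and are not the book's code, and column_positions is taken to be a list of 1-based starting columns as in the excerpt above. The slice objects are built once, and the last one is left open-ended so that no per-line len(line) is needed.

```python
class FixedWidthSplitter:
    """Hypothetical splitter that computes its offsets once, not per line."""

    def __init__(self, column_positions):
        # Convert the 1-based start columns to Python zero-based offsets.
        starts = [pos - 1 for pos in column_positions]
        # Each field ends where the next one starts; the last field runs
        # to the end of the line, so its end is left as None.
        ends = starts[1:] + [None]
        self.slices = [slice(s, e) for s, e in zip(starts, ends)]

    def split(self, line):
        # Per-line work is now just the slicing itself.
        line = line.rstrip("\r\n")
        return [line[s] for s in self.slices]


if __name__ == "__main__":
    splitter = FixedWidthSplitter([1, 5, 9])
    for record in ["ab  cdefghij\n", "0001000200030004\n"]:
        print(splitter.split(record))
```

With several hundred thousand records, the list comprehensions in __init__() run once instead of once per line; whether that matters in practice is best checked by timing it on the real file.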

Cheers,
John



