parsing tab and newline delimited text

alex23 wuwei23 at gmail.com
Wed Aug 4 02:34:19 EDT 2010


On Aug 4, 12:14 pm, elsa <kerensael... at hotmail.com> wrote:
> So, an individual entry might have this form (in printed form):
>
> Title    date   position   data
>
> with each field separated by tabs, and a newline at the end of data.

As James posted, the csv module is ideal for this sort of thing.
Dealing with delimited text seems obvious but there are some edge
cases that can bite you, such as quoted fields that themselves contain
tabs or newlines, so it's generally best to use utility code that has
already dealt with them.
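
To make the edge-case point concrete, here is a small sketch comparing
a plain split on tabs with csv.reader. The sample line is made up for
illustration (it is not from elsa's data) and the code is written for
Python 3, which is an assumption on my part:

```python
import csv
import io

# Hypothetical sample line: the last field is quoted and contains a tab.
sample = 'title1\t"2010-08-04"\t3\t"data with\ta tab"\r\n'

# Naive approach: split on every tab, including the one inside the quotes.
naive = sample.rstrip('\r\n').split('\t')

# csv.reader honours the quoting and keeps the embedded tab in one field.
parsed = next(csv.reader(io.StringIO(sample, newline=''), delimiter='\t'))

print(len(naive))   # 5 pieces, the quoted field was cut in two
print(len(parsed))  # 4 fields, as intended
```

The naive split also leaves the quote characters in the pieces, while
csv.reader strips them.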

If you're using Python 2.6+ you can use it in conjunction with
namedtuple for some very easy record retrieval:

>>> import csv
>>> from collections import namedtuple
>>> Record = namedtuple('Record', 'title date position data')
>>> tabReader = csv.reader(open('test.txt','rb'), delimiter='\t')
>>> for record in (Record(*row) for row in tabReader):
...    print record.title, record.data
...
title1 data1\t\n\n\t
title2 data2\t\t\t\t
title3 data3\n\n\n\n
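
For anyone on Python 3, roughly the same idea can be sketched as
follows. The read_records helper and the in-memory sample are my own
illustrative names and data, not anything from the original file:

```python
import csv
import io
from collections import namedtuple

Record = namedtuple('Record', 'title date position data')

def read_records(stream):
    """Yield one Record per non-empty tab-delimited row (hypothetical helper)."""
    for row in csv.reader(stream, delimiter='\t'):
        if row:  # csv.reader yields [] for blank lines, so skip those
            yield Record(*row)

# Made-up sample standing in for test.txt.
sample = io.StringIO(
    'title1\t2010-08-04\t1\tdata1\n'
    'title2\t2010-08-05\t2\tdata2\n',
    newline='',
)

records = list(read_records(sample))
for record in records:
    print(record.title, record.data)
```

With a real file you would pass open('test.txt', newline='') instead of
the StringIO object; the csv documentation recommends newline='' so
that newlines embedded in quoted fields are handled correctly.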

Hope this helps.


