[Tutor] parsing a string
Kent Johnson
kent_johnson at skillsoft.com
Sun Oct 10 20:40:39 CEST 2004
You can do this with the CSV module or with regular expressions:
import csv
import re

# Data in a list so csv.reader can iterate over it
data = ['1 1997 2 "Henrik Larsson"']

r = csv.reader(data, delimiter=' ')
for row in r:
    print(row)  # prints ['1', '1997', '2', 'Henrik Larsson']

# Regular expression to match three groups of digits separated by
# whitespace, then whatever is between the quotes
lineRe = re.compile(r'(\d+)\s+(\d+)\s+(\d+)\s+"(.*)"')
match = lineRe.search(data[0])
print(match.group(1, 2, 3, 4))  # prints ('1', '1997', '2', 'Henrik Larsson')
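The csv version might be handier if the data is in a file or file-like
object, because it expects to iterate over the input. It also works when
the quotes are omitted, as long as the unquoted value contains no spaces
(otherwise the space delimiter splits it into separate fields).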
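As a rough sketch of the file-like case, the snippet below feeds csv.reader
an io.StringIO object in place of a real file. It is written for Python 3;
the helper name read_rows and the second sample line are made up for
illustration and are not part of the original question.

```python
import csv
import io

# Invented sample data standing in for a real file; the second line is
# unquoted on purpose and contains no spaces in its last field.
sample = io.StringIO(
    '1 1997 2 "Henrik Larsson"\n'
    '2 1998 5 Pele\n'
)

def read_rows(fileobj):
    """Return each space-delimited line of fileobj as a list of strings."""
    return list(csv.reader(fileobj, delimiter=' '))

rows = read_rows(sample)
for row in rows:
    print(row)
```

With a real file, passing the object returned by open() to read_rows
should work the same way.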
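The regex version might be better if you get the strings one at a time. If
the quotes are optional you should change the regex to something like this:
r'(\d+)\s+(\d+)\s+(\d+)\s+"?([^"]*)"?'
The last group uses [^"]* rather than .* because a greedy .* would also
capture the closing quote.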
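One way to wrap the regex approach for strings that arrive one at a time is
sketched below. The names LINE_RE and parse_line are hypothetical, and the
last group is written as [^"]* instead of .* so that a trailing quote does
not end up inside the captured value. It assumes the quoted part itself
contains no double quotes.

```python
import re

# Three whitespace-separated groups of digits, then the rest of the line
# with optional surrounding double quotes. [^"]* stops at a quote, so the
# closing quote is never included in the fourth group.
LINE_RE = re.compile(r'(\d+)\s+(\d+)\s+(\d+)\s+"?([^"]*)"?')

def parse_line(line):
    """Return the four fields of line as a list, or None if it does not match."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    return list(match.groups())

print(parse_line('1 1997 2 "Henrik Larsson"'))
print(parse_line('1 1997 2 Henrik Larsson'))
print(parse_line('no numbers here'))
```

Returning None for a non-matching line is just one choice; raising an
exception might suit callers that expect every line to be well formed.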
Kent
At 08:00 PM 10/10/2004 +0200, L&L wrote:
>Hi All,
>
>Suppose I have a string that looks like this:
>
>1 1997 2 "Henrik Larsson"
>
>I want to convert the string to a list with four members. Is there an
>easy way to do this? (The hard way would be to find all quotes, save the
>area between the quotes to a separate string, remove this part from the
>original string, use string.split, and put the string back together.)
>
>Thanks.