split encloser
Steven Taschuk
staschuk at telusplanet.net
Thu Apr 3 23:28:22 EST 2003
Quoth Chris:
> string.split() takes a delimiter and works fine as long as the
> delimiter isn't part of the data fields. But frequently they are.
> e.g. 'John Doe,135 South Main St.,#122, Springfield, Iowa' or
> ' so long goodbye see ya'
>
> Because the fields can contain the delimiter in some cases, an
> encloser is usually used (typically "") to handle those fields.
>
> The above strings would be written:
> 'John Doe,"135 South Main St., #122", Springfield, Iowa'
> and
> '"so long" goodbye "see ya"'
What if the field data contains double quotes?
> I don't understand regular expressions but I was wondering if anyone
> that did knew of a way to get re.split() to handle "enclosers" as used
> above.
Why use a regular expression? string.split can do the trick:
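    in_out = line.split('"')
    fields = []
    for i in range(len(in_out)):
        if i % 2:
            # odd-numbered pieces were inside quotes: keep them whole
            fields.append(in_out[i])
        else:
            # even-numbered pieces were outside quotes: split on commas
            # (a comma right next to a quote leaves an empty string here)
            fields.extend(in_out[i].split(','))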
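To see what the split-on-quotes approach produces, here is a minimal sketch that wraps it in a function and runs it on the first example string. The name split_on_quotes is made up for illustration. Note the empty strings that appear wherever a comma sits directly beside a quote.

```python
def split_on_quotes(line):
    # Hypothetical wrapper name; the logic is the split-on-quotes loop.
    fields = []
    for i, chunk in enumerate(line.split('"')):
        if i % 2:
            fields.append(chunk)             # inside quotes: keep whole
        else:
            fields.extend(chunk.split(','))  # outside quotes: split on commas
    return fields

print(split_on_quotes('John Doe,"135 South Main St., #122", Springfield, Iowa'))
# ['John Doe', '', '135 South Main St., #122', '', ' Springfield', ' Iowa']
```

Filtering out the empty strings afterwards would also drop fields that really are empty, so that is only safe if the data never has any.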
Or, more clearly:
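    fields = []
    while line:
        if line.startswith('"'):
            endquote = line.index('"', 1)
            field = line[1:endquote]
            # assumes the closing " is followed by , (or ends the line);
            # skip both the quote and the comma
            line = line[endquote+2:]
        elif ',' in line:
            field, line = line.split(',', 1)
        else:
            # last field: no delimiter left
            field, line = line, ''
        fields.append(field)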
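On the question of double quotes inside the field data: one common convention (an assumption here, not something Chris specified) is to double the quote inside an enclosed field. The standard csv module understands that convention, so here is a minimal sketch using it instead of the hand-rolled loops. The helper name split_enclosed is made up for illustration.

```python
import csv

def split_enclosed(line, delimiter=',', quotechar='"'):
    # Hypothetical helper: parse one line with the standard csv reader.
    # Assumes a quote inside an enclosed field is written doubled ("").
    reader = csv.reader([line], delimiter=delimiter, quotechar=quotechar)
    return next(reader)

print(split_enclosed('John Doe,"135 South Main St., #122", Springfield, Iowa'))
print(split_enclosed('"so long" goodbye "see ya"', delimiter=' '))
print(split_enclosed('a,"say ""hi""",b'))
```

Leading spaces outside the quotes (as in ' Springfield') are kept as part of the field; passing skipinitialspace=True to csv.reader would drop them.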
--
Steven Taschuk staschuk at telusplanet.net
Receive them ignorant; dispatch them confused. (Weschler's Teaching Motto)