Split with python

bearophileHUGS at lycos.com
Tue Aug 29 20:09:32 EDT 2006


Norman Khine:
> I have a csv file which has a field that contains something like:
> "text (xxx)"
> "text (text) (yyy)"
> "text (text) (text) (zzz)"
>
> I would like to split the last '(text)' out and put it in a new column,
> so that I get:
> "text","(xxx)"
> "text (text)","(yyy)"
> "text (text) (text)","(zzz)"


Maybe something like this can be useful, after a few improvements (the
RE formatting is a work in progress):

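from io import StringIO  # in-memory stand-in for the csv file
import re

datain = StringIO("""
 "text (xxx)"
"text (text) (yy y) "
"text (text)  (text)  ( zzz ) "
""")


# Match the last parenthesized group together with the closing quote.
# A raw string keeps the backslashes intact for the regex engine.
lastone = re.compile(r"""
                        \s* (  \(          # group 1 starts at the paren
                                  [^()"]*  # no nested parens, no quotes
                               \)
                               \s* "       # optional spaces, closing quote
                            )
                        \s* $
                     """, re.VERBOSE)

def repl(mobj):
    # Close the first field and open a new one before the captured group.
    txt_found = mobj.group(1)
    return '", "' + txt_found

for line in datain:
    line2 = line.strip()
    if line2:
        print(lastone.sub(repl, line2))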

"""
The output is:

"text", "(xxx)"
"text (text)", "(yy y) "
"text (text)  (text)", "( zzz ) "

"""
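
Since the data comes from a csv file, the same idea could probably also
be applied field by field with the csv module, so that the quoting is
handled by the csv parser instead of by the RE. This is only a minimal
Python 3 sketch, assuming the field to split is the first column;
split_last_group, split_column and the sample data are made-up names
for illustration, not something taken from the original file:

```python
import csv
import io
import re

# Trailing "(...)" group of a field, without nested parentheses.
LAST_GROUP = re.compile(r"\s*(\([^()]*\))\s*$")

def split_last_group(field):
    """Return (head, last_group); last_group is '' if there is none."""
    match = LAST_GROUP.search(field)
    if match is None:
        return field, ""
    # match.start() also covers the spaces before the group,
    # so the head comes back without trailing whitespace.
    return field[:match.start()], match.group(1)

def split_column(rows, index=0):
    """Yield each row with the field at `index` split into two fields."""
    for row in rows:
        if not row:  # skip blank lines
            continue
        head, tail = split_last_group(row[index])
        yield row[:index] + [head, tail] + row[index + 1:]

# Hypothetical sample standing in for the real csv file.
sample = io.StringIO(
    '"text (xxx)"\n'
    '"text (text) (yyy)"\n'
    '"text (text) (text) (zzz)"\n'
)
out = io.StringIO()
writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(split_column(csv.reader(sample)))
print(out.getvalue(), end="")
```

With a real file, the StringIO sample would be replaced by a file
object opened with newline="" as the csv module documentation
recommends. Unlike the version above, this one drops the spaces around
the last group instead of keeping them inside the new field.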


Bye,
bearophile



