Regex help needed

rh0dium sklass at pointcircle.com
Tue Jan 10 10:59:38 EST 2006


Hi all,

I am using Python to drive another tool through pexpect.  When the tool
returns more than one value, I would like to put the values into a list
automatically.  The tool signals that the data is a set by wrapping it
in parentheses.

As I said, this is all generated through pexpect.  Here is how I use it:
     child = pexpect.spawn( _buildCadenceExe(), timeout=timeout)
     child.sendline("somefunction()")
     child.expect("> ")
     data=child.before

The resulting data can take several shapes:

Single return value -- THIS IS THE ONE I CAN'T GET TO WORK..
data = ('somefunction()\r\n"@(#)$CDS: icfb.exe version 5.1.0 05/22/2005 '
        '23:36 (cicln01) $"\r\n')

Multiple return values
data = ('somefunction()\r\n("." "~" '
        '"/eda/ic_5.10.41.500.1.18/tools.lnx86/dfII/samples/techfile")\r\n')

It may span several lines:
data = ('somefunction()\r\n("." "~" '
        '\r\n"/eda/ic_5.10.41.500.1.18/tools.lnx86/dfII/samples/techfile"'
        '\r\n"foo")\r\n')

So if you're still reading this, I want to parse out the data.  Here are
the rules:
- Line 1 is ALWAYS the calling function.  Whatever is there (except the
"\r\n") should be kept as "original".
- Anything may occur inside the quotation marks.  I don't care what's in
there per se, but it must be preserved.
- Parenthesized items should be pushed into a list.  I haven't run into
a case with nested parentheses, but that's not to say it won't
happen...
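
To make the rules concrete, here is a minimal sketch of what I am after.
It is not a tested solution: parse_reply and QUOTED are names made up
for illustration, and it assumes that values are always double-quoted,
that a quote never appears escaped inside a value, and that parentheses
are not nested.

```python
import re

# One double-quoted value. Assumption: no escaped quotes inside a value.
QUOTED = re.compile(r'"([^"]*)"')

def parse_reply(data):
    """Split raw tool output into (original_call, result)."""
    # Drop the carriage returns, then peel off line 1 (the echoed call).
    lines = data.replace("\r", "").split("\n")
    original = lines[0]
    body = "\n".join(lines[1:]).strip()

    # Pull out every quoted value; parentheses inside quotes are left alone.
    values = QUOTED.findall(body)

    if body.startswith("("):
        # Parenthesized reply: return a list, even for a single item.
        return original, values
    # Bare reply: return the single value, or None if nothing was quoted.
    return original, (values[0] if values else None)
```

Because the split is driven by the quotation marks rather than by the
parentheses, a single value such as the "@(#)$CDS: ..." version string
comes back whole even though it contains parentheses of its own.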

So here is my code.  Pardon the hack job.

import os,re

def main(data=None):

    # Get rid of the annoying \r's
    dat=data.split("\r")
    data="".join(dat)

    # Remove the first line - that is the original call
    dat = data.split("\n")
    original=dat[0]
    del dat[0]

    print "Original", original
    # Now join all of the remaining lines
    retl="".join(dat)

    # self.logger.debug("Original = \'%s\'" % original)

    try:
        # Get rid of the parentheses
        parmatcher = re.compile( r'\(([^()]*)\)' )
        parmatch = parmatcher.search(retl)

        # Get rid of the first and last quotes
        qrmatcher = re.compile( r'\"([^()]*)\"' )
        qrmatch = qrmatcher.search(parmatch.group(1))

        # Split the items
        qmatch=re.compile(r'\"\s+\"')
        results = qmatch.split(qrmatch.group(1))
    except AttributeError:
        # A search above returned None, so .group() failed; retry without parentheses
        qrmatcher = re.compile( r'\"([^()]*)\"' )
        qrmatch = qrmatcher.search(retl)

        # Split the items
        qmatch=re.compile(r'\"\s+\"')
        results = qmatch.split(qrmatch.group(1))

    print "Orig", original, "Results", results
    return original,results


# General run..
if __name__ == '__main__':


#     data = 'someFunction\r\n "test" "foo"\r\n'
#     data = 'someFunction\r\n "test  foo"\r\n'
    data = ('getVersion()\r\n"@(#)$CDS: icfb.exe version 5.1.0 '
            '05/22/2005 23:36 (cicln01) $"\r\n')
#     data = ('someFunction\r\n ("test" "test1" "foo aasdfasdf"\r\n '
#             '"newline" "test2")\r\n')

    main(data)

CAN SOMEONE PLEASE CLEAN THIS UP?



