Regex help needed
Michael Spencer
mahs at telcopartners.com
Tue Jan 10 20:03:39 EST 2006
rh0dium wrote:
> Michael Spencer wrote:
>> >>> def parse(source):
>> ...     source = source.splitlines()
>> ...     original, rest = source[0], "\n".join(source[1:])
>> ...     return original, rest_eval(get_tokens(rest))
>
> This is a very clean and elegant way to separate them - Very nice!! I
> like this a lot - I will definitely use this in the future!!
>
>> Cheers
>>
>> Michael
>
On reflection, this simplifies further (to 9 lines), at least for the test cases
you provide, which don't involve any nested parens:
>>> import cStringIO, tokenize
...
>>> def get_tokens2(source):
...     src = cStringIO.StringIO(source).readline
...     src = tokenize.generate_tokens(src)
...     return [token[1][1:-1] for token in src if token[0] == tokenize.STRING]
...
>>> def parse2(source):
...     source = source.splitlines()
...     original, rest = source[0], "\n".join(source[1:])
...     return original, get_tokens2(rest)
...
>>>
This matches your main function for the three tests where main works...
>>> for source in sources[:3]: #matches your main function where it works
...     assert parse2(source) == main(source)
...
Original someFunction
Orig someFunction Results ['test', 'foo']
Original someFunction
Orig someFunction Results ['test foo']
Original someFunction
Orig someFunction Results ['test', 'test1', 'foo aasdfasdf', 'newline', 'test2']
...and handles the case where main fails (I think correctly, although I'm not
entirely sure what your desired output is in this case):
>>> parse2(sources[3])
('getVersion()', ['@(#)$CDS: icfb.exe version 5.1.0 05/22/2005 23:36 (cicln01) $'])
>>>
If you really do need nested parens, then you'd need the slightly longer version
I posted earlier.
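
For anyone trying this on Python 3, where cStringIO no longer exists, here is a
rough sketch of the same idea using io.StringIO. It is untested against your
real sources: the sample inputs below are made up for illustration, and the
[1:-1] slice assumes plain single- or double-quoted strings with no prefix
(no r"..." or triple quotes).

```python
import io
import tokenize


def get_tokens2(source):
    """Return the contents of every string literal in source, quotes stripped."""
    readline = io.StringIO(source).readline
    return [
        tok.string[1:-1]  # assumes one quote character on each side, no prefix
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.STRING
    ]


def parse2(source):
    """Split off the first line, then collect string literals from the rest."""
    lines = source.splitlines()
    original, rest = lines[0], "\n".join(lines[1:])
    return original, get_tokens2(rest)


# Hypothetical inputs, invented for this sketch (not the original test sources).
sample = 'someFunction\n("test" "foo")'
print(parse2(sample))

# Parens inside a string literal stay part of that one STRING token.
print(get_tokens2('("@(#) v (x) $")'))
```

Because the tokenizer treats each quoted string as a single token, parens that
appear inside the quotes never need special handling; only parens that nest
outside the strings would call for the longer version.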
Cheers
Michael