String Splitter Brain Teaser

Steven Bethard steven.bethard at gmail.com
Mon Mar 28 13:52:10 EST 2005


Michael Spencer wrote:
> def xgen(s):
>     srciter = iter(s)
>     item = [srciter.next()]
>     for i in srciter:
>         if i == '/':
>             item.append(srciter.next())
>         else:
>             yield item
>             item = [i]
>     yield item

Note that the generator-based solution doesn't raise an error on some 
invalid data (e.g. where there is a final '/'); it silently drops the 
last group instead, whereas the previous list-based solution raised 
StopIteration:

py> group("AGC/C/TGA/T")
[['A'], ['G'], ['C', 'C', 'T'], ['G'], ['A', 'T']]
py> group("AGC/C/TGA/T/")
Traceback (most recent call last):
   File "<interactive input>", line 1, in ?
   File "<interactive input>", line 6, in group
StopIteration
py> list(xgen("AGC/C/TGA/T"))
[['A'], ['G'], ['C', 'C', 'T'], ['G'], ['A', 'T']]
py> list(xgen("AGC/C/TGA/T/"))
[['A'], ['G'], ['C', 'C', 'T'], ['G']]

Not sure which is the desired behavior, but I figured the OP should be 
aware of this in case it's possible to have strings in an invalid 
format.  If this needs to be fixed, you can just wrap the srciter.next() 
call in an appropriate try/except.
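
For what it's worth, here is a rough sketch of what that try/except could 
look like, assuming the desired behavior is to complain loudly about a 
trailing '/'. The name xgen_strict, the choice of ValueError, and the 
decision to yield nothing for an empty string are all my own assumptions, 
not something from the thread. I've also used the builtin next(srciter) 
rather than srciter.next() so the sketch runs on newer Pythons too, where 
(since PEP 479) a StopIteration escaping a generator body turns into a 
RuntimeError instead of quietly ending the generator.

```python
def xgen_strict(s):
    """Like xgen, but raise ValueError when a '/' has nothing after it.

    Hypothetical variant: the name and the error type are assumptions.
    """
    srciter = iter(s)
    try:
        item = [next(srciter)]
    except StopIteration:
        # Empty input: yield nothing at all (an assumption about intent).
        return
    for i in srciter:
        if i == '/':
            try:
                # A '/' means the next character joins the current group.
                item.append(next(srciter))
            except StopIteration:
                # The string ended right after a '/', so the data is invalid.
                raise ValueError("trailing '/' in %r" % (s,))
        else:
            yield item
            item = [i]
    yield item
```

With this version, list(xgen_strict("AGC/C/TGA/T")) gives the same groups as 
before, while list(xgen_strict("AGC/C/TGA/T/")) raises ValueError rather than 
dropping the last group. If silently ignoring the trailing '/' is preferred, 
the inner except clause could just break out of the loop instead.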

STeVe


