Tokenize a string or split on steroids
Fernando Rodríguez
frr at wanadoo.es
Sat Mar 9 12:10:14 EST 2002
On Sat, 09 Mar 2002 11:30:40 GMT, Bob Follek <b.follek at verizon.net> wrote:
>If you're unfamiliar with regular expressions, here's a good starting
>point: http://py-howto.sourceforge.net/regex/regex.html
Thanks. :-) BTW, the strings that must be tokenized contain other non-alphanumeric
characters (parentheses, for example), so I tried another regex:
[{}].
The result, although usable, is sort of weird:
>>> import re
>>> s = "{one}{two}"
>>> x1 = re.compile('[{}]')
>>> x1.split(s)
['', 'one', '', 'two', '']
Where are those empty strings coming from??? :-?
I can filter() them out, but I wonder where they come from.... O:-)
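Something like this seems to work well enough (reusing the s and x1 from above),
though I'd still like to understand why split() produces the empty pieces at all:
>>> filter(None, x1.split(s))
['one', 'two']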
TIA
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| Fernando Rodríguez frr at EasyJob.NET
| http://www.EasyJob.NET/
| Expert resume and cover letter creation system.
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~