Looking for very simple general purpose tokenizer
Maarten van Reeuwijk
maarten at remove_this_ws.tn.tudelft.nl
Mon Jan 19 04:55:31 EST 2004
Hi group,
I need to parse various text files in Python. I was wondering if there is a
general purpose tokenizer available. I know about split(), but this
(otherwise very handy) method does not allow me to specify a list of
splitting characters, only one at a time, and it removes my splitting
operators (OK for spaces and \n's, but not for =, /, etc.). Furthermore, I tried
tokenize, but this is specifically for Python and is way too heavy for me. I am
looking for something like this:
splitchars = [' ', '\n', '=', '/', ....]
tokenlist = tokenize(rawfile, splitchars)
Is something like this available in Python, or has anyone already
written it? Thank you in advance.
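For illustration, here is a minimal sketch of what the tokenize(rawfile, splitchars) call described above could look like. It assumes the standard re module is acceptable and uses re.split with a capturing group, which returns the delimiters along with the pieces between them. The drop parameter is a hypothetical addition, not part of the original request; it lists the delimiters (spaces and newlines by default) that should be discarded rather than kept as tokens. The sketch also assumes splitchars is a non-empty list of single characters.

```python
import re

def tokenize(text, splitchars, drop=(' ', '\n')):
    """Split text on any character in splitchars.

    Delimiters are kept as tokens of their own unless they appear in
    drop (a hypothetical parameter added for this sketch).
    """
    # Build a character class from the delimiters; re.escape guards
    # characters such as ']', '^' or '-' that are special inside [...].
    pattern = '([' + ''.join(re.escape(c) for c in splitchars) + '])'
    # The capturing group makes re.split return the delimiters too.
    parts = re.split(pattern, text)
    # Discard empty strings (from adjacent delimiters) and unwanted delimiters.
    return [p for p in parts if p and p not in drop]

tokens = tokenize("a = b/c\nd", [' ', '\n', '=', '/'])
print(tokens)
```

With this approach the operators = and / survive as separate tokens, while whitespace only separates tokens and is then thrown away.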
Maarten
--
===================================================================
Maarten van Reeuwijk Heat and Fluid Sciences
PhD student dept. of Multiscale Physics
www.ws.tn.tudelft.nl Delft University of Technology