Looking for very simple general purpose tokenizer

Maarten van Reeuwijk maarten at remove_this_ws.tn.tudelft.nl
Mon Jan 19 10:55:31 CET 2004

Hi group,

I need to parse various text files in Python. I was wondering if there is a
general-purpose tokenizer available. I know about split(), but this
(otherwise very handy) method does not allow me to specify a list of
splitting characters, only one at a time, and it removes my splitting
operators (OK for spaces and \n's, but not for =, / etc.). Furthermore, I
tried tokenize, but this is specifically for Python and is way too heavy for
me. I am looking for something like this:

splitchars = [' ', '\n', '=', '/', ....]
tokenlist = tokenize(rawfile, splitchars)

Is there something like this available inside Python, or has anyone already
made this? Thank you in advance.

Maarten van Reeuwijk                        Heat and Fluid Sciences
Phd student                             dept. of Multiscale Physics
www.ws.tn.tudelft.nl                 Delft University of Technology
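
One possible way to get the behaviour asked for above is re.split() with a
capturing group, which returns the delimiters along with the text between
them. This is only a minimal sketch, not an existing library function: the
name tokenize() and its splitchars argument follow the call in the post,
while the drop parameter (delimiters to discard, whitespace by default) is an
assumption added for illustration.

```python
import re

def tokenize(text, splitchars, drop=(" ", "\n")):
    """Split text at any character in splitchars.

    Delimiters come back as tokens of their own, except those listed in
    drop (a hypothetical parameter; whitespace by default).
    """
    if not splitchars:
        # Nothing to split on: the whole text is a single token.
        return [text] if text else []
    # Build a character class such as [\ \n=/]; re.escape() protects
    # characters that are special inside a regular expression.
    pattern = "([" + "".join(re.escape(c) for c in splitchars) + "])"
    # The capturing group makes re.split() keep the delimiters.
    parts = re.split(pattern, text)
    # Discard empty strings between adjacent delimiters and unwanted ones.
    return [p for p in parts if p and p not in drop]

splitchars = [" ", "\n", "=", "/"]
tokens = tokenize("x = a/b\ny=2", splitchars)
print(tokens)  # ['x', '=', 'a', '/', 'b', 'y', '=', '2']
```

Passing drop=() would keep the spaces and newlines as tokens too. Note that
this only handles single-character delimiters; multi-character operators
would need a different pattern.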

More information about the Python-list mailing list