Looking for very simple general purpose tokenizer
Maarten van Reeuwijk
maarten at remove_this_ws.tn.tudelft.nl
Tue Jan 20 07:43:02 EST 2004
I found a complication with the shlex module. When I execute the following
fragment you'll notice that doubles are split into separate tokens. Is there
any way to avoid splitting the numbers like this?
source = """
$NAMRUN
Lz = 0.15
nu = 1.08E-6
"""
import shlex
import StringIO

buf = StringIO.StringIO(source)
toker = shlex.shlex(buf)
toker.comments = ""            # don't treat any character as a comment starter
toker.whitespace = " \t\r"     # drop '\n' from whitespace so newlines show up as tokens
print [tok for tok in toker]
Output:
['\n', '$', 'NAMRUN', '\n', 'Lz', '=', '0', '.', '15', '\n', 'nu', '=', '1',
'.', '08E', '-', '6', '\n']
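One workaround I have been wondering about is adding the characters that
appear inside the numbers ('.', '-', '+') to wordchars, so the tokenizer
treats them as part of a word. This is only a guess on my part, not something
I know to be the intended approach:

import shlex
import StringIO

buf = StringIO.StringIO(source)   # same source string as above
toker = shlex.shlex(buf)
toker.comments = ""
toker.whitespace = " \t\r"
toker.wordchars = toker.wordchars + ".-+"   # let '.', '-', '+' be part of a token
print [tok for tok in toker]

With this, '0.15' and '1.08E-6' seem to come out as single tokens, but is
there a cleaner, more general way to do this?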
--
===================================================================
Maarten van Reeuwijk          Heat and Fluid Sciences
PhD student                   dept. of Multiscale Physics
www.ws.tn.tudelft.nl          Delft University of Technology