Code that ought to run fast, but can't due to Python limitations.
ben+python at benfinney.id.au
Sun Jul 5 04:09:12 CEST 2009
John Nagle <nagle at animats.com> writes:
> A dictionary lookup (actually, several of them) for every input
> character is rather expensive. Tokenizers usually index into a table
> of character classes, then use the character class index in a switch
> statement.
> This is an issue that comes up whenever you have to parse some
> formal structure, from XML/HTML to Pickle to JPEG images to program
> The temptation is to write tokenizers in C, but that's an admission
> of language design failure.
This sounds like a job for pyparsing <URL:http://pyparsing.wikispaces.com/>.
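
For comparison, the table-driven approach described in the quoted text can be sketched in pure Python. This is only a rough illustration, not code from either poster: the class names (OTHER, SPACE, DIGIT, ALPHA), the CLASS_TABLE layout, and the tokenize function are all made up for the example, and it assumes Python 3 bytes input. Each input byte costs one index into a 256-entry table rather than several dictionary lookups.

```python
# Hypothetical character classes, chosen only for this illustration.
OTHER, SPACE, DIGIT, ALPHA = range(4)

# Build the lookup table once: byte value -> character class.
CLASS_TABLE = bytearray([OTHER] * 256)
for b in b" \t\r\n":
    CLASS_TABLE[b] = SPACE
for b in b"0123456789":
    CLASS_TABLE[b] = DIGIT
for b in b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_":
    CLASS_TABLE[b] = ALPHA


def tokenize(data):
    """Split a bytes object into (class, bytes) tokens, dropping whitespace."""
    tokens = []
    i, n = 0, len(data)
    while i < n:
        start = i
        cls = CLASS_TABLE[data[i]]  # one table index per input byte
        i += 1
        if cls == ALPHA:
            # Identifiers: a letter or underscore, then letters or digits.
            while i < n and CLASS_TABLE[data[i]] in (ALPHA, DIGIT):
                i += 1
        elif cls in (DIGIT, SPACE):
            # Numbers and whitespace: a run of the same class.
            while i < n and CLASS_TABLE[data[i]] == cls:
                i += 1
        # Any other byte is emitted as a single-character token.
        if cls != SPACE:
            tokens.append((cls, data[start:i]))
    return tokens
```

Calling tokenize(b"x1 = 42+y") yields an identifier, an operator, a number, another operator, and a final identifier. Whether this is fast enough in CPython is exactly the point under dispute; the sketch only shows the shape of the technique.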
“Better not take a dog on the space shuttle, because if he sticks his
head out when you're coming home his face might burn up.” (Jack Handey)