How to make regexes faster? (Python v. OmniMark)

Thu Apr 18 21:24:40 EDT 2002

I was recently introduced to OmniMark. One of our exercises was to take
a plain text file of Hamlet and convert it to SGML.

I did the same conversion in Python, too. But the best time I could get
from Python was 0.57 sec, while OmniMark came in at 0.20 sec. What's the
most efficient technique for regex-based text processing in Python?

My best time came from using a single, rather large regex with findall; I
also tried several smaller regexes with scan and match.
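
To make the single-regex approach concrete, here is a minimal sketch of the
general shape, not the actual exercise code. The line patterns (ACT, SCENE,
"SPEAKER. text"), the tag names, and the helper to_sgml are all assumptions
made up for illustration; the real Hamlet file and target DTD may well differ.

```python
import re

# One compiled pattern with an alternative per (assumed) line type.
# Compiling once at module level avoids repeated pattern lookups.
LINE_RE = re.compile(
    r"^(?:(ACT [IVX]+)"              # group 1: act heading (hypothetical shape)
    r"|(SCENE [IVX]+\..*)"           # group 2: scene heading (hypothetical shape)
    r"|([A-Z][A-Z ]+)\.[ \t]+(.+)"   # groups 3, 4: speaker and speech
    r"|(.+))$",                      # group 5: any other non-blank line
    re.MULTILINE,
)

def to_sgml(text):
    """Convert plain text to SGML-like markup in a single findall pass."""
    out = []
    # findall returns one tuple per matched line; groups that did not
    # take part in the match come back as empty strings.
    for act, scene, speaker, speech, other in LINE_RE.findall(text):
        if act:
            out.append("<ACT>%s</ACT>" % act)
        elif scene:
            out.append("<SCENE>%s</SCENE>" % scene)
        elif speaker:
            out.append("<SPEECH><SPEAKER>%s</SPEAKER>%s</SPEECH>"
                       % (speaker, speech))
        else:
            out.append("<P>%s</P>" % other)
    return "\n".join(out)

sample = "ACT I\nSCENE I. Elsinore.\n\nBERNARDO. Who's there?\n"
print(to_sgml(sample))
```

The idea is that the whole file is read into one string and scanned once, so
the per-line work stays inside the regex engine instead of a Python-level
loop over many small match calls. Blank lines are skipped because every
alternative requires at least one character.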

Has anyone else compared OmniMark and Python?

Thanks, all.

Fred