[Python-Dev] Better text processing support in py2k?

M.-A. Lemburg mal@lemburg.com
Mon, 03 Jan 2000 19:22:02 +0100


Tim Peters wrote:
> 
> >> This is why I do complex string processing in Icon <0.9 wink>.
> 
> [MAL]
> > You can have all that extra magic via callable tag objects
> > or callable matching functions. It's not exactly nice to
> > write, but I'm sure that a meta-language could do the
> > conversions for you.
> 
> That wasn't my point:  I do it in Icon because it *is* "exactly nice to
> write", and doesn't require any yet-another meta-language.  It's all
> straightforward, in a way that separate schemes pasted together can never be
> (simply because they *are* "separate schemes pasted together" <wink>).
>
> The point of my Python examples wasn't that they could do something
> mxTextTools can't do, but that they were *Python* examples:  every variation
> I mentioned (or that you're likely to think of) was easy to handle for any
> Python programmer because the "control flow" and "data type" etc aspects
> could be handled exactly the way they always are in *non* pattern-matching
> Python code too, rather than recoded in pattern-scheme-specific different
> ways (e.g., where I had a vanilla "if/break", you set up a special
> exception to tickle the matching engine).
> 
> I'm not attacking mxTextTools, so don't feel compelled to defend it --

Oh, I wasn't defending it -- I know that it is cryptic and sometimes
a pain to use. But given that you don't have to invoke a C compiler
to get raw speed, I find it a rather useful way to code fast utility
functions which would otherwise have to be written in C.

The other reason it exists is simply because I don't like the
recursive style of regexps too much. mxTextTools is simple
and straightforward. Backtracking is still possible, but not
recommended.
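
To illustrate the table-driven, non-backtracking style meant here,
a minimal sketch in plain Python follows. It is not the mxTextTools
API: the names match_set, TABLE and tag_text are hypothetical, and
the sketch only assumes that a tagging engine walks the text once,
tries the table entries in order, and records (tag, start, end)
slices.

```python
import string


def match_set(chars):
    """Return a matcher that consumes one or more characters from `chars`."""
    def matcher(text, pos):
        end = pos
        while end < len(text) and text[end] in chars:
            end += 1
        # Report the new position on success, None on failure.
        return end if end > pos else None
    return matcher


# Hypothetical table: (tag name or None, matcher).
# Entries are tried in order; the first one that matches wins.
TABLE = (
    ("word", match_set(string.ascii_letters)),
    ("number", match_set(string.digits)),
    (None, match_set(" \t\n")),  # whitespace is consumed but not tagged
)


def tag_text(text, table=TABLE):
    """Scan `text` once, left to right, and return (tag, start, end) tuples."""
    taglist = []
    pos = 0
    while pos < len(text):
        for name, matcher in table:
            end = matcher(text, pos)
            if end is not None:
                if name is not None:
                    taglist.append((name, pos, end))
                pos = end
                break
        else:
            # No entry matched: skip one character instead of backtracking.
            pos += 1
    return taglist
```

Calling tag_text("abc 123 x!") yields the slices for "abc", "123"
and "x"; the "!" matches no entry and is simply skipped. The scan
position only ever moves forward, which is the property the sketch
is meant to show.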

> people using regexps in those examples are dead in the water.  mxTextTools
> is very good at what it does; if we have a real disagreement, it's probably
> that I'm less optimistic about the prospects for higher-level wrappers
> (e.g., MikeF's SimpleParse is much slower than "a real" BNF parsing system
> (ARBNFPS), in part because he isn't doing all the optimizations ARBNFPS
> does, but also in part because ARBNFPS uses an underlying engine more
> optimized to its specific task than mxTextTool's more-general engine *can*
> be).  So I don't see mxTextTools as being the answer to everything -- and if
> you hadn't written it, you would agree with that on first glance <wink>.

Oh, I'm sure it *is* the answer to all our problems ;-) ...

def main(*dummy):
    ...

from mx.TextTools import *
tag("",((main, Skip + CallTag, 0),))
 
> > Anyway, I'll keep focussing on the speed aspect of mxTextTools;
> > others can focus on abstractions, so that eventually everybody
> > will be happy :-)
> 
> You and I will be, anyway <wink>.

Happy New Year :-)
-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                             Happy New Century !
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/