Stackless & String-processing

Tim Peters tim_one at email.msn.com
Wed Jul 21 23:29:12 EDT 1999


[Tim]
> ...
> If it isn't clear yet <wink>, efficiency isn't driving Graham
> here.  He's found a cool new approach, and is having fun with it.

[Graham Matthews]
> No that's not the case.

OK, my mistake:  you didn't find it, it's not cool, it's not new, and/or
you're not having fun <wink>.

> Efficiency drives all my programs *when it's a necessity*.

Ack, I'm speaking English here.  If efficiency isn't your *primary* concern,
then efficiency isn't "driving" you.  Dictionary.  Look up.  Efficiency is
Fernando's primary concern wrt parsing approaches -- it drives him.  That's
how you two differ.  It's important to note that efficiency really is a
necessity in his applications (have you read any of his papers?  no claim
makes sense out of context).

> My point is that using combinators:
>
> a) is a very nice way of structuring a parser since it is very amenable
>    to change and robust under change.

Could well be.

> b) doesn't have to cost anything in efficiency since you can freely mix
>    combinators with other parsing techniques.

Like programming in Python doesn't have to cost anything in efficiency since
you can freely mix Python with other languages?  It's a true enough
argument, but useless in practice unless 5% of the code accounts for 95% of
the runtime.  Which it often *does* in general-purpose Python apps.  For
parsing, it's not clear what kind of sequencing you could want that isn't
already covered by the "other parsing techniques" that are presumably doing
95% of the (presumed to be) efficiency-critical computation.  Have something
concrete in mind?
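[For concreteness, here's one shape the "free mixing" could take -- a sketch of mine, not anything from Graham's code:  an ordinary combinator whose inner matching is delegated to Python's regex engine, so the hot token-scanning loop runs in C while the grammar above it stays combinator-shaped.  A "parser" here is any function from text to (value, rest) on success or None on failure.]

```python
import re

# rx wraps a compiled regex as a combinator-compatible parser:
# the efficiency-critical matching happens inside re (in C), but the
# result plugs into sequencing/choice combinators like any other parser.
def rx(pattern):
    compiled = re.compile(pattern)
    def parse(text):
        m = compiled.match(text)
        if m:
            return m.group(0), text[m.end():]
        return None
    return parse

number = rx(r"\d+")       # e.g., number("123abc") -> ("123", "abc")
```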

> c) is a general programming technique useful in a vast range of programs
>    outside of just parsing. For example I have used it for type checking
>    and writing a GUI. The literature has a lot of other examples of this
>    approach.

I didn't see anyone argue against that.  This thread started with string
processing, and everyone else has stuck to that.

> Moreover the point of combinators is not to "interweave the grammatical
> structure with other computations". Combinators do do that, and I think
> it's a great benefit.

I expect that depends on the app:  sometimes a benefit, sometimes not.  I've
used systems of both kinds, and neither is a pure win.

> But the point is to:
>
> a) clearly separate sequencing code from other code. In a combinator
>    program the precise control flow of the program is clear.

In a way deeper than it is in a functional program not using monads?  IOW,
is the clarity of "control flow" due to monads specifically or just to
functional semantics in general?  AFAIK, "combinator" hasn't sprouted a new
meaning beyond its old "higher-order curried function" one.
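[By that old meaning, a combinator is just something like this two-liner -- my illustration, not anything from the papers:]

```python
# "Combinator" in the classic sense:  a higher-order curried function.
# compose takes a function and returns a function that takes a function.
def compose(f):
    return lambda g: lambda x: f(g(x))

inc = lambda x: x + 1
double = lambda x: x * 2
inc_then_double = compose(double)(inc)   # inc_then_double(3) == 8
```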

> b) allow programs to program their own sequencing combinators (rather
>    than providing some fixed set that the language designer hopes will
>    be enough.)

WRT parsing specifically, what kinds of desirable sequencing do you find
lacking in other approaches?  Any parsing framework caters more-than-less
directly to juxtaposition, choice, optionality, repetition and (at least
conceptual) backtracking.  I understand you can model all those with
combinators.  What I haven't seen is an example of something clearly useful
(wrt parsing specifically) *beyond* those.
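[To pin down what's being compared, here's a minimal sketch -- mine, nobody's production library -- of those five modeled as combinators, where a parser is any function from text to (value, rest) on success or None on failure:]

```python
def lit(s):                      # building block: match a literal string
    def p(text):
        if text.startswith(s):
            return s, text[len(s):]
        return None
    return p

def seq(p, q):                   # juxtaposition: p then q
    def r(text):
        a = p(text)
        if a is None:
            return None
        v1, rest = a
        b = q(rest)
        if b is None:
            return None
        v2, rest = b
        return (v1, v2), rest
    return r

def alt(p, q):                   # choice, with (conceptual) backtracking:
    def r(text):                 # if p fails, retry q on the same input
        return p(text) or q(text)
    return r

def opt(p):                      # optionality: succeed with None if p fails
    def r(text):
        return p(text) or (None, text)
    return r

def many(p):                     # repetition: zero or more p's
    def r(text):
        out = []
        while True:
            a = p(text)
            if a is None:
                return out, text
            v, text = a
            out.append(v)
    return r

# "ab*" as a grammar built from the primitives:
ab_star = seq(lit("a"), many(lit("b")))
```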

> ...
> I have been using combinators for some time now and never noticed the
> "grammar" being cluttered with all these "semantic annotations".

We often see most clearly that which we hate <wink>.  I saw 'em all over the
place in the papers.

> What annotations one requires can easily be separated from the grammar
> either by sensible layout, or by the design of clever combinators that
> hide semantic annotations away.

I mentioned indirectly that what drives me is variously "speed or
maintainability" in parsing systems.  "clever" is at odds with the latter --
unless all the cleverness comes built in to the system once & for all.

> ...
> Combinators are very robust under change. I have a parser written using
> combinators and I often tweak the grammar, without requiring large
> changes to other code or the parser itself.

That's encouraging!

> Have you ever written a combinator based parser or are you just
> "theorising"?

You ask that immediately before quoting my

> : I haven't tried using parser combinators, but in reading the papers ...

?  Prophetic <wink>.

> Ah there's the rub -- you haven't written a combinator based parser so
> your comments are just what you *think* might happen if you did.

Oh yes!  Parsing with combinators is a Porsche.  I've driven all kinds of
cars for thirty years, in various terrains and in all kinds of weather; even
built a few engines from metal I smolt myself <wink>, that other people
relied on.  I used functional languages extensively too, although very
lightly this decade.  So when I read the Porsche owner's manual it's not
exactly blind imagination driving my concerns.  It's certainly no substitute
for a test drive, though.  I installed Clean with the aim of doing that, but
had too much fun poking at other aspects of the system to get around to it.

> I don't know where you got the idea that speed is a problem!

*Can* be a problem.  Fernando covered that exhaustively, and you actually
haven't disagreed with his points for all your words to the contrary <wink>.
I also have much other experience crafting elegant functional solutions that
had to be tossed in practice for the same kinds of "can't automatically
optimize a huge nest of clever higher-order applications" reasons.

> For some kinds of parsing I wouldn't use combinators rather I would use
> automaton techniques. But for most kinds of parsing they are plenty
> fast enough.

"Most" depends on what you're doing.  For most things I do for fun, even
SNOBOL4 was plenty fast enough (and I don't doubt for an instant that even a
casually tossed-together combinator-based program is faster than that!).
Half the things I do for work deal with artificial languages that are
specifically designed for one-token lookahead techniques, because they're
designed for peak speed over megabytes of input; the other half is parsing
huge ambiguous grammars on cheap PCs via dynamic programming techniques, and
there we *claw* to save bytes and milliseconds, no holds barred.

Different strokes.  I believe I'd enjoy combinator approaches for my own
fun; I still doubt they'd be practical for my work (but I'd rather try that
than continue arguing about it <0.9 wink>).

> ...
> Each new example in these papers requires a different combinator because
> each new example illustrates some new sequencing idea -- eg. determinisation,
> continuation, ...
> The papers are illustrating the generality of the techniques, and hence
> you expect to see a lot of different combinators.

Yes, and I confess to over-panicking at what I indeed should have expected.

> In practice however (something you wouldn't know about right?) you don't
> use all these combinators.

Which casts some doubt on the great utility of being able to create new
ones.  Parsing is an old problem.  If only a handful of combinators are of
actual use in parsing problems, freeze the semantics, package them in a
library, and implement a pleasant parser generator on top of them <wink>.

> I have written a parser for a new programming language and I used 10
> combinators (5 of which I re-used in the type checker!).
> Yes you do have to build your own combinators (that's the point of the
> approach!), but once done you stick them in a library and re-use them.

That's fine for a one-man show.  "Maintainability" in my life cuts across
programmers, and it's Not Good for each person to reinvent their own basic
machinery -- even to have the ability to do so.  One reason I like Python is
that it *doesn't* let the jerk down the hall invent their own control
structures <0.5 wink>.

I will give parser combinators a try; they do sound interesting to me.  More
interesting would be to hear about that "new programming language"!

i-believe-you-can-parse-it<wink>-ly y'rs  - tim





