Lazy string with re.match?

Magnus Lie Hetland mlh at furu.idi.ntnu.no
Mon Apr 7 18:19:42 EDT 2003


I'm using regular expressions in a semi-recursive-descent parser, and
I'd like to read the data in a lazy manner. However, since I don't
know how much of the data a given regexp will need to find out whether
it matches or not, I thought about writing a lazy string class and
passing that to the match method of the various regexps. I first tried
simply implementing __getitem__, but was told in no uncertain terms
that a string or buffer was expected by the re object. So... I tried
to subclass str, but that didn't help -- it seemed to simply ignore my
__getitem__ method (not surprisingly, perhaps).
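For reference, the two attempts described above can be reproduced with
a small sketch like the following. It assumes a current CPython 3
interpreter rather than the 2003-era one, and the class names LazyStr
and NotAString are made up for illustration. The first object only
implements the sequence protocol and is rejected with a TypeError; the
second is a str subclass whose __getitem__ is never consulted during
matching, because the engine reads the underlying string data directly.

```python
import re

class NotAString:
    """Hypothetical sequence-like object that is not a str or buffer."""
    def __len__(self):
        return 4
    def __getitem__(self, index):
        return "aaab"[index]

class LazyStr(str):
    """Hypothetical str subclass that counts __getitem__ calls."""
    calls = 0
    def __getitem__(self, index):
        LazyStr.calls += 1
        return str.__getitem__(self, index)

pattern = re.compile(r"a+b")

# Attempt 1: an object with only __getitem__/__len__ is refused.
try:
    pattern.match(NotAString())
    rejected = False
except TypeError:
    rejected = True

# Attempt 2: a str subclass is accepted, but the overridden
# __getitem__ is not called while the match is being computed.
m = pattern.match(LazyStr("aaab"))
calls_during_match = LazyStr.calls

print(rejected, m is not None, calls_during_match)
```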

Does anyone have any info on how (if at all) I might trick a regexp into
using my __getitem__ method, or something similar?

The goal is, I suppose, to either let the regexp pump the lazy string
on its own, or to check whether it exhausted the given string material
(which may be a completely ordinary string) and, in that case, give it
some more until it either matches or does _not_ exhaust the string...
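
A minimal sketch of the second idea, feeding the regexp more material
whenever it appears to have used up the buffer, might look like the
code below. The function name match_incremental and the chunk_size
parameter are hypothetical. Note that this is only a heuristic: the
standard re module does not report whether the engine hit the end of
the input while trying, so "exhausted" is approximated here by "the
match ends exactly at the end of the buffer". A pattern such as
(abc)? can therefore settle on a shorter match before the rest of its
input has arrived, and a failed match forces the whole stream to be
read, since there is no way to tell whether more data would have
helped.

```python
import io
import re

def match_incremental(pattern, stream, chunk_size=4096):
    """Match pattern at the start of stream, reading chunks on demand.

    Returns (match_or_None, text_read_so_far).
    """
    buf = ""
    while True:
        m = pattern.match(buf)
        # A match that stops before the end of the buffer is taken
        # as final (heuristic, see the caveats above).
        if m is not None and m.end() < len(buf):
            return m, buf
        # Otherwise the regexp may want more input: read another chunk.
        chunk = stream.read(chunk_size)
        if not chunk:
            # End of data: whatever we have now is the final answer.
            return m, buf
        buf += chunk

stream = io.StringIO("hello world, and a long tail that need not be read")
m, buf = match_incremental(re.compile(r"\w+"), stream, chunk_size=4)
print(m.group(), len(buf))

# A pattern that never matches ends up consuming the whole stream.
no_match, seen = match_incremental(re.compile(r"\d+"), io.StringIO("abc"), chunk_size=4)
```

The caller keeps the returned buffer, so anything read past the end of
the match can be handed to the next regexp in the parser.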

Any ideas?

(Since I'm not working with huge files at the moment, simply reading
in the entire thing as a single string is probably the easiest and
best solution, though...)

-- 
Magnus Lie Hetland               "Nothing shocks me. I'm a scientist." 
http://hetland.org                                   -- Indiana Jones



