tim wrote:
OTOH, arbitrary small integers are not Pythonic. Your example *generates* them in order to guarantee they're unique, which is a bad sign.
this feature itself has been on the todo list for quite a while; the (?P#n) syntax just exposes the inner workings (the "small integer" is simply something that fits in a SRE_CODE word). as you say, it's probably a good idea to hide it a bit better...
    for phrase, action in lexicon:
        p.append("(?:%s)(?P#%d)" % (phrase, len(p)))
How about instead enhancing existing (?P<name>pattern) notation, to set a new match object attribute to name if & when pattern matches? Then arbitrary info associated with a named pattern can be gotten at via dicts via the pattern name, & the whole mess should be more readable.
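For illustration, here is a minimal sketch of the named-group idea using the `lastgroup` attribute that match objects in today's `re` module provide. The lexicon, the group names, and the `tokenize` helper are hypothetical, and nothing in this thread confirms that this is the form the proposal eventually took.

```python
import re

# Hypothetical lexicon: (group name, phrase) pairs chosen only for illustration.
lexicon = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("SPACE", r"\s+"),
]

# Wrap each phrase in a named group and join them into one alternation.
pattern = re.compile(
    "|".join("(?P<%s>%s)" % (name, phrase) for name, phrase in lexicon)
)

def tokenize(text):
    """Return (group name, matched text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = pattern.match(text, pos)
        if m is None:
            raise ValueError("no phrase matches at position %d" % pos)
        # m.lastgroup is the name of the alternative that matched.
        if m.lastgroup != "SPACE":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Calling `tokenize("x1 42")` yields `[("NAME", "x1"), ("NUMBER", "42")]`, and the group name can then serve as a key into a dict of actions.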
good idea. and fairly easy to implement, I think. on the other hand, that means creating more real groups. and groups don't come for free... maybe this functionality should only be available through the scanner class? it can compile the patterns separately, and combine the data structures before passing them to the code generator. a little bit more code to write, but fewer visible oddities.
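As a rough sketch of the scanner-class approach, current CPython ships an undocumented `re.Scanner` that takes a lexicon of (phrase, action) pairs. Whether it matches the snapshot mentioned in this thread is an assumption, and the lexicon and token tags below are made up for illustration.

```python
import re

# re.Scanner is undocumented in CPython, so treat this usage as an assumption.
# Each action is called with (scanner, matched_text); None means "skip".
scanner = re.Scanner([
    (r"\d+", lambda s, tok: ("NUMBER", int(tok))),
    (r"[A-Za-z_]\w*", lambda s, tok: ("NAME", tok)),
    (r"\s+", None),
])

# scan() returns the list of action results plus the unmatched remainder.
tokens, remainder = scanner.scan("x1 42 ?")
```

Here `tokens` is `[("NAME", "x1"), ("NUMBER", 42)]` and `remainder` is `"?"`. No group names or small integers appear in the user-visible patterns, which is the point of hiding the mechanism behind the scanner.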
On the third hand, I'm really loath to add more gimmicks to stinking regexps. But, on the fourth hand, no alternative yet has proven popular enough to move away from those suckers.
if-you-can't-get-a-new-car-at-least-tune-up-the-old-one-ly y'rs - tim
hey, SRE is a new car. same old technology, though. only smaller ;-) btw, if someone wants to play with this, I just checked in a new SRE snapshot. a little bit of documentation can be found here: http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954 </F>