I've just submitted to SF #757624 the third implementation of the SRE changes. Unlike the other implementations, this one doesn't recurse at all, and doesn't change the meaning of the opcodes in the engine. Indeed, the original logic is mostly unchanged, and the recursion has been removed by a code reorganization. Most of the magic is done using "jumps", and "context saving" inside SRE_MATCH. It's even possible to reenable the recursive scheme by enabling the USE_RECURSION macro.

Besides the regression tests, I have tested this using Fredrik's suggestion, with a 4MB XML file and xmllib/tokenize modules. There was a very small slow down (i.e. about 37.4s in old implementation vs. 38.0s in new implementation). Bugs previously mentioned are also fixed in this implementation.

This implementation also includes the protection against zero-width match in greedy expressions ("(?=a)*", "a"). Even if this implementation doesn't get into 2.3, I was thinking about backporting this specific fix to 2.3. Unfortunately, this would need an additional local variable in SRE_MATCH of the current implementation, and I'm afraid to reduce the recursion limit even more by introducing it.

Additionally, after a quick look over perl's regex engine, I also have some ideas for matching optimization in SRE, but I'll wait a little bit before fiddling with it.

--
Gustavo Niemeyer
http://niemeyer.net
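For readers unfamiliar with the technique, the general idea of replacing recursion with saved contexts can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the patch: the function name `match_here` and the tiny pattern dialect (literal characters, `.`, and a greedy `x*`) are hypothetical, and the real SRE_MATCH works on compiled opcodes in C.

```python
def match_here(pattern: str, text: str) -> bool:
    """Anchored match of a toy dialect (literals, '.', greedy 'x*') with no recursion.

    Hypothetical illustration only; it does not mirror SRE's opcodes or data layout.
    """
    # Each saved context is (pattern index, text index): a point to resume
    # from when the current attempt reaches a dead end.
    stack = [(0, 0)]
    while stack:
        pi, ti = stack.pop()
        while True:
            if pi == len(pattern):
                return True  # the whole pattern has been consumed
            starred = pi + 1 < len(pattern) and pattern[pi + 1] == "*"
            can_consume = ti < len(text) and pattern[pi] in (".", text[ti])
            if starred:
                if can_consume:
                    # Greedy: save the "stop repeating here" alternative,
                    # then consume one more character.
                    stack.append((pi + 2, ti))
                    ti += 1
                else:
                    pi += 2  # the repeat cannot continue; move past 'x*'
            elif can_consume:
                pi += 1
                ti += 1
            else:
                break  # dead end: resume from the most recently saved context
    return False
```

Because the backtracking points live in an ordinary list rather than on the call stack, an input such as `match_here("a*b", "a" * 50000 + "b")` is limited by available memory and not by the interpreter's recursion limit.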
I have reasons (unfortunately under NDA -- I hope I'll be able to talk about it soon) for wanting to have a very stable 2.3 out by August 1st. Experiments with _sre are incompatible with this goal. I'm trying to get resources to release 2.3b2 on June 30. Who can help?

--Guido van Rossum (home page: http://www.python.org/~guido/)
On Friday, Jun 27, 2003, at 16:55 Europe/Amsterdam, Tim Peters wrote:
[Guido]
... I'm trying to get resources to release 2.3b2 on June 30. Who can help?
Barry and I can work on it Sunday (June 29), but not today or Monday. So we should shoot for the 29th.
I can help tonight and tomorrow night (i.e. Saturday and Sunday, MET). I might be able to put in some time on Monday, but this is unsure. I will start with bug 762147, and (if Barry or someone else from Python Labs can give me the okay for a location somewhere on www.python.org) 762150.

--
Jack Jansen <Jack.Jansen@oratrix.com> http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma Goldman
On Sat, 2003-06-28 at 16:25, Jack Jansen wrote:
I will start with bug 762147, and (if Barry or someone else from Python Labs can give me the okay for a location somewhere on www.python.org) 762150.
Jack, did you see my pvt reply on this from last night? Once again, I have a gig tonight (Cravin' Dogs CD release party <shameless plug> :), but I'll be available most of the day tomorrow. I can't do the python.org twiddling until then. I suggest that folks coordinating on the Python release meet up on irc.freenode.net #python-dev. I haven't talked to Tim yet, but expect a cvs freeze tomorrow -- let's say by 12:00 noon EDT. -Barry
[Guido]
... I'm trying to get resources to release 2.3b2 on June 30. Who can help?
Barry and I can work on it Sunday (June 29), but not today or Monday. So we should shoot for the 29th.
OK, let's do it Sunday the 29th.

--Guido van Rossum (home page: http://www.python.org/~guido/)
participants (5)
- Barry Warsaw
- Guido van Rossum
- Gustavo Niemeyer
- Jack Jansen
- Tim Peters