some next steps (was: Re: [pypy-dev] Release)

Rodrigo Dias Arruda Senra rodsenra at gpr.com.br
Tue Aug 30 15:04:29 CEST 2005


On Tue, 30 Aug 2005 14:45:25 +0200
Dinu Gherman <gherman at darwin.in-berlin.de> wrote:

> Being a fan of testing I'd like to suggest conducting some compara-
> tive tests between CPython and PyPy, as well. At least I find stuff
> like the following pretty "interesting". It's about using re for
> splitting strings at very large substrings of some minimum length
> (something I just used for processing AIFF audio files, the code
> here is slightly simpler):
> 
>      Python 2.4 (#1, Feb  7 2005, 21:41:21)
>      [GCC 3.3 20030304 (Apple Computer, Inc. build 1640)] on darwin
>      Type "help", "copyright", "credits" or "license" for more information.
>      >>>
>      >>> import re
>      >>> n = 'o'
>      >>> l = int(1e5)
>      >>> inp = "012" + n*l + "abc" + n*l + "xyz"
>      >>> exp = ["012", "abc", "xyz"]
>      >>> res = re.split(n+'{%d,%d}'%(l, l), inp)
>      >>> exp == res
>      False

Dinu, that scared me deeply! So I stopped everything and tried it.

Python 2.3.5 (#1, Aug 11 2005, 10:10:19)
[GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> n = 'o'
>>> l = int(1e5)
>>> inp = "012" + n*l + "abc" + n*l + "xyz"
>>> exp = ["012", "abc", "xyz"]
>>> res = re.split(n+'{%d,%d}'%(l, l), inp)
>>> exp == res
False

Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
[GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> n = 'o'
>>> l = int(1e5)
>>> inp = "012" + n*l + "abc" + n*l + "xyz"
>>> exp = ["012", "abc", "xyz"]
>>> res = re.split(n+'{%d,%d}'%(l, l), inp)
>>> exp == res
True

So this seems to be a bug that was fixed in CPython after 2.4: it still
shows up in 2.3.5 and in your 2.4, but it is gone in 2.4.1.
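
To make the comparison easier to repeat on other interpreters, the session
above could be wrapped in a small script along these lines. This is only a
sketch: the function name split_on_run and its parameters are made up for
illustration, and the pattern is the same one used in the sessions above.

```python
import re
import sys


def split_on_run(sep_char="o", run_length=int(1e5)):
    """Split a test string on runs of exactly run_length copies of sep_char."""
    run = sep_char * run_length
    inp = "012" + run + "abc" + run + "xyz"
    # Same pattern as in the interactive sessions, e.g. 'o{100000,100000}'.
    pattern = re.escape(sep_char) + "{%d,%d}" % (run_length, run_length)
    return re.split(pattern, inp)


if __name__ == "__main__":
    expected = ["012", "abc", "xyz"]
    ok = split_on_run() == expected
    # Report the interpreter version next to the result so runs can be compared.
    sys.stdout.write("%s: %s\n" % (sys.version.split()[0], ok))
```

Running the same file under each interpreter of interest gives one line per
version, which is easier to compare than separate interactive sessions.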

> There could be workarounds for this particular case, but the point is
> that PyPy can be "correct" in places where CPython is not (here prob-
> ably because of limitations of the re machinery). And because they'd
> fail you would not expect to find such test cases in the "normal"
> test suites...

I'm a *big fan* of the pypy-team and of pypy itself. But I do not think
__this particular case__ is a fair one for advertising PyPy getting it right
where CPython got it wrong.

best regards,
Rod Senra
rsenra _at_ acm.org
 


