
On Tue, 30 Aug 2005 14:45:25 +0200 Dinu Gherman <gherman@darwin.in-berlin.de> wrote:
Being a fan of testing I'd like to suggest conducting some comparative tests between CPython and PyPy, as well. At least I find stuff like the following pretty "interesting". It's about using re for splitting strings at very large substrings of some minimum length (something I just used for processing AIFF audio files, the code here is slightly simpler):
Python 2.4 (#1, Feb 7 2005, 21:41:21)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1640)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import re
>>> n = 'o'
>>> l = int(1e5)
>>> inp = "012" + n*l + "abc" + n*l + "xyz"
>>> exp = ["012", "abc", "xyz"]
>>> res = re.split(n+'{%d,%d}'%(l, l), inp)
>>> exp == res
False
Dinu, that scared me deeply! So I stopped everything and tried it.

Python 2.3.5 (#1, Aug 11 2005, 10:10:19)
[GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> n = 'o'
>>> l = int(1e5)
>>> inp = "012" + n*l + "abc" + n*l + "xyz"
>>> exp = ["012", "abc", "xyz"]
>>> res = re.split(n+'{%d,%d}'%(l, l), inp)
>>> exp == res
False
Python 2.4.1 (#2, Mar 30 2005, 21:51:10)
[GCC 3.3.5 (Debian 1:3.3.5-8ubuntu2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> n = 'o'
>>> l = int(1e5)
>>> inp = "012" + n*l + "abc" + n*l + "xyz"
>>> exp = ["012", "abc", "xyz"]
>>> res = re.split(n+'{%d,%d}'%(l, l), inp)
>>> exp == res
True
So this seems to be a bug that was fixed in CPython after 2.4: it still shows up in 2.3.5 here, but no longer in 2.4.1.
There could be workarounds for this particular case, but the point is that PyPy can be "correct" in places where CPython is not (here probably because of limitations of the re machinery). And because they'd fail you would not expect to find such test cases in the "normal" test suites...
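For anyone who wants to repeat the comparison on other interpreters, here is a minimal sketch of the same check packaged as a script. The function names check_large_repeat_split and split_on_exact_runs are made up for illustration, and the str.split cross-check is only one possible workaround, valid because the pattern asks for exactly l repetitions of a single literal character. The script is written for a current Python and will not run as-is on the 2.3/2.4 interpreters quoted above.

```python
import re
import sys


def split_on_exact_runs(text, char, run_len):
    """Hypothetical workaround: split at runs of exactly run_len copies
    of char using plain str.split, bypassing the re machinery."""
    return text.split(char * run_len)


def check_large_repeat_split(run_len=int(1e5), char="o"):
    """Rebuild the test string from the session above and split it two ways.

    Returns (expected, via_re, via_str). If re rejects the pattern,
    via_re holds the exception instead of a list.
    """
    inp = "012" + char * run_len + "abc" + char * run_len + "xyz"
    expected = ["012", "abc", "xyz"]
    # Same pattern shape as in the session: char{run_len,run_len}
    pattern = re.escape(char) + "{%d,%d}" % (run_len, run_len)
    try:
        via_re = re.split(pattern, inp)
    except (re.error, OverflowError) as exc:
        via_re = exc
    via_str = split_on_exact_runs(inp, char, run_len)
    return expected, via_re, via_str


if __name__ == "__main__":
    expected, via_re, via_str = check_large_repeat_split()
    print(sys.version)
    print("re.split matches expected: ", via_re == expected)
    print("str.split matches expected:", via_str == expected)
```

Running the same file under each interpreter of interest and comparing the two printed lines gives a quick picture of where the re result and the plain-string result disagree.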
I'm a *big fan* of the pypy-team and of pypy itself. But I do not think __this particular case__ is a fair one for advertising PyPy as getting it right where CPython got it wrong.

best regards,
Rod Senra
rsenra _at_ acm.org