[pypy-issue] Issue #2207: Incorrect slicing of strings in a tight loop (pypy/pypy)

Helder Eijs issues-reply at bitbucket.org
Sun Dec 13 14:45:07 EST 2015

New issue 2207: Incorrect slicing of strings in a tight loop

Helder Eijs:

With PyPy 4.0 and 4.0.1, I get incorrect results for the following sample code, at least on x86_64:


    from binascii import hexlify
    from Crypto.Cipher import AES

    cipher = AES.new(b'0'*16, AES.MODE_ECB)

    result = b'0'*16
    for x in xrange(10000):
        tmp = cipher.encrypt(result[:16])
        assert len(tmp) == 16
        result = tmp[8:] + tmp[:8]

    print "Result before slicing:", hexlify(tmp)
    print "Result after slicing :", hexlify(result)


The Crypto package is provided by [pycryptodome](https://pypi.python.org/pypi/pycryptodome/) (a pycrypto fork, v3.3.1). The problem only appears when the iteration count is sufficiently high; it does not show up with lower counts.

The expected result (which I obtain with CPython 2.x and 3.x and PyPy 2.6.1) is:

Result before slicing: dce8c9ca76d5bd2b82ac0d53d7c7a1c7
Result after slicing : 82ac0d53d7c7a1c7dce8c9ca76d5bd2b


However, with PyPy 4.0 and 4.0.1 I get garbage that varies per iteration, for example:

Result before slicing: c7ad819348c586d8d03fe18e0dc928b2
Result after slicing : 48000000000000000000000000000000

Other values are possible. Pycryptodome makes heavy use of cffi internally, but the variable **tmp** is a simple byte string (created with the idiomatic `ffi.buffer(xxx)[:]`).
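As a sanity check on the two outputs quoted above, the half-swap `tmp[8:] + tmp[:8]` can be recomputed directly from the printed "before" hex strings. This is only a sketch: the helper name `swap_halves` is made up for illustration, and it does not need pycryptodome at all.

```python
from binascii import hexlify, unhexlify

def swap_halves(hex_block):
    # Hypothetical helper: apply tmp[8:] + tmp[:8] to a 16-byte block
    # given as a hex string, and return the result as a hex string.
    raw = unhexlify(hex_block)
    assert len(raw) == 16
    return hexlify(raw[8:] + raw[:8]).decode('ascii')

# "Before slicing" values copied from the two outputs in this report.
good_before = "dce8c9ca76d5bd2b82ac0d53d7c7a1c7"
bad_before = "c7ad819348c586d8d03fe18e0dc928b2"

print(swap_halves(good_before))
print(swap_halves(bad_before))
```

The first line printed matches the "after slicing" value of the expected result, while the second does not match the `48000000...` value that PyPy 4.0 reports, so the wrong output is not explained by the slicing expression itself.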

Small tweaks of the code above seem to fix the problem for some unknown reason.

For instance, by replacing:

    tmp = cipher.encrypt(result[:16])

with the functionally equivalent:

    tmp = cipher.encrypt(result)

I get the right result.
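Since the failure only appears at high iteration counts, it may help to find the first iteration at which the slice goes wrong. The following is a rough sketch of such a check, not a confirmed reproducer: `first_bad_iteration` and `step` are hypothetical names, and `step` uses `hashlib.md5` purely as a stand-in that returns 16 bytes. To exercise the actual problem, `cipher.encrypt` from the sample code would presumably have to be passed in instead, because the original involves a cffi-backed string.

```python
from __future__ import print_function
import hashlib

def step(block):
    # Stand-in for cipher.encrypt(): NOT AES, just something that
    # returns 16 bytes without needing a third-party package.
    return hashlib.md5(block).digest()

def first_bad_iteration(encrypt, rounds=10000):
    # Run the loop from the report and compare each sliced result with
    # the same swap computed through a bytearray copy of tmp.
    result = b'0' * 16
    for i in range(rounds):
        tmp = encrypt(result[:16])
        assert len(tmp) == 16
        result = tmp[8:] + tmp[:8]
        copy = bytearray(tmp)
        expected = bytes(copy[8:] + copy[:8])
        if result != expected:
            return i  # first iteration where the slice is wrong
    return None  # no mismatch observed

print(first_bad_iteration(step))
```

On an interpreter that slices correctly this should print `None`. Whether the bytearray-based comparison is itself unaffected on PyPy 4.0 is an assumption.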
