
On March 3, 2013 2:20 AM, Carl Friedrich Bolz wrote:
Are you *sure* you are running on a 64 bit machine?
Sure? No. I assumed it's a 64-bit PyPy because it was generating x86_64 instructions. How would you check for sure?
uname reports x86_64 on the machine I built pypy on.
$ pypy --version
Python 2.7.3 (42c0d1650cf4, Feb 23 2013, 01:53:42)
[PyPy 2.0.0-beta1 with GCC 4.6.3]
That doesn't show the machine size.
pypy --info is interesting, but doesn't help either.
When I run diz.py on a 64 bit machine, the BINARY_XOR bytecodes turn into int_xor low-level operations, as expected.
I would like to see what you see using jitviewer since it differs from what I pasted (and maybe to figure out why).
Anyway, to debug where low and high turn into Python longs, you can put the following properties in arithmetic32.Encoder:
...
That was some clever code. I like it. :)
In 32 bit they obviously trigger because of the line

self.low = (self.low << 1) & 0xffffffff

(0xffffffff is a Python long on 32 bit).
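A minimal Python 2 sketch of the promotion (the literal alone already decides the type):

import sys
print sys.maxint                    # 2147483647 on a 32-bit build
print type(0xffffffff)              # <type 'long'> on 32 bit, <type 'int'> on 64 bit
print type((1 << 1) & 0xffffffff)   # long on 32 bit too: int & long gives long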
Specifically, it's a long because 0xffffffff is an unsigned 32-bit value and thus can't fit in a signed 32-bit int. There's no hope for the 32-bit version unless more int types are added beyond Python's two, specifically a uint32 type. And that's not Python. Except, I did notice that numpy has this type and more:
https://github.com/numpy/numpy/blob/master/numpy/core/include/numpy/npy_comm...
It's conceivable that the efforts to bring numpy into pypy will be dealing with these various int sizes and could better support uint32 code like that in diz. However, I suspect it's not a huge win because the int operations will remain function calls instead of single x86 instructions. That's the real pain, on all platforms.
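For what it's worth, a minimal sketch of that numpy type under CPython, assuming numpy is installed; the arithmetic wraps at 32 bits instead of promoting:

import numpy as np

low = np.uint32(0x80000000)
low = (low << np.uint32(1)) & np.uint32(0xffffffff)   # wraps to 0, stays uint32
print type(low), low                                  # <type 'numpy.uint32'> 0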
Anyways, I ran the code and it works. Everywhere. And I've finally convinced myself to stop abusing the dict with millions of items, so I've got more stuff to do. :)
-Roger

import sys
print sys.maxint
On Mon, Mar 4, 2013 at 10:42 AM, Roger Flores aidembb@yahoo.com wrote:
[full quote of Roger's message snipped]

On 03/04/2013 09:42 AM, Roger Flores wrote:
[quote of Roger's message snipped]
just a wild guess: is it possible that you generated pyc files with a 32bit version of pypy and then imported it on a 64bit one?
For example, suppose you have this foo.py:
def foo():
    return 2147483648

print type(foo())
if you import it on 32bit, it prints 'long' and generates a pyc file. If you then import 'foo' on 64bit, it still prints 'long', but if you remove the pyc and import again, it prints 'int'. (This happens because 2147483648 is stored as a long inside the marshalled pyc file).
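A minimal sketch of the same mechanism using the marshal module directly (pyc files store code-object constants in marshal format; Python 2):

import marshal

# Within one interpreter the round-trip is consistent:
print type(marshal.loads(marshal.dumps(2147483648)))
# -> int on a 64-bit build, long on a 32-bit build

# The cross-build surprise comes from dumping on 32 bit, where 2147483648
# is already a long and gets the TYPE_LONG marshal code; TYPE_LONG always
# unmarshals as a Python long, even on a 64-bit interpreter.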
ciao, Anto

Hi Anto,
On Mon, Mar 4, 2013 at 10:20 AM, Antonio Cuni anto.cuni@gmail.com wrote:
(This happens because 2147483648 is stored as a long inside the marshalled pyc file).
Riiiight, absolutely. How about deciding that this is a bug and fixing it in PyPy (even if it's the CPython behavior)? It can even be done while keeping the same .pyc file, if we make a simple extension to the .pyc file format for "integers without an L suffix that fit in 64-bit". (The .pyc files are anyway different from CPython's. Moreover, we need to propagate this information from the compiler, which means it would only be used when marshalling code objects, never directly integers --- important for compatibility with CPython, when "marshal" is used directly.)
A bientôt,
Armin.

On 03/04/2013 10:20 AM, Antonio Cuni wrote:
[quote of Anto's pyc cross-import explanation snipped]
ISTM such a cross-import should print a warning or something (or maybe silently reimport?) even if you don't want to call it a bug.
<rant_warning> Something bothers me about calling alternate representations different "types." ISTM HLL types are (or should be) abstract, and concrete representation choices are implementation details.
I'm ok with "type" from the vocabulary of C, where it *does* (now that C99 supplies specific representation-sized typedefs without having to hack them ourselves ;-) map to choices of representation of different abstract types.
But -- I am not comfortable with type(2**31) printing two different things in the same *HL* language, unless it is an introspection into the implementation, in which case I think it ought not to be called "type" but instead maybe for example "rtype" (for representation-type), and let "type" be reserved for abstract types of the language. Thus we might get, e.g. (faked OTTOMH):

>>> type(2**31)    # on either 64 or 32 bit interpreter
<type 'int'>
>>> type(2**63)    # on either 64 or 32 bit interpreter
<type 'int'>

and current behavior extended via "rtype":

>>> rtype(2**31)   # on 64 bit interpreter
<rtype 'int64_t'>
>>> rtype(2**31)   # on 32 bit interpreter
<rtype 'intBInnn'>  # BI for BigInt of some design nnn
>>> rtype(2**30)   # on 32 bit interpreter
<rtype 'int32_t'>
>>> rtype(2**63)   # on either 64 or 32 bit interpreter
<rtype 'intBInnn'>  # BI for BigInt, nnn to identify different versions if desired

By the same token, arguably an abstract string is a sequence of abstract characters, and

>>> type(''), type(u'')
(<type 'string'>, <type 'string'>)   # not (<type 'str'>, <type 'unicode'>)

whereas rtype could reveal strings as being represented as byte sequences with latin1 or dingbat encoding, or utf8 or various other unicode or wchar_t encodings, etc.
Might be interesting if rtype could track jitted and ffi representations too, and still have type say the right thing for type(thing). Sorry, got carried away ;-/ </rant_warning>
BTW, what happens if you import in the opposite order? Can the 32-bit interpreter process the 64-bit pyc?
Regards, Bengt Richter

Hi Bengt,
On Mon, Mar 4, 2013 at 4:32 PM, Bengt Richter bokr@oz.net wrote:
</rant_warning>
That's an issue of the Python language, which we won't address here. Note also that Python 3 goes a bit in the direction you describe.
BTW, what happens if you import in the opposite order? Can the 32-bit interpreter process the 64-bit pyc?
Yes: in this case the 64-bit Python (either CPython or PyPy) will store the 64-bit "int", and the 32-bit Python will go ``oups, a 64-bit int, I'm going to load it as a "long"''.
Actually the only thing we could change (assuming we want to fix it) would be, on 32-bit, to also generate such a 64-bit int for code objects that contain 64-bit Python constants without the "L" suffix.
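A quick way to see which representation was chosen, as a sketch (in CPython 2's marshal format the first byte of the output is the type code; PyPy's pyc format is close but, as noted above, not identical):

import marshal
print repr(marshal.dumps(2147483648)[0])
# 'I' (TYPE_INT64) when dumped on a 64-bit build, where the value is an int
# 'l' (TYPE_LONG) when dumped on a 32-bit build, where it is already a long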
A bientôt,
Armin.

On Mon, Mar 4, 2013 at 1:13 AM, Maciej Fijalkowski wrote:
print sys.maxint
print hex(sys.maxint)
0x7fffffffffffffff
That works. Not so obvious though.
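For the record, a couple of other ways to check the build's word size, as a sketch:

import struct, platform
print struct.calcsize("P") * 8     # native pointer width in bits: 64 or 32
print platform.architecture()[0]   # e.g. '64bit'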
On Monday, March 4, 2013 1:20 AM, Antonio Cuni wrote:
just a wild guess: is it possible that you generated pyc files with a 32bit version of pypy and then imported it on a 64bit one?
Ding! That's exactly what was afflicting me. I had no idea the pyc files would not get refreshed on the platform change.
After 'rm *pyc' I now see substantially better code! Importantly, the arithmetic function calls are reduced to the single x86 instructions that Carl Friedrich saw.
For the record, this is what I now see:
BINARY_XOR
i39 = ((pypy.objspace.std.intobject.W_IntObject)p35).inst_intval [pure]
    mov r8,QWORD PTR [r10+0x8]
i40 = ((pypy.objspace.std.intobject.W_IntObject)p37).inst_intval [pure]
    mov rdi,QWORD PTR [r13+0x8]
i41 = int_xor(i39, i40)
    mov rcx,r8
    xor r8,rdi
LOAD_CONST 2147483648
BINARY_AND
i43 = i41 & 2147483648
    mov r11d,0x80000000
    and r8,r11
LOAD_CONST 0
COMPARE_OP ==
i45 = i43 == 0
    cmp r8,0x0
    jne 0x9468582b
It's strange the AND doesn't use the imm32 form like the COMPARE does. It's unfortunate the COMPARE doesn't suppress the cmp and reuse the zero flag from the prior instruction.
Looking at the self.low and self.high references prior to the XOR, I'm assuming the null guard is there because their type is PyInt_Type, which can always be None (Python semantics). But I don't understand why that work isn't moved out of the loop (or maybe it is and I just can't tell).
Thanks everyone, -Roger
participants (5)
- Antonio Cuni
- Armin Rigo
- Bengt Richter
- Maciej Fijalkowski
- Roger Flores