print "%X" % id(object()) not so nice

I think id() should never return a negative number. Both of these behaviors are poor:

In 2.3:

In 2.4:

    print "%X" % id(o)
    -5FC84D08

Pointers are conventionally never treated or printed as signed. In 2.3 and before, it usually ended up okay, apart from the warning, because "%X" had broken behavior. In 2.4, it now ends up doing the wrong thing and printing a confusing value.

I propose that id() always return a positive value. This means that it will sometimes have to return a long instead of an int, but it already does that under some circumstances on some architectures.

Comments?

James
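
For readers on a current interpreter, the formatting complaint can be reproduced without a real negative id(). The following is only a sketch in Python 3: it assumes a 32-bit pointer width and reuses the value from the 2.4 output quoted above as a stand-in. The names hex_signed and hex_unsigned are hypothetical helpers, not anything proposed in this thread.

```python
# Stand-in for a negative id() as seen on a 32-bit CPython 2.4 build.
# The value comes from the output quoted above; the 32-bit width is an assumption.
SAMPLE_ID = -0x5FC84D08
POINTER_BITS = 32


def hex_signed(value):
    """Format as "%X" does in 2.4 and later: a sign followed by the magnitude."""
    return "%X" % value


def hex_unsigned(value, bits=POINTER_BITS):
    """Format the same bit pattern as an unsigned, pointer-sized number."""
    return "%X" % (value & ((1 << bits) - 1))


print(hex_signed(SAMPLE_ID))
print(hex_unsigned(SAMPLE_ID))
```

The second form is what one would expect to see next to a pointer printed by a C-level tool, which is the mismatch the proposal is about.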

"James Y Knight" <foom@fuhm.net> wrote in message news:582E4A36-3A6C-11D9-AA57-000A95A50FB2@fuhm.net...
1. CPython intentionally searches builtins after globals and pre-imports the former as __builtins__ just so one can wrap a builtin to modify its apparent behavior for personal needs.
2. Given that, why bother changing the language for what must be an esoteric need (printing and viewing formatted ids)? The id of an object is constant and unique with respect to contemporaneous objects but, for CPython, definitely not with respect to objects with non-overlapping lifetimes. (Newbies often get tripped up by the last fact.)
For convenience, CPython uses the apparent address stored as an int. But this is strictly an implementation detail. On modern systems, that 'address' is, I believe, a process-specific virtual address which the hardware memory management system maps to the hidden real address. That mapping is the only reason why systems with less than 2**31 bytes of memory can have addresses at or above 2**31, which then become negative ints.

Terry J. Reedy
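
The arithmetic behind "addresses at or above 2**31 become negative ints" can be sketched as follows. This is an illustration in Python 3 that assumes a 32-bit signed C long; as_signed32 is a hypothetical name and the sample addresses are made up for the example.

```python
def as_signed32(address):
    """Reinterpret an unsigned 32-bit address as a signed 32-bit int."""
    if not 0 <= address < 2**32:
        raise ValueError("address does not fit in 32 bits")
    # Values with the top bit set wrap around to the negative range.
    return address - 2**32 if address >= 2**31 else address


# A virtual address below 2**31 is unchanged ...
print(as_signed32(0x00868158))
# ... while one at or above 2**31 comes out negative.
print(as_signed32(0xA037B2F8))
```

Nothing here depends on how much physical memory the machine has; only the position of the virtual address relative to 2**31 matters.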

On Nov 20, 2004, at 2:38 AM, Terry Reedy wrote:
The problem, more than anything else, is the following behavior that can happen during a random __repr__ or repr-like function if the object happens to fall in a certain address range:

- (Python 2.3) You get an unexpected and unwanted warning, but the expected output anyway
- (Python 2.4) You get a repr with a strange-looking negative hex number (0x-FF0102)

Neither of these is fatal, of course; it's just annoying. I find the Python 2.3 behavior more obnoxious than Python 2.4's, personally.

FYI, I have also encountered this "problem" this week on a Powerbook G4 with only 1GB of physical memory, on both Python 2.3 and 2.4. I'm at the PyPy sprint, and a lot of the tools we are using make use of repr. Fortunately we have control over all of this code, so I checked in a workaround that makes sure a sane value is passed to the hex formatter:

    import sys
    HUGEINT = (sys.maxint + 1L) * 2L

    def uid(obj):
        """
        Return the id of an object as an unsigned number so that its
        hex representation makes sense
        """
        rval = id(obj)
        if rval < 0:
            rval += HUGEINT
        return rval

-bob
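
The workaround above is Python 2 code (sys.maxint and the L suffix no longer exist in Python 3). A rough Python 3 counterpart might look like the sketch below. It assumes that the pointer width reported by struct.calcsize("P") is the right width to fold into, and to_unsigned is a hypothetical helper added for illustration, not part of the checked-in workaround.

```python
import struct

# Width of a C pointer on this build, in bits ("P" is struct's void * code).
POINTER_BITS = struct.calcsize("P") * 8


def to_unsigned(value, bits=POINTER_BITS):
    """Fold a possibly negative integer into the unsigned range of `bits` bits."""
    return value & ((1 << bits) - 1)


def uid(obj):
    """Return id(obj) as an unsigned number so that its hex form makes sense."""
    return to_unsigned(id(obj))


# The negative value from the 2.4 example, read as a 32-bit pointer.
print("0x%X" % to_unsigned(-0xFF0102, 32))
print("0x%X" % uid(object()))
```

Masking gives the same result as adding HUGEINT for negative values in range, and leaves values that are already non-negative and in range untouched.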

"Bob Ippolito" <bob@redivi.com> wrote in message news:D3128502-3A8E-11D9-925A-000A9567635C@redivi.com...
Non-CS users probably find *all* hex numbers a little strange looking. If CPython were to simply print ids as decimal integers, instead of being fancy with hex 'addresses', there would have been no warnings and no change ;-).

Is the absolute hex value ever of any use? If so, how often?

Terry J. Reedy

"Bob Ippolito" <bob@redivi.com> wrote in message news:894CD7B9-
On Nov 20, 2004, at 5:03 AM, Terry Reedy wrote:
Is the absolute hex value ever of any use? If so, how often?
It makes it quite easy to match pdb output with gdb output! :)
Ah, the missing use case, which you and the OP probably took for granted. I, on the other hand, having never used either, find the difference in printed ids in
at least mildly disturbing. Do you only need to do such matching for complex objects that get the <type name at 0x########> representation? Terry J. Reedy

"Terry Reedy" <tjreedy@udel.edu> writes:
This hardly seems worth discussing :) It's a pointer. Pointers are printed in hex. It's Just The Way It Is. I don't know why.

Actually, the "0x00868158" above is produced by C's %p format operator. So, in fact, ANSI C is probably why it is The Way It Is.

Cheers,
mwh

--
Remember - if all you have is an axe, every problem looks like hours of fun. -- Frossie -- http://home.xnet.com/~raven/Sysadmin/ASR.Quotes.html

[Terry Reedy]
[Michael Hudson]
This hardly seems worth discussing :)
Then it's a topic for me <wink>!
repr starts with %p, but %p is ill-defined, so Python goes on to ensure the result starts with "0x". C doesn't even say that %p produces hex digits, but all C systems we know of do(*), so Python doesn't try to force that part.

As to "why hex?", it's for low-level debugging. For example, stack, register and memory dumps for binary machines almost always come in some power-of-2 base, usually hex, and searching for a stored address is much easier if it's shown in the same base.

OTOH, id(Q) promises to return an integer that won't be the same as the id() of any other object over Q's lifetime. CPython returns Q's memory address, but CPython never moves objects in memory, so CPython can get away with returning the address. Jython does something very different for id(), because it must -- the Java VM may move an object in memory.

Python doesn't promise to return a positive integer for id(), although it might have been nicer if it did. It's dangerous to change that now, because some code does depend on the "32 bit-ness as a signed integer" accident of CPython's id() implementation on 32-bit machines. For example, code using struct.pack(), or code using one of ZODB's specialized int-key BTree types with ids as keys.

Speaking of which, current ZODB has a positive_id() function, used to format id()s in strings where a sign bit would get in the way.

(*) The %p in some C's for early x86 systems, using "segment + offset" mode, stuck a colon "in the middle" of the pointer output, to visually separate the segment from the offset. The two parts were still shown in hex, though.
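
The struct.pack() dependency mentioned above can be illustrated with a short sketch. It is written for Python 3, assumes a 32-bit id, and uses a made-up sample value; it only shows the general hazard and is not taken from any of the code bases named in this thread.

```python
import struct

signed_id = -0x5FC84D08          # hypothetical id() as a signed 32-bit value
unsigned_id = signed_id + 2**32  # the same bit pattern, read as unsigned

# The signed value packs with the signed 32-bit format code.
packed = struct.pack("=i", signed_id)

# The unsigned value no longer fits the signed format code.
try:
    struct.pack("=i", unsigned_id)
    fits_signed = True
except struct.error:
    fits_signed = False

# It needs the unsigned code instead, which yields the same four bytes.
same_bytes = struct.pack("=I", unsigned_id) == packed

print(fits_signed, same_bytes)
```

Code that hard-codes the signed format code would therefore start failing if id() began returning values at or above 2**31, even though the bytes it wants are unchanged.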
participants (7)
- "Martin v. Löwis"
- Bob Ippolito
- James Y Knight
- Michael Hudson
- Scott David Daniels
- Terry Reedy
- Tim Peters