[Python-Dev] builtin_id() returns negative numbers

Mon Feb 14 16:41:35 CET 2005

[Troels Walsted Hansen]
> The Python binding in libxml2 uses the following code for __repr__():
>
> class xmlNode(xmlCore):
>     def __init__(self, _obj=None):
>         self._o = None
>         xmlCore.__init__(self, _obj=_obj)
> 
>     def __repr__(self):
>         return "<xmlNode (%s) object at 0x%x>" % (self.name, id (self))
>
> With Python 2.3.4 I'm seeing warnings like the one below:
> <frozen module libxml2>:2357: FutureWarning: %u/%o/%x/%X of negative int will return a signed string in Python 2.4 and up
> 
> I believe this is caused by the memory address having the sign bit set,
> causing builtin_id() to return a negative integer.

Yes, that's right.
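
To illustrate the mechanism, here is a minimal sketch in modern Python.  It
is not what builtin_id() does internally; it only shows that the same 32
bits read as a signed C int come out negative once the top bit is set.  The
address value is made up for the example.

```python
import struct

# Hypothetical 32-bit address with the sign bit set (made up for illustration).
address = 0xB7D2F0B4

# Pack as unsigned 32-bit, then unpack the same bytes as signed 32-bit.
(as_signed,) = struct.unpack("<i", struct.pack("<I", address))

print(as_signed)                          # a negative number
print(as_signed + (1 << 32) == address)   # adding 2**32 recovers the address
```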

> I grepped around in the Python standard library and found a rather
> awkward work-around that seems to be slowly propagating to various
> modules using the "'%x' % id(self)" idiom:

No, it's not propagating any more:  I see that none of these exist in 2.4:

> Lib/asyncore.py:
>         # On some systems (RH10) id() can be a negative number.
>         # work around this.
>         MAX = 2L*sys.maxint+1
>         return '<%s at %#x>' % (' '.join(status), id(self)&MAX)
> 
> $ grep -r 'can be a negative number' *
> Lib/asyncore.py:        # On some systems (RH10) id() can be a negative number.
> Lib/repr.py:            # On some systems (RH10) id() can be a negative number.
> Lib/tarfile.py:        # On some systems (RH10) id() can be a negative number.
> Lib/test/test_repr.py:        # On some systems (RH10) id() can be a negative number.
> Lib/xml/dom/minidom.py:        # On some systems (RH10) id() can be a negative number.
>
> There are many modules that do not have this work-around in Python 2.3.4.

Not sure, but it looks like this stuff was ripped out in 2.4 simply
because 2.4 no longer produces a FutureWarning in these cases.  That
doesn't address the fact that the output changed, or that the signed
output %x produces for a negative id() under 2.4 is probably surprising
to most.
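
A small sketch of the surprise, using a made-up negative value in place of
a real id() and assuming a 32-bit address width.  The signed formatting
shown is what 2.4 and later do; masking first gives the unsigned form that
people usually expect to see for an address.

```python
# Made-up negative value standing in for an id() with the sign bit set.
n = -0x482D0F4C

# Signed formatting: a leading minus sign and the magnitude.
signed_hex = "%x" % n

# Masking to 32 bits first (an assumed width) gives the unsigned form.
unsigned_hex = "%x" % (n & 0xFFFFFFFF)

print(signed_hex)     # -482d0f4c
print(unsigned_hex)   # b7d2f0b4
```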

> Wouldn't it be more elegant to make builtin_id() return an unsigned
> long integer?

I think so.  This is the function ZODB 3.3 uses, BTW:

# Addresses can "look negative" on some boxes, some of the time.  If you
# feed a "negative address" to an %x format, Python 2.3 displays it as
# unsigned, but produces a FutureWarning, because Python 2.4 will display
# it as signed.  So when you want to produce an address, use positive_id() to
# obtain it.
def positive_id(obj):
    """Return id(obj) as a non-negative integer."""

    result = id(obj)
    if result < 0:
        # This is a puzzle:  there's no way to know the natural width of
        # addresses on this box (in particular, there's no necessary
        # relation to sys.maxint).  Try 32 bits first (and on a 32-bit
        # box, adding 2**32 gives a positive number with the same hex
        # representation as the original result).
        result += 1L << 32
        if result < 0:
            # Undo that, and try 64 bits.
            result -= 1L << 32
            result += 1L << 64
            assert result >= 0 # else addresses are fatter than 64 bits
    return result

This gives a non-negative result regardless of Python version and
(almost) regardless of platform (the `assert` hasn't triggered on any
ZODB 3.3 platform yet).
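
For comparison, here is a sketch of a different approach in modern Python
syntax.  This is not ZODB's code: it assumes that struct.calcsize("P")
reports the pointer width of the running interpreter, and masks to that
width instead of trying 32 and then 64 bits.  The names to_unsigned and
positive_id_masked are made up for the example.

```python
import struct

# Assumption: "P" (void *) reflects the address width of this interpreter.
POINTER_BITS = struct.calcsize("P") * 8


def to_unsigned(value, bits):
    """Reinterpret a possibly negative integer as unsigned with `bits` bits."""
    return value & ((1 << bits) - 1)


def positive_id_masked(obj):
    """Return id(obj) masked to the assumed pointer width."""
    return to_unsigned(id(obj), POINTER_BITS)


print("%x" % to_unsigned(-0x482D0F4C, 32))   # b7d2f0b4
print(positive_id_masked(object()) >= 0)     # True
```

Masking leaves values that are already non-negative and within the width
unchanged, so it is safe to apply unconditionally.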

> Is the performance impact too great?

For some app, somewhere, maybe.  It's a tradeoff.  The very widespread
practice of embedding %x output from id() favors getting rid of the
sign issue, IMO.

> A long integer is used on platforms where SIZEOF_VOID_P > SIZEOF_LONG
> (most 64 bit platforms?),

Win64 is probably the only major (meaning likely to be popular among
Python users) platform where sizeof(void*) > sizeof(long).
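
One way to check a given box, sketched with the ctypes module (which is not
part of Python 2.3 or 2.4, so this applies to later versions only):

```python
import ctypes

# Sizes in bytes of the C types on the platform running this interpreter.
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)
long_bytes = ctypes.sizeof(ctypes.c_long)

# True where a pointer does not fit in a C long.
pointer_wider_than_long = pointer_bytes > long_bytes

print(pointer_bytes, long_bytes, pointer_wider_than_long)
```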

> so all Python code must be prepared to handle it already...

In theory <wink>.