# unsigned 32 bit arithmetic type?

Robin Becker robin at reportlab.com
Wed Oct 25 14:05:56 CEST 2006

```
Martin v. Löwis wrote:
> Robin Becker schrieb:
>> Hi, just trying to avoid wheel reinvention. I have need of an unsigned
>> 32 bit arithmetic type to carry out a checksum operation and wondered if
>>
>> Our current code works with 32 bit cpu's, but is failing with 64 bit
>> comparisons; it's clearly wrong as we are comparing a number with a
>> negated number; the bits might drop off in 32 bits, but not in 64.
>
> Not sure what operations you are doing: In Python, bits never drop off
>
> If you need to drop bits, you need to do so explicitly, by using the
> bit mask operations. I could tell you more if you'd tell us what
> the specific operations are.

This code is in a contribution to the reportlab toolkit that handles TTF fonts.
The fonts contain checksums computed using 32bit arithmetic. The original
C definition is as follows

> ULONG CalcTableChecksum(ULONG *Table, ULONG Length)
> {
> ULONG Sum = 0L;
> ULONG *EndPtr = Table+((Length+3) & ~3) / sizeof(ULONG);
>
> while (Table < EndPtr)
> 	Sum += *Table++;
> return Sum;
> }

so effectively we're doing only additions and letting bits roll off the end.

Of course the actual semantics is dependent on what C unsigned arithmetic does
so we're relying on that being the same everywhere.

This algorithm was pretty simple in Python until 2.3 when shifts over the end of
ints started going wrong. For some reason we didn't do the obvious and just do
everything in longs and just mask off the upper bits. For some reason (probably
my fault) we seem to have accumulated code like

def _L2U32(L):
    '''convert a long to u32'''
    return unpack('l',pack('L',L))

if sys.hexversion>=0x02030000:
    def add32(x, y):
        "Calculate (x + y) modulo 2**32"
        return _L2U32((long(x)+y) & 0xffffffffL)
else:
    def add32(x, y):
        "Calculate (x + y) modulo 2**32"
        lo = (x & 0xFFFF) + (y & 0xFFFF)
        hi = (x >> 16) + (y >> 16) + (lo >> 16)
        return (hi << 16) | (lo & 0xFFFF)

def calcChecksum(data):
    """Calculates TTF-style checksums"""
    if len(data)&3: data = data + (4-(len(data)&3))*"\0"
    sum = 0
    for n in unpack(">%dl" % (len(data)>>2), data):
        sum = add32(sum,n)
    return sum

and also silly stuff like

def testChecksum(self):
    "Test calcChecksum function"
    self.assertEquals(calcChecksum(""), 0)
    self.assertEquals(calcChecksum("\1"), 0x01000000)
    self.assertEquals(calcChecksum("\x01\x02\x03\x04\x10\x20\x30\x40"), 0x11223344)
    self.assertEquals(calcChecksum("\x81"), _L2U32(0x81000000L))
_L2U32(0x80000000L))

where, while it might be reasonable to do testing, it seems the tests aren't very
sensible, e.g. what is -6 doing in a u32 test? This stuff just about works on a 32
bit machine, but is failing miserably on a 64 bit AMD. As far as I can see I just
need to use masked longs throughout.

In a C extension I can still do the computation efficiently on a 32 bit machine,
but I need to do masking for a 64 bit machine.
--
Robin Becker

```
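The "masked longs throughout" approach from the post might look roughly like the sketch below. It is not the reportlab code: `calc_checksum` and `MASK32` are hypothetical names, and it uses byte strings so that it also runs on current Python. It relies on the `struct` module's standard sizes: with the `>` prefix an `L` is always four bytes, whereas the native `'l'`/`'L'` codes used in `_L2U32` follow the platform's C `long`, which is typically eight bytes on 64 bit Linux.

```python
import struct

MASK32 = 0xFFFFFFFF  # hypothetical helper constant: keeps only the low 32 bits


def calc_checksum(data):
    """Sum the big-endian unsigned 32 bit words of `data` modulo 2**32.

    A sketch of the masking approach, not the reportlab implementation.
    """
    # Pad with zero bytes to a multiple of four, mirroring the
    # (Length + 3) & ~3 rounding in the C definition.
    if len(data) & 3:
        data = data + b"\0" * (4 - (len(data) & 3))
    total = 0
    # ">%dL" reads big-endian *unsigned* 4-byte words on every platform,
    # so no signed/unsigned conversion is needed afterwards.
    for word in struct.unpack(">%dL" % (len(data) >> 2), data):
        # Python integers never overflow, so drop the carry explicitly.
        total = (total + word) & MASK32
    return total
```

Because every intermediate value is masked to 32 bits and the words are read as unsigned, the result does not depend on whether the interpreter runs on a 32 bit or 64 bit machine, and results can be compared directly against plain constants such as `0x81000000`.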