Does Python allow access to some of the implementation details?

Claudio Grondi claudio.grondi at freenet.de
Fri Jan 6 20:49:46 EST 2006


Martin v. Löwis wrote:
> You can get somewhat faster in Python than your code if you avoid
> producing new long objects all the time, and do the task in chunks of 30
> bits.
It would be nice if you could explain why you consider chunks of 30 bits 
to be superior e.g. to chunks of 32 bits?

> write a C file that is a Python module
If I understand you right, there is no way to get direct access to 
internal representation of Python objects by Python script language 
means provided in the standard Python 2.4.2 distribution.
So the answer to the question in the subject is : NO (valid for what is 
  provided in the standard Python distribution, but not valid when 
considering to write an own extension module in C)

Below an updated version of the previously posted code with included 
code piece suggested by Martin v. Löwis.
The i.__hex__() method of bit extraction seem to be the best choice for 
large (enough) integers, superior also to processing in chunks of 30 
bits. Here an example of output resulting from run of below code on my 
current machine:

   Seconds for bit extraction from integer with 216 hex-digits when:

       looping over i>>1;i&0x01 = 0.001227
       ^-- but in 32 bit chunks = 0.000685
       looping over i>>1;i%2    = 0.001614
       using i.__hex__() repr.  = 0.000143

   Bits extraction from long integers with number of hexadecimal digits
       > 216
   is at least 1 ms faster with i.__hex__() then with i>>1;i&0x01 method.

   Modulo division (%2) is always slower than bitwise-and (&0x01).

It seems, that the __hex__() method wins on Pentium 4 3.0 GHz for any 
value of integer, where on Pentium 4 2.8 GHz beginning with a given 
value for long integers, but it could be, that this effect is caused by 
the time measurement method used and not actually by the effect of using 
another processor (sitting on identical motherboard).
This was the reason why I have decided to get at least 1 ms time 
difference between the methods before printing the results (see code 
below).

Claudio

import time

# dctLstOfBitsVsCharOfNibble is a dictionary with a key beeing one 
character
# string representing hexadecimal value of a nibble and a value beeing a 
list
# of bits of the nibble where the lowest bit is stored at index 0 of the 
list.
# i.e.
dctLstOfBitsVsCharOfNibble = {
   '0':[0, 0, 0, 0],
   '1':[0, 0, 0, 1],
   '2':[0, 0, 1, 0],
   '3':[0, 0, 1, 1],
   '4':[0, 1, 0, 0],
   '5':[0, 1, 0, 1],
   '6':[0, 1, 1, 0],
   '7':[0, 1, 1, 1],
   '8':[1, 0, 0, 0],
   '9':[1, 0, 0, 1],
   'A':[1, 0, 1, 0],
   'B':[1, 0, 1, 1],
   'C':[1, 1, 0, 0],
   'D':[1, 1, 0, 1],
   'E':[1, 1, 1, 0],
   'F':[1, 1, 1, 1]
}
# The dctLstOfBitsVsCharOfNibble dictionary can be generated by 
following code:
# dctLstOfBitsVsCharOfNibble = {}
# for intNibbleValue in range(0, 16):
#   lstBitReprOfCurrNibble=[]
#   for indxOfBit in range(0,4):
#     lstBitReprOfCurrNibble.append(intNibbleValue>>indxOfBit&0x01)
#   #:for
#   lstBitReprOfCurrNibble.reverse()
# 
dctLstOfBitsVsCharOfNibble['%X'%(intNibbleValue,)]=lstBitReprOfCurrNibble
# #:for
n = 0
NoOf32bitChunks   = 0
lstBitsBitwiseAnd = []
lstBitsModulo     = []
lstViaBitChunks   = []
lstBitsViaHex     = []
timeBitwiseAnd    = -1
timeModulo        = -1
timeBitsViaHex    = -1
timeViaBitChunks  = -1

bitChunkSize      = 32 # must be <= 32

while timeBitwiseAnd <= timeBitsViaHex + 0.001:

   n = (n<<32) + 0x12345678

   NoOf32bitChunks += 1

   i = n
   lstBitsModulo = []
   start = time.clock()
   while i:
     lstBitsModulo.append(i%2)
     i=i>>1
   timeModulo = time.clock()-start

   i = n
   lstBitsBitwiseAnd = []
   start = time.clock()
   while i:
     lstBitsBitwiseAnd.append(i&0x01)
     i=i>>1
   timeBitwiseAnd = time.clock()-start

   i = n
   lstViaBitChunks = []
   bitFilter    = 0
   for dummy in range(bitChunkSize):
     bitFilter = (bitFilter<<1)+1
   #:for

   start = time.clock()
   done = False
   while i:
     i1 = int(i & bitFilter)
     i >>= bitChunkSize
     if i == 0: done = True
     for k in xrange(bitChunkSize):
       lstViaBitChunks.append(i1 & 1)
       i1 >>= 1
       if done and i1==0:
         # if this is the top word, and no bits are left, we are done
         break
       #:if
     #:for
   #:while
   timeViaBitChunks = time.clock()-start

   i = n
   # lstBitsViaHex = []
   start = time.clock()
   strHexOf_i = i.__hex__()[2:]
   if strHexOf_i[-1]=='L':
     strHexOf_i=strHexOf_i[0:-1]
   #:if
   intNoOfLeadingZeroBits = 0
   lstBitsOfFirstNibble = dctLstOfBitsVsCharOfNibble[strHexOf_i[0]]
   while not lstBitsOfFirstNibble[intNoOfLeadingZeroBits] and 
intNoOfLeadingZeroBits < 4:
     intNoOfLeadingZeroBits += 1
   #:while
   if intNoOfLeadingZeroBits == 4:
     lstBitsViaHex = []
   else:
     lstBitsViaHex = lstBitsOfFirstNibble[intNoOfLeadingZeroBits:]
   #:if
   for indxToStrHexOf_i in range(1,len(strHexOf_i)):
     lstBitsViaHex += 
dctLstOfBitsVsCharOfNibble[strHexOf_i[indxToStrHexOf_i]]
   #:for
   lstBitsViaHex.reverse()
   timeBitsViaHex = time.clock()-start
#:while

if(   lstBitsBitwiseAnd == lstBitsModulo
   and lstBitsBitwiseAnd == lstBitsViaHex
   and lstBitsBitwiseAnd == lstViaBitChunks ):
   print
   print '  Seconds for bit extraction from integer with %i hex-digits 
when: '%(NoOf32bitChunks*8,)
   print
   print '      looping over i>>1;i&0x01 = %f '%(timeBitwiseAnd,)
   print '      ^-- but in %i bit chunks = %f '%(bitChunkSize, 
timeViaBitChunks)
   print '      looping over i>>1;i%%2    = %f '%(timeModulo,)
   print '      using i.__hex__() repr.  = %f '%(timeBitsViaHex,)
   print
   print '  Bits extraction from long integers with number of 
hexadecimal digits '
   print '      > %i '%(NoOf32bitChunks*8,)
   print '  is at least 1 ms faster with i.__hex__() then with 
i>>1;i&0x01 method.'
   print
   print '  Modulo division (%2) is always slower than bitwise-and (&0x01).'



More information about the Python-list mailing list