Does Python allow access to some of the implementation details?

mensanator at aol.com mensanator at aol.com
Sat Jan 7 01:20:34 EST 2006


Claudio Grondi wrote:
> Martin v. Löwis wrote:
> > You can get somewhat faster in Python than your code if you avoid
> > producing new long objects all the time, and do the task in chunks of 30
> > bits.
> It would be nice if you could explain why you consider chunks of 30 bits
> to be superior e.g. to chunks of 32 bits?
>
> > write a C file that is a Python module
> If I understand you right, there is no way to get direct access to
> internal representation of Python objects by Python script language
> means provided in the standard Python 2.4.2 distribution.
> So the answer to the question in the subject is : NO (valid for what is
>   provided in the standard Python distribution, but not valid when
> considering to write an own extension module in C)

No need to re-invent the wheel. That extension has already been
written. The GMPY module has a full suite of bit-functions:

digits(...)
    digits(x[,base]): returns Python string representing x in the
    given base (2 to 36, default 10 if omitted or 0); leading '-'
    present if x<0, but no leading '+' if x>=0. x must be an mpz,
    or else gets coerced into one.

getbit(...)
    getbit(x,n): returns 0 or 1, the bit-value of bit n of x;
    n must be an ordinary Python int, >=0; x is an mpz, or else
    gets coerced to one.

hamdist(...)
    hamdist(x,y): returns the Hamming distance (number of bit-positions
    where the bits differ) between x and y.  x and y must be mpz,
    or else get coerced to mpz.

lowbits(...)
    lowbits(x,n): returns the n lowest bits of x; n must be an
    ordinary Python int, >0; x must be an mpz, or else gets
    coerced to one.

numdigits(...)
    numdigits(x[,base]): returns length of string representing x in
    the given base (2 to 36, default 10 if omitted or 0); the value
    returned may sometimes be 1 more than necessary; no provision
    for any 'sign' characte, nor leading '0' or '0x' decoration,
    is made in the returned length.  x must be an mpz, or else gets
    coerced into one.

popcount(...)
    popcount(x): returns the number of 1-bits set in x; note that
    this is 'infinite' if x<0, and in that case, -1 is returned.
    x must be an mpz, or else gets coerced to one.

scan0(...)
    scan0(x, n=0): returns the bit-index of the first 0-bit of x (that
    is at least n); n must be an ordinary Python int, >=0.  If no more
    0-bits are in x at or above bit-index n (which can only happen for
    x<0, notionally extended with infinite 1-bits), None is returned.
    x must be an mpz, or else gets coerced to one.

scan1(...)
    scan1(x, n=0): returns the bit-index of the first 1-bit of x (that
    is at least n); n must be an ordinary Python int, >=0.  If no more
    1-bits are in x at or above bit-index n (which can only happen for
    x>=0, notionally extended with infinite 0-bits), None is returned.
    x must be an mpz, or else gets coerced to one.

setbit(...)
    setbit(x,n,v=1): returns a copy of the value of x, with bit n set
    to value v; n must be an ordinary Python int, >=0; v, 0 or !=0;
    x must be an mpz, or else gets coerced to one.

By using the scan1 function, I can run rings around the BitwiseAnd
(using the program from your first post, correcting the missing bit
problem):

BitwiseAnd = 0.438082
Modulo     = 2.576335
GMPY       = 0.109865

33170 bits BitwiseAnd  1-bits: 16590
33170 bits Modulo      1-bits: 16590
33170 bits GMPY        1-bits: 16590

For BitwiseAnd and Modulo the 1-bit counts were found by
summing the lists. For GMPY, I used the popcount function.

You can get Windows binaries for GMPY at

<http://home.comcast.net/~casevh/>


import time
import gmpy

# i = 2**25964951 - 1
i = 123456789**1234
lstBitsModulo = []
start = time.clock()
while i:
   lstBitsModulo.append(i%2)
   i=i>>1
speedModulo = time.clock()-start

# i = 2**25964951 - 1
i = 123456789**1234
lstBitsBitwiseAnd = []
start = time.clock()
while i:
   lstBitsBitwiseAnd.append(i&0x01)
   i=i>>1
speedBitwiseAnd = time.clock()-start

i = gmpy.mpz(123456789**1234)
lstBitsGMPY = []
start = time.clock()
f = gmpy.scan1(i,0)
if f==0:
    lstBitsGMPY.append(1)
    k = 1
    f = gmpy.scan1(i,1)
else:
    k = 0
while f:
   while k<f:
      lstBitsGMPY.append(0)
      k += 1
   lstBitsGMPY.append(1)
   k += 1
   f = gmpy.scan1(i,f+1)
speedGMPY = time.clock()-start


print
if lstBitsBitwiseAnd == lstBitsModulo :
   print 'BitwiseAnd = %f '%(speedBitwiseAnd,)
   print 'Modulo     = %f '%(speedModulo,)
   print 'GMPY       = %f '%(speedGMPY,)

print # both lists are lists of long integer values
print len(lstBitsBitwiseAnd),'bits BitwiseAnd
1-bits:',sum(lstBitsBitwiseAnd)
print len(lstBitsModulo),    'bits Modulo
1-bits:',sum(lstBitsModulo)
print len(lstBitsGMPY),      'bits GMPY
1-bits:',gmpy.popcount(i)




More information about the Python-list mailing list