[Patches] [ python-Patches-923643 ] long <-> byte-string conversion

SourceForge.net noreply at sourceforge.net
Wed Sep 15 11:27:58 CEST 2004


Patches item #923643, was opened at 2004-03-25 19:17
Message generated for change (Comment added) made by trevp
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=923643&group_id=5470

Category: Core (C code)
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Trevor Perrin (trevp)
Assigned to: Nobody/Anonymous (nobody)
Summary: long <-> byte-string conversion

Initial Comment:

Sometimes you want to turn a long into a byte-string, 
or vice-versa.  This is useful in cryptographic protocols, 
and probably other network protocols where you need 
to exchange large integers.

In 2.4, you can handle unsigned longs with this:

import binascii

def stringToLong(s):
    return long(binascii.hexlify(s), 16)

def longToString(n):
    h = "%x" % n
    if len(h) % 2:
        h = "0" + h  # unhexlify() rejects odd-length hex strings
    return binascii.unhexlify(h)

However, these functions are slower than they need to 
be, they're kinda kludgey, and they don't handle 
negative values.

So here's a proposal:

def stringToLong(s):
    return long(s, 256)

def longToString(n):
    return n.tostring()

These functions operate on big-endian, 2's-complement 
byte-strings.  If the value is positive but has its most-
significant-bit set, an extra zero-byte will be 
prepended.  This is the same way OpenSSL and (I think) 
GMP handle signed numbers.
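
For illustration, the byte layout described above (big-endian, 2's-complement, with a zero byte prepended when a positive value has its top bit set) can be sketched in present-day Python 3. This is only a sketch under assumptions: the helper names to_signed_bytes and from_signed_bytes are hypothetical, and int.to_bytes / int.from_bytes are Python 3 built-ins, not the long(s, 256) / n.tostring() interface this patch proposes.

```python
def to_signed_bytes(n):
    """Minimal big-endian two's-complement encoding of n (sketch)."""
    # A non-negative n needs bit_length() + 1 bits (one spare sign bit);
    # a negative n needs (~n).bit_length() + 1 bits.
    magnitude_bits = n.bit_length() if n >= 0 else (~n).bit_length()
    length = (magnitude_bits + 8) // 8
    return n.to_bytes(length, "big", signed=True)


def from_signed_bytes(b):
    """Inverse of to_signed_bytes."""
    return int.from_bytes(b, "big", signed=True)


print(to_signed_bytes(127))   # b'\x7f'
print(to_signed_bytes(255))   # b'\x00\xff'  (zero byte keeps it positive)
print(to_signed_bytes(-1))    # b'\xff'
```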

These functions are ~5x faster than the earlier ones, 
they're cleaner, and they work with negative numbers.  

If you only want to deal with unsigned positive numbers, 
you'll have to do some adjustments:

def stringToLong(s):
    return long('\0'+s, 256)

def longToString(n):
    s = n.tostring()
    if s[0] == '\0' and s != '\0':
        s = s[1:]
    return s

That's not ideal, but it seems better than any interface 
change I could think of.
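
The unsigned adjustments above might look roughly like this in Python 3. Again a sketch only: bytes_to_unsigned and unsigned_to_bytes are made-up names, and the code leans on Python 3's int.from_bytes / int.to_bytes rather than anything in the patch.

```python
def bytes_to_unsigned(b):
    """Interpret b as a big-endian unsigned integer."""
    # Prepending a zero byte forces the signed decoder to see a positive
    # value, which mirrors the long('\0'+s, 256) adjustment.
    return int.from_bytes(b"\x00" + b, "big", signed=True)


def unsigned_to_bytes(n):
    """Big-endian unsigned encoding with no leading zero byte (except for 0)."""
    if n < 0:
        raise ValueError("unsigned encoding needs a non-negative integer")
    length = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(length, "big")


print(bytes_to_unsigned(b"\xff\xff\xff") == 0xFFFFFF)   # True
print(unsigned_to_bytes(255))                           # b'\xff'
```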

Anyways, the patch adds this to longs.  It should be 
added to ints too, and I guess it needs tests etc.  I 
can help with that, if the basic idea is acceptable.

Trevor

----------------------------------------------------------------------

>Comment By: Trevor Perrin (trevp)
Date: 2004-09-15 02:27

Message:
Logged In: YES 
user_id=973611


Uploading a new patch (base256.diff).  This implements only
the string -> long (or int) conversion.  It adds support for
radix 256 (unsigned) or -256 (2's-complement signed) to the
int() and long() built-ins:
  int("\xFF\xFF\xFF", 256) -> 0xFFFFFF
  int("\xFF\xFF\xFF", -256) -> -1
  long(os.urandom(128), 256) -> 1024-bit integer

I left out the long -> string conversion.  If Python adds a
bytes() type, then that conversion could be done as
bytes(long).  This patch has docs and tests.
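
As a rough picture of the two radixes, here is a Python 3 sketch of the decoding rule. The function name parse_base256 is hypothetical and this is not the C code in base256.diff; the thread also doesn't say how the patch treats an empty string, so returning 0 for it is an assumption.

```python
import os


def parse_base256(data, radix=256):
    """Decode big-endian bytes; radix 256 is unsigned, -256 is signed."""
    if radix not in (256, -256):
        raise ValueError("radix must be 256 or -256")
    n = 0
    for byte in data:          # iterating bytes yields ints in Python 3
        n = n * 256 + byte     # ordinary positional accumulation
    # For the signed radix, a set top bit means the value is negative:
    # subtract 2**(8 * len) to undo the two's-complement wraparound.
    if radix == -256 and data and data[0] & 0x80:
        n -= 1 << (8 * len(data))
    return n                   # empty input gives 0 (assumption)


print(hex(parse_base256(b"\xff\xff\xff", 256)))   # 0xffffff
print(parse_base256(b"\xff\xff\xff", -256))       # -1
print(parse_base256(os.urandom(128), 256).bit_length() <= 1024)  # True
```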

----------------------------------------------------------------------

Comment By: Josiah Carlson (josiahcarlson)
Date: 2004-03-29 23:10

Message:
Logged In: YES 
user_id=341410

I'm curious to know if anyone would object to optional
minimum or maximum or both arguments, or even some
additional methods that would result in a potentially
constrained string output from long.tostring()?

If I were to split the functionality into three methods,
they would be as follows...

def atleast(n, atl):
    # Left-pad the string form of n to at least atl bytes.
    if atl < 0:
        raise TypeError("atleast requires a non-negative integer "
                        "for a minimum length")
    a = n.tostring()
    la = len(a)
    return (atl-la)*'\0' + a

def atmost(n, atm):
    # Keep at most the first atm bytes of the string form of n.
    if atm < 0:
        raise TypeError("atmost requires a non-negative integer "
                        "for a maximum length")
    a = n.tostring()
    return a[:atm]

def constrained(n, atl, atm):
    # Pad to at least atl bytes, then cut to at most atm bytes.
    if atm < atl:
        raise TypeError("constrained requires that the maximum "
                        "length be at least the minimum length")
    if atl < 0 or atm < 0:
        raise TypeError("constrained requires that both "
                        "arguments are non-negative")
    a = n.tostring()
    la = len(a)
    return ((atl-la)*'\0' + a)[:atm]


I personally would find use for the above, would anyone else
have use for it?
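
One detail that seems worth noting for the padding case: if tostring() returns 2's-complement bytes as proposed, left-padding a negative value with zero bytes would turn it into a positive one, so the pad byte would need to be 0xFF there. A Python 3 sketch of a sign-preserving pad, with the hypothetical name pad_signed (not part of any patch):

```python
def pad_signed(encoded, min_len):
    """Left-pad two's-complement bytes to at least min_len bytes."""
    if min_len < 0:
        raise ValueError("min_len must be non-negative")
    if not encoded:
        raise ValueError("encoded must contain at least one byte")
    # Sign-extend: negative values (top bit set) are padded with 0xFF,
    # non-negative values with 0x00, so the decoded integer is unchanged.
    pad = b"\xff" if encoded[0] & 0x80 else b"\x00"
    return pad * max(0, min_len - len(encoded)) + encoded


print(pad_signed(b"\x01", 3))   # b'\x00\x00\x01'
print(pad_signed(b"\xff", 3))   # b'\xff\xff\xff'  (still -1)
```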

----------------------------------------------------------------------

Comment By: Trevor Perrin (trevp)
Date: 2004-03-28 16:55

Message:
Logged In: YES 
user_id=973611

My last comment was wrong: GMP's raw input/output format 
uses big-endian positive values, with the sign bit stored 
separately.

----------------------------------------------------------------------

Comment By: Trevor Perrin (trevp)
Date: 2004-03-26 23:51

Message:
Logged In: YES 
user_id=973611

I think 2's complement makes good sense for arbitrary 
precision longs.  This is how OpenSSL and GMP handle them.  
It's also how the ASN.1 BER/DER encodings handle integers: 
these encodings just prepend tag and length fields to the big-
endian 2's complement value.  I.e.: If you want to extract 
RSA public values from an X.509 certificate, they'll be in 2's 
complement (well, they'll always be positive... but they'll 
have an extra zero byte if necessary).
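
To make the DER case concrete, here is a Python 3 sketch of an INTEGER encoder: tag byte 0x02, a length field, then the minimal big-endian 2's-complement value. The name der_encode_integer is made up for illustration, and the code uses Python 3's int.to_bytes, not the API proposed in this patch.

```python
def der_encode_integer(n):
    """Encode n as an ASN.1 DER INTEGER: tag, length, two's-complement value."""
    bits = n.bit_length() if n >= 0 else (~n).bit_length()
    content = n.to_bytes((bits + 8) // 8, "big", signed=True)
    if len(content) < 0x80:
        # Short form: a single length byte.
        length = bytes([len(content)])
    else:
        # Long form: 0x80 | count, then the length itself in big-endian.
        size = len(content).to_bytes((len(content).bit_length() + 7) // 8, "big")
        length = bytes([0x80 | len(size)]) + size
    return b"\x02" + length + content   # 0x02 is the INTEGER tag


print(der_encode_integer(127).hex())   # 02017f
print(der_encode_integer(128).hex())   # 02020080  (extra zero byte)
print(der_encode_integer(-129).hex())  # 0202ff7f
```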

Since the functionality for 2's complement is already in the C 
code it's easy to expose through a patch.  So I'm still in favor 
of presenting it.

----------------------------------------------------------------------

Comment By: paul rubin (phr)
Date: 2004-03-26 22:57

Message:
Logged In: YES 
user_id=72053

How about just punting on signed conversion?  I don't think
two's complement makes much sense for arbitrary precision
longs.  Have some separate representation for negative longs
if needed.  If you call hex() on a large negative number,
you get a hex string with a leading minus sign.  For base
256, you can't reserve a char like that, so I guess you have
to just throw an error if someone tries to convert a
negative long to a string.  If you want a representation for
signed longs, ASN1 DER is probably an ok choice.  I agree
with Guido that the binascii module is a good place to put
such a function.  Twos complement can work if you specify a
fixed precision, but that sure complicates what this started
out as.
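
The fixed-precision variant could be sketched like this in Python 3. The name to_fixed_twos_complement and the choice of OverflowError for out-of-range values are assumptions for illustration, not something proposed in this thread.

```python
def to_fixed_twos_complement(n, nbytes):
    """Two's-complement encoding at a fixed precision of nbytes bytes."""
    if nbytes < 1:
        raise ValueError("nbytes must be at least 1")
    bits = 8 * nbytes
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise OverflowError("value does not fit in %d bytes" % nbytes)
    # Masking maps a negative n onto its two's-complement bit pattern.
    return (n & ((1 << bits) - 1)).to_bytes(nbytes, "big")


print(to_fixed_twos_complement(-2, 2))   # b'\xff\xfe'
print(to_fixed_twos_complement(1, 4))    # b'\x00\x00\x00\x01'
```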

----------------------------------------------------------------------

Comment By: Trevor Perrin (trevp)
Date: 2004-03-26 22:45

Message:
Logged In: YES 
user_id=973611

You're right, we should support unsigned strings somehow.  
Adding another argument to the int() and long() constructors 
would be messy, though.  How about:

n = long(s, 256) #unsigned
n = long(s, -256) #signed

n.tounsignedstring()
n.tosignedstring()

The "-256" thing is a hack, I admit...  but it kinda grows on 
you, if you stare at it a while :-)...

----------------------------------------------------------------------

Comment By: paul rubin (phr)
Date: 2004-03-25 19:53

Message:
Logged In: YES 
user_id=72053

I think those funcs should take an optional extra arg to say
you want unsigned.  That's cleaner than prepending '\0'.  In
cryptography you usually do want unsigned.

----------------------------------------------------------------------
