[Tutor] Hex to Str - still an open issue

Sat Feb 5 16:48:01 CET 2005

Liam,

I think you misunderstand what endianness is.

Big-endian and little-endian refer to the way a number is stored as bytes in the underlying memory 
of the computer. This is not something you generally need to worry about in a Python program.

For example, consider the number 0x12345678. On most modern computers this will be stored in four 
consecutive bytes of computer memory. The individual bytes will contain the values 0x12, 0x34, 0x56, 
0x78. The question is, what is the order of those bytes in memory? On a big-endian computer, the 
most significant byte - 0x12 - is stored at the lowest memory address, so the sequence of bytes will 
be 0x12, 0x34, 0x56, 0x78. On a little-endian computer, the least-significant byte is stored at the 
lowest address, and the order will be reversed: 0x78, 0x56, 0x34, 0x12.
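To make the two orderings concrete, here is a small sketch. It assumes Python 3, where integers have a to_bytes() method (newer than the Python current when this was written); the variable names are only for illustration.

```python
value = 0x12345678

# Ask for the four bytes of the number in each byte order explicitly.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(big.hex())     # most significant byte first: 12345678
print(little.hex())  # least significant byte first: 78563412
```

The number is the same in both cases; only the order in which its bytes are laid out differs.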

Most programming languages will hide this detail from you most of the time. Even in assembly 
language, you generally load and store integers without worrying about endianness. Math operations 
just do the right thing so you don't have to worry about it.

Endianness becomes an issue when you want to convert between representations, and when binary data 
is shared between computers which may have different endianness.

For example in a C program you might want to get the high byte of an integer when you know the 
address of the integer. The desired byte will be at (address+0) or (address+3) depending on the 
endianness of the hardware.
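Python does not hand you raw addresses, but a rough analogue of the C situation can be sketched with the struct module, assuming Python 3. Packing with "=I" uses the machine's native byte order, and sys.byteorder reports which order that is.

```python
import struct
import sys

value = 0x12345678

# "=I" packs a 32-bit unsigned int in the native byte order of this
# machine, which approximates what a C program would see in memory.
raw = struct.pack("=I", value)

# The high byte (0x12) is at offset 0 on big-endian hardware
# and at offset 3 on little-endian hardware.
high_offset = 0 if sys.byteorder == "big" else 3
print(sys.byteorder, hex(raw[high_offset]))
```

Whatever the machine, the byte found at high_offset is 0x12; what changes is the offset.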

Similarly, if an array of integers is written to a file in a binary representation (not as ASCII 
strings representing the integers, but as 32-bit values), then to correctly read the file you have 
to know the endianness of the data in the file.
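A minimal sketch of that situation, assuming Python 3 and the struct module, and using an in-memory byte string in place of a real file. The sample numbers are made up for the example.

```python
import struct

# Three 32-bit unsigned integers written in big-endian order (">" prefix).
numbers = (1, 256, 0x12345678)
data = struct.pack(">3I", *numbers)

# Reading with the same byte order recovers the original values.
right = struct.unpack(">3I", data)

# Reading with the other byte order raises no error; it just
# produces different numbers.
wrong = struct.unpack("<3I", data)

print(right)
print(wrong)
```

Nothing in the data itself says which order was used, which is why the reader has to know it in advance.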

OK, so what does this have to do with converting a number to binary in Python? Well, nothing, 
actually. First, note that 'binary representation' can mean two different things. In the description 
above, I was talking about the actual bit pattern stored in the computer. Python works with binary 
numbers all the time, in this sense, but it is under the hood. The other meaning of 'binary 
representation' is that of a base-2 string representation of a number.

So if you ask, "How do I convert a number to binary?" you can mean either of these.

The first one is trivial. If you have a decimal string representation of the number, use int() to 
convert it to an integer, which the computer stores in binary. If you have an integer already, it's 
already *in* binary, so you don't have to do anything!
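For instance, int() accepts an optional base, so strings in different bases all end up as the same integer. The variable names below are only for illustration.

```python
n = int("4660")                  # decimal string -> integer
m = int("1234", 16)              # base-16 string -> integer
k = int("0001001000110100", 2)   # base-2 string -> integer

# All three are the same number; the base only mattered for the string.
print(n == m == k == 0x1234)
```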

So, "How do I convert a number to binary?", to be interesting, must mean "How do I convert an 
integer to a base-2 string representation?" And how do you do this? Well, you figured out one way 
using the mathematical properties of integers. These operations are independent of endianness, and 
so is the desired result.
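One way to do it arithmetically is repeated division by 2. This is just a sketch, not Liam's code; to_base2 is a name invented for the example, and it only handles non-negative integers.

```python
def to_base2(n):
    """Return the base-2 digits of a non-negative integer as a string."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 2)  # peel off the least significant bit
        digits.append(str(remainder))
    # Digits were collected least significant first; reverse for display.
    return "".join(reversed(digits))

print(to_base2(0x1234))  # 1001000110100
```

Nothing here looks at how the integer is laid out in memory, so the result is the same on any hardware.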

The base-2 string representation of the number whose base-16 string representation is 0x1234 is 
'0001001000110100'. The order of digits here is determined by our convention of writing the most 
significant digits on the left, not by the endianness of the underlying computer.
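In Python versions released after this message was written (2.6 and 3.x), the built-ins format() and bin() produce this string directly. A short sketch:

```python
n = 0x1234

padded = format(n, "016b")  # zero-padded to 16 binary digits
prefixed = bin(n)           # "0b" prefix, no padding

print(padded)    # 0001001000110100
print(prefixed)  # 0b1001000110100
```

Both give the most significant digit first regardless of the machine's endianness.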

OK, this is long enough, I hope I have shed some light...
Kent

Liam Clarke wrote:
> Jacob - just for you, begin your agitation for the next release please ;)
> 
> binstring.py, as attached. 
> (also pasted up - http://www.rafb.net/paste/results/5feItM57.html)
> 
> Creating this, was just a brain teaser, but I was thinking 'what if I
> wanted to make this for the standard library.'
> 
> And so you can see, I had to include a flag for endianness. But that
> was really a cheap trick. If this was going into a standard library,
> I'd want to query the OS for endianness. As for the bits, once again,
> 32 bit is the norm, but 64 bit is here and spreading.
> 
> Also, should it display 11111111 as 255 or 256? Both are valid,
> depending on context.
> 
> Thirdly, if I can do it in 2 minutes, (well, the main part), then
> should they bother putting it in the standard library considering
> also,
> 
> - How often, really, are you going to need to present a decimal or hex
> as a binary string.
> 
> Lastly - this only does base 10 to base 2. Should I include a base 6
> to base 2, base 8 to base 2, base 10 to 6, 10 to 8, 8 to 6?
> 
> I wouldn't like to write for the standard library, because you can
> never please everyone.
> 
> But yeah, feel free to use the above, just keep my doc strings and comments.
> 
> Regards,
> 
> Liam Clarke
> 
> On Fri, 4 Feb 2005 23:30:19 -0500, Jacob S. <keridee at jayco.net> wrote:
> 
>>>The binary value is the same as the hex value.
>>>The binary representation is 000111110100, but
>>>unfortunately Python doesn't support binary in
>>>its string formatting (although it does in int())!
>>
>>Uh, question. Why not? It seems that all simple types should be included.
>>Since the computer stores it as binary, why shouldn't python be able to
>>display a string of it in binary? That seems to be a shortcoming that
>>should be added to the next release... IMHO of course.
>>Jacob Schmidt
>>
>>_______________________________________________
>>Tutor maillist  -  Tutor at python.org
>>http://mail.python.org/mailman/listinfo/tutor
>>
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> ######
> # binString.py
> # by Liam Clarke
> #(Let me know when it's included in the standard library ;-))
> ######
> 
> """Converts an integer base 10 to a string base 2"""
> 
> def binary(decimalInt, bigEndian = True, bits = 32, truncExcess = False):
>     """
> Integer to be converted is essential, Endianness is an optional flag;
> me being a Win32 user, Endianness is big by default, defaults to a 32-bit
> representation, most integers in Python being 32 bit. truncExcess will 
> strip place-holder zeros for succinctness.
> 
> Oh, and it will represent 11111111 as 256, as I'm not sure whether you want
> to start counting for zero with this. It's a simple matter to change."""
>     tempList = ['0' for x in range(bits)]
>     
>     for bitPlace in range(bits, -1, -1):
>         if decimalInt - 2**bitPlace >= 0:
>             tempList[bitPlace] = '1'
>             decimalInt = decimalInt - 2**bitPlace
>     if bigEndian:
>         tempList.reverse()
>     
>     outPut = ''.join(tempList)
>     
>     if truncExcess:
>         if bigEndian:
>             outPut=outPut.lstrip('0')
>         else:
>             outPut=outPut.rstrip('0')
>     
>     return outPut
> 
> 
> ------------------------------------------------------------------------
> 