[Tutor] os.urandom()
Dave Angel
davea at ieee.org
Mon Aug 9 15:51:34 CEST 2010
Steven D'Aprano wrote:
> On Mon, 9 Aug 2010 07:23:56 pm Dave Angel wrote:
>
>
>> Big difference between 2.x and 3.x. In 3.x, strings are Unicode, and
>> may be stored either in 16bit or 32bit form (Windows usually compiled
>> using the former, and Linux the latter).
>>
>
> That's an internal storage that you (generic you) the Python programmer
> doesn't see, except perhaps indirectly via memory consumption.
>
> Do you know how many bits are used to store floats? If you try:
> <snip>
You've left out the context that I was responding to. I'm well
aware of many historical architectures, and have dealt with the
differences between the coding on an IBM 26 keypunch and an IBM 29, as
well as converting the 12 raw bits coming from a Hollerith card into
various forms compatible with the six-characters-per-word storage of the
CDC 6400.
I doubt, however, that Python could be ported to a machine with a 9-bit
byte or a 7-bit byte and remain fully compatible.
The OP was talking about the display of \xhh and thought he had
discovered a discrepancy between the docs on 2.x and 3.x. For that
purpose it is quite likely relevant that 3.x has characters that won't
fit in 8 bits, and thus can't be described in two hex digits. I was trying
to point out that characters in 3.x can need 16 bits or more, and thus
would require more than two hex digits. But a b'' string does not.
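To make the distinction concrete, here is a minimal sketch, assuming Python 3.x. The variable names are only illustrative. It shows that os.urandom() returns a bytes object whose repr needs at most two hex digits per element, while a str character above 0xFF needs a longer escape.

```python
import os

# os.urandom() returns a bytes object in 3.x; every element is 0..255.
raw = os.urandom(4)
assert all(0 <= b <= 0xFF for b in raw)

# A fixed bytes value, so the repr is predictable: two hex digits per byte.
data = bytes([0, 127, 255])
bytes_repr = repr(data)
print(bytes_repr)          # b'\x00\x7f\xff'

# A str character whose code point does not fit in 8 bits.
s = chr(300)               # code point 0x12c
str_escaped = ascii(s)     # ascii() escapes non-ASCII characters
print(str_escaped)         # '\u012c'
```

The \xhh form is therefore always enough for b'' strings, but not for arbitrary characters in a 3.x str.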
I don't usually use 3.1, but I was curious to discover that repr() won't
display a string with an arbitrary Unicode character in it.
++a = chr(300)
++repr(a)
  File "c:\progfiles\python31\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u012c' in position 2: character maps to <undefined>
I would have expected it to use something like:
"\x012c"
I realize that it can't produce a pair of bytes without a (non-ASCII)
encoding, but it doesn't make sense to me that repr() doesn't display
something reasonable, like hex. FWIW, my sys.stdout.encoding is cp437.
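For what it's worth, in 3.x repr() of a str keeps printable non-ASCII characters as they are, and the failure occurs when the interactive interpreter writes that result to a stdout that uses cp437. The following is a rough sketch, assuming Python 3.x, of two standard ways to get an escaped form that any console can display: the ascii() builtin and the "backslashreplace" error handler. The variable names are only illustrative.

```python
a = chr(300)

# Encoding to cp437 directly fails, because cp437 has no mapping
# for U+012C. This is the same error the console write runs into.
try:
    a.encode("cp437")
    failed = False
except UnicodeEncodeError:
    failed = True
print(failed)              # True

# ascii() behaves like repr() but escapes every non-ASCII character.
escaped = ascii(a)
print(escaped)             # '\u012c'

# The backslashreplace handler substitutes an escape for anything
# the target encoding cannot represent, instead of raising.
fallback = a.encode("cp437", "backslashreplace")
print(fallback)            # b'\\u012c'
```

Either form yields pure ASCII output, so it does not depend on sys.stdout.encoding.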
DaveA