String changing size on failure?
Ned Batchelder
ned at nedbatchelder.com
Wed Nov 1 16:34:20 EDT 2017
On 11/1/17 4:17 PM, MRAB wrote:
> On 2017-11-01 19:26, Ned Batchelder wrote:
>> From David Beazley
>> (https://twitter.com/dabeaz/status/925787482515533830):
>>
>> >>> a = 'n'
>> >>> b = 'ñ'
>> >>> sys.getsizeof(a)
>> 50
>> >>> sys.getsizeof(b)
>> 74
>> >>> float(b)
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> ValueError: could not convert string to float: 'ñ'
>> >>> sys.getsizeof(b)
>> 77
>>
>> Huh?
>>
> It's all explained in PEP 393.
>
> It's creating an additional representation (UTF-8 + zero-byte
> terminator) of the value and is caching that, so there'll then be the
> bytes for 'ñ' and the bytes for the UTF-8 (0xC3 0xB1 0x00).
>
> When the string is ASCII, the bytes of the UTF-8 representation are
> identical to those of the original string, so it can share them.
That explains why b is larger than a to begin with, but it doesn't
explain why float(b) is changing the size of b.
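
For anyone who wants to try this themselves, here is a minimal sketch that
measures a string before and after a failed float() call. The helper name
size_delta_after_float is made up for this post, and the numbers are
CPython implementation details that vary by version and build, so treat
the output as something to observe rather than rely on. In the session
above the growth was 3 bytes (74 to 77), which matches the two UTF-8
bytes plus the zero terminator that MRAB describes.

```python
import sys


def size_delta_after_float(s):
    """Return (before, after) sys.getsizeof(s) around a float(s) attempt.

    Hypothetical helper for illustration only; not part of any library.
    """
    before = sys.getsizeof(s)
    try:
        float(s)
    except ValueError:
        pass  # expected for non-numeric text such as 'ñ'
    after = sys.getsizeof(s)
    return before, after


for text in ('n', 'ñ'):
    before, after = size_delta_after_float(text)
    # Sizes are CPython-specific; a nonzero delta suggests that extra
    # data (such as a cached UTF-8 copy) is now stored on the object.
    print(repr(text), before, after, after - before)
```

If something earlier in the process has already touched the same string
object, the "before" figure may already include the extra bytes, so a
delta of zero on a second run is not surprising.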
--Ned.