String changing size on failure?

MRAB python at mrabarnett.plus.com
Wed Nov 1 16:17:41 EDT 2017


On 2017-11-01 19:26, Ned Batchelder wrote:
>   From David Beazley (https://twitter.com/dabeaz/status/925787482515533830):
> 
>       >>> a = 'n'
>       >>> b = 'ñ'
>       >>> sys.getsizeof(a)
>      50
>       >>> sys.getsizeof(b)
>      74
>       >>> float(b)
>      Traceback (most recent call last):
>         File "<stdin>", line 1, in <module>
>      ValueError: could not convert string to float: 'ñ'
>       >>> sys.getsizeof(b)
>      77
> 
> Huh?
> 
It's all explained in PEP 393.

The float() call makes CPython create an additional representation of
the value (UTF-8 + zero-byte terminator) and cache it on the string
object, so there'll then be the bytes for 'ñ' and the bytes for the
UTF-8 (0xC3 0xB1 0x00). Those 3 extra bytes are the difference between
74 and 77.

When the string is ASCII, the bytes of the UTF-8 representation are
identical to those of the original string, so it can share them.
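
A rough sketch of that arithmetic, in case it helps. utf8_cache_cost is
a made-up helper for illustration, not anything in CPython; it only
assumes the rule described above (ASCII strings share their bytes,
anything else needs a separate zero-terminated UTF-8 copy). Whether a
given call actually triggers the caching depends on the CPython version.

```python
import sys

def utf8_cache_cost(s):
    """Hypothetical helper: extra bytes a cached UTF-8 copy of s would need.

    Assumes the PEP 393 rule described above: an ASCII string shares its
    existing bytes, anything else needs a separate NUL-terminated copy.
    """
    if all(ord(ch) < 128 for ch in s):
        return 0  # ASCII: stored bytes already are the UTF-8 bytes
    return len(s.encode('utf-8')) + 1  # UTF-8 bytes plus the zero terminator

a = 'n'
b = 'ñ'

# 'ñ' is U+00F1, which UTF-8 encodes as the two bytes 0xC3 0xB1.
print(b.encode('utf-8'))      # b'\xc3\xb1'
print(utf8_cache_cost(a))     # 0
print(utf8_cache_cost(b))     # 3, the same as 77 - 74 in the session above

# Actual sizes depend on the CPython version and build, so no output is shown.
print(sys.getsizeof(a), sys.getsizeof(b))
```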


