String changing size on failure?
MRAB
python at mrabarnett.plus.com
Wed Nov 1 16:17:41 EDT 2017
On 2017-11-01 19:26, Ned Batchelder wrote:
> From David Beazley (https://twitter.com/dabeaz/status/925787482515533830):
>
> >>> a = 'n'
> >>> b = 'ñ'
> >>> sys.getsizeof(a)
> 50
> >>> sys.getsizeof(b)
> 74
> >>> float(b)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: could not convert string to float: 'ñ'
> >>> sys.getsizeof(b)
> 77
>
> Huh?
>
It's all explained in PEP 393.
The float() call is creating an additional representation (UTF-8 +
zero-byte terminator) of the value and caching it on the string object,
so there'll then be the bytes for 'ñ' and the bytes for the UTF-8
(0xC3 0xB1 0x00). Those 3 cached bytes are the difference between 74
and 77.
When the string is ASCII, the bytes of the UTF-8 representation are
identical to those of the original string, so it can share them.
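
As a rough check of that arithmetic, here is a minimal sketch, assuming
CPython (sys.getsizeof is implementation-specific). The helper names
utf8_cache_cost and size_change_after_float are made up for illustration
and are not part of any library. The first computes what a cached UTF-8
copy would cost; the second measures what actually happens, which may be
0 on interpreter versions where float() no longer caches the UTF-8 on
the original string.

```python
import sys


def utf8_cache_cost(s):
    """Bytes a cached UTF-8 copy would add: the encoded length plus a
    NUL terminator, or 0 for ASCII strings, which can share their buffer."""
    if all(ord(ch) < 128 for ch in s):
        return 0
    return len(s.encode("utf-8")) + 1


def size_change_after_float(s):
    """Measure sys.getsizeof(s) before and after a failed float(s).
    Whether the size grows depends on the interpreter version."""
    before = sys.getsizeof(s)
    try:
        float(s)
    except ValueError:
        pass  # the failure is expected; only the size change matters here
    return sys.getsizeof(s) - before


print(utf8_cache_cost("n"))  # 0
print(utf8_cache_cost("ñ"))  # 3, i.e. 0xC3 0xB1 0x00
print(size_change_after_float("ñ"))  # version-dependent
```

On the 3.6-era interpreter in the quoted session, the measured change
for 'ñ' was the 3 bytes that utf8_cache_cost predicts.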