Flexible string representation, unicode, typography, ...
MRAB
python at mrabarnett.plus.com
Thu Aug 23 11:11:05 EDT 2012
On 23/08/2012 14:57, Neil Hodgson wrote:
> wxjmfauth at gmail.com:
>
>> Small illustration. Take an A4 page containing 50 lines of 80 ASCII
>> characters, add a single 'EM DASH' or a 'BULLET' (code points > 0x2000),
>> and you will see all the optimization efforts destroyed.
>>
>> >>> sys.getsizeof('a' * 80 * 50)
>> 4025
>> >>> sys.getsizeof('a' * 80 * 50 + '•')
>> 8040
>
> This example still benefits from halving the number of bytes
> compared with the 32 bits per character used in Python 3.2:
>
> >>> sys.getsizeof('a' * 80 * 50)
> 16032
> >>> sys.getsizeof('a' * 80 * 50 + '•')
> 16036
> >>>
>
Perhaps the solution should've been to just switch between 2/4 bytes
instead of 1/2/4 bytes. :-)
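
For anyone who wants to check which width a given string ends up with, here is a rough sketch. It assumes CPython 3.3 or later, where the flexible representation applies, and `bytes_per_char` is just a made-up helper name, not something from the standard library. The idea is to compare the size of a string with the size of the same string doubled, so the fixed object overhead cancels out and only the per-character storage is left.

```python
import sys


def bytes_per_char(text):
    """Estimate storage per character for `text` (hypothetical helper).

    Assumes CPython 3.3+: the object header is the same for `text` and
    `text * 2`, so the size difference is the storage for len(text)
    extra characters.
    """
    return (sys.getsizeof(text * 2) - sys.getsizeof(text)) / len(text)


page = 'a' * 80 * 50  # 50 lines of 80 ASCII characters

ascii_width = bytes_per_char(page)                 # ASCII only
bmp_width = bytes_per_char(page + '\u2022')        # one BULLET added
astral_width = bytes_per_char(page + '\U0001F600') # one non-BMP character added

print(ascii_width, bmp_width, astral_width)
```

On such a build the three values should come out as 1, 2 and 4 bytes per character: a single character outside Latin-1 widens the whole string, and a single character outside the BMP widens it again.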