String performance regression from python 3.2 to 3.3
tjreedy at udel.edu
Thu Mar 14 03:35:44 CET 2013
On 3/13/2013 7:43 PM, Chris Angelico wrote:
> On Thu, Mar 14, 2013 at 3:49 AM, rusi <rustompmody at gmail.com> wrote:
>> This assumes that there are only three choices:
>> - narrow build that is buggy (surrogate pairs for astral characters)
>> - wide build that is 4-fold space inefficient for a wide variety of
>> common (ASCII) use-cases
>> - flexible string engine that chooses a small tradeoff of space
>> efficiency over time efficiency.
Wrong. Python almost certainly runs faster with the new string
representation. This has been explained previously more than once.
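
For anyone who wants to see the space side of the flexible representation, here is a minimal sketch. It assumes CPython 3.3 or later, it measures memory only and says nothing about speed, and per_char_bytes is a helper name made up for this illustration.

```python
import sys

def per_char_bytes(ch, n=1000):
    """Estimate storage per character by differencing two string sizes.

    Hypothetical helper, not part of any library. Subtracting the sizes
    of two strings of the same kind cancels the fixed object header.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

ascii_width = per_char_bytes("a")            # ASCII only
bmp_width = per_char_bytes("\u0394")         # GREEK CAPITAL LETTER DELTA, inside the BMP
astral_width = per_char_bytes("\U0001F600")  # outside the BMP (astral)

# On CPython 3.3+ the widths are 1, 2 and 4 bytes per character.
print(ascii_width, bmp_width, astral_width)
```

The width is chosen per string from its largest code point, so pure-ASCII text does not pay the 4-byte cost of an old wide build.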
>> There is a fourth choice: narrow build that chooses to be partial over
>> being buggy, i.e. when an astral character is encountered, an exception
>> is thrown rather than trying to fudge it into a 16-bit
This is what tcl/tk does, and it is a damned nuisance. Completely
unacceptable for Python's string type.
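
To make the two narrow-build behaviours concrete, here is a rough sketch on Python 3.3 or later. The UTF-16 encoding stands in for what a 16-bit build would store, and require_bmp is a hypothetical function that models the proposed "partial" behaviour; neither is taken from CPython or tcl/tk source.

```python
ch = "\U0001F600"  # one astral code point (outside the BMP)

# In UTF-16 an astral character takes two 16-bit code units:
# a high surrogate followed by a low surrogate.
units = ch.encode("utf-16-le")
code_units = len(units) // 2
high = int.from_bytes(units[0:2], "little")
low = int.from_bytes(units[2:4], "little")

def require_bmp(s):
    """Hypothetical check: reject any character that needs more than 16 bits."""
    for c in s:
        if ord(c) > 0xFFFF:
            raise ValueError("astral character U+%04X not supported" % ord(c))
    return s

try:
    require_bmp("abc" + ch)
    rejected = False
except ValueError:
    rejected = True

print(len(ch), code_units, hex(high), hex(low), rejected)
```

With the flexible representation len(ch) is 1, whereas a narrow build would have reported the two code units, and the "partial" option would refuse the string outright.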
> It's complexity cost, though, and people would need to know when it
> would be worth giving Python that switch to change its string format.
> Plus, every C extension would need to cope with both formats. I
> personally doubt it'd be worth it, but if you want to knock together a
> patched CPython and get some timing stats, I'm sure this list or
> python-dev will be happy to discuss the matter. :)
I presume the smiley indicates that you know that python developers are
too busy with real problems to have any interest in bogus solutions to
Terry Jan Reedy