28 Oct
2020
12:20 a.m.
Thanks for your very informative reply. I replied to you in issue41486. It may be that memory blocks will not bring any performance improvement to _PyBytesWriter/_PyUnicodeWriter, which is a bit frustrating.
For a+b, Python first computes "a", then "b", and finally "a+b". I don't see how your API could optimize such code.
I mean this situation:

    s = 'a' * 100_000_000 + '\uABCD'
    b = s.encode('utf-8')
    b.decode('utf-8')  # <- this situation

I realize I was wrong: the UCS1->UCS2 transformation is only done once, so memory blocks would only save one memcpy(). Even in this case they would only save two memcpy() calls:

    s = 'a' * 100_000_000 + '\uABCD' * 100_000_000 + '\U00012345'
    b = s.encode('utf-8')
    b.decode('utf-8')
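For anyone who wants to see the widening from the Python side, here is a rough sketch. It assumes CPython's PEP 393 string layout (1, 2 or 4 bytes per character); bytes_per_char is a hypothetical helper written for this post, not a CPython API, and the strings are kept small so it runs quickly. It only shows which storage width the decoded result ends up with, it does not measure the memcpy() cost.

```python
import sys


def bytes_per_char(s: str) -> int:
    """Estimate CPython's per-character storage width for s.

    Hypothetical helper: s + s has the same kind as s, so the fixed
    object header cancels out when the two sizes are subtracted.
    """
    doubled = s + s
    return (sys.getsizeof(doubled) - sys.getsizeof(s)) // len(s)


# Small stand-ins for the 100_000_000-character strings above.
ucs1 = 'a' * 1000                 # only characters below U+0100
ucs2 = ucs1 + '\uABCD'            # one BMP character forces the wider kind
ucs4 = ucs2 + '\U00012345'        # one astral character forces the widest kind

for s in (ucs1, ucs2, ucs4):
    b = s.encode('utf-8')
    # The decoder starts narrow and has to widen its buffer when it
    # meets the first character that does not fit.
    assert b.decode('utf-8') == s
    print(len(s), bytes_per_char(s))
```

Each widening happens at most once per kind change while decoding, which is why at most two extra copies are at stake in the second example.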