[New-bugs-announce] [issue25353] Use _PyBytesWriter for unicode escape and raw unicode escape encoders

STINNER Victor report at bugs.python.org
Fri Oct 9 14:13:42 CEST 2015


New submission from STINNER Victor:

The attached patch modifies the unicode escape and raw unicode escape encoders to use the new _PyBytesWriter API.

The patch is optimized for Latin1 strings: when no character needs to be escaped, encoding a Latin1 string should not have to call _PyBytes_Resize() at all.
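
To illustrate why the unescaped case needs no resize, here is a small Python-level sketch of the encoders' observable behaviour (not of the patch itself; the sample strings are made up for illustration). When nothing is escaped, the output has exactly one byte per input character, so a buffer sized to the input length is already large enough.

```python
# Printable ASCII is copied through unchanged by both codecs.
text = "hello world"
escaped = text.encode("unicode_escape")
raw = text.encode("raw_unicode_escape")
print(len(text), len(escaped), len(raw))

# raw_unicode_escape also passes non-ASCII Latin1 characters
# through as single bytes, without escaping them.
latin1 = "caf\xe9"
raw_latin1 = latin1.encode("raw_unicode_escape")
print(raw_latin1)
```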

When characters are escaped, or when a BMP or non-BMP string is encoded, overallocation is used to reduce the number of calls to _PyBytes_Resize(). The patch uses the _PyBytesWriter overallocation strategy instead of always overallocating for the worst case.
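
For a sense of what "the worst case" means here, the following Python sketch measures how many output bytes a single character can produce with the unicode escape codec. The sample characters are arbitrary picks, one per class; this message does not say which exact worst-case factor the previous code allocated for.

```python
# One arbitrary sample character per class.
samples = {
    "ASCII control": "\n",        # encoded as \n
    "Latin1 non-ASCII": "\xe9",   # encoded as \xe9
    "BMP": "\u20ac",              # encoded as \u20ac
    "non-BMP": "\U0001f600",      # encoded as \U0001f600
}

# Number of output bytes produced for each single input character.
sizes = {name: len(ch.encode("unicode_escape")) for name, ch in samples.items()}
for name, size in sizes.items():
    print(name, size)
```

Since one character can expand to up to 10 bytes while plain ASCII text expands to 1 byte per character, sizing the buffer for the worst case up front wastes most of the allocation on typical input.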

_PyBytesWriter also embeds a small buffer allocated on the stack, which avoids calls to _PyBytes_Resize() when the output fits into 512 bytes.
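
The combined effect of the small buffer and overallocation can be modelled with a toy writer in Python. This is only a rough sketch: ToyWriter, encode_with and the 25% growth factor are assumptions made up for illustration, not the real _PyBytesWriter API or its actual growth policy. The model just counts how often a resize would be needed.

```python
class ToyWriter:
    """Toy model: a fixed small buffer first, then a growable buffer.

    Hypothetical class for illustration; not the real _PyBytesWriter.
    """

    def __init__(self, small_size=512, overallocate=True):
        self.capacity = small_size    # room available before any resize
        self.overallocate = overallocate
        self.buf = bytearray()
        self.resizes = 0              # number of simulated resize calls

    def write(self, data):
        needed = len(self.buf) + len(data)
        if needed > self.capacity:
            if self.overallocate:
                # Assumed policy: reserve 25% extra room on each resize.
                self.capacity = needed + needed // 4
            else:
                self.capacity = needed
            self.resizes += 1
        self.buf += data

    def finish(self):
        return bytes(self.buf)


def encode_with(writer, text):
    # Encode one character at a time, as an incremental encoder would.
    for ch in text:
        writer.write(ch.encode("unicode_escape"))
    return writer.finish()


# Small output (250 bytes): fits in the initial 512-byte buffer.
small = ToyWriter()
small_text = "abc\n" * 50
out_small = encode_with(small, small_text)

# Large output (6000 bytes): compare exact growth with overallocation.
big_text = "\u20ac" * 1000
exact = ToyWriter(overallocate=False)
over = ToyWriter(overallocate=True)
out_exact = encode_with(exact, big_text)
out_over = encode_with(over, big_text)

print(small.resizes, exact.resizes, over.resizes)
```

In this model the small output never triggers a resize, and overallocation cuts the resize count for the large output from one per write to a handful.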

----------
files: unicode_escape.patch
keywords: patch
messages: 252599
nosy: haypo, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Use _PyBytesWriter for unicode escape and raw unicode escape encoders
type: performance
versions: Python 3.6
Added file: http://bugs.python.org/file40727/unicode_escape.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25353>
_______________________________________

