[New-bugs-announce] [issue16311] Use _PyUnicodeWriter API in text decoders

STINNER Victor report at bugs.python.org
Wed Oct 24 20:38:21 CEST 2012


New submission from STINNER Victor:

The attached patch modifies the text decoders to use the _PyUnicodeWriter API in order to factor out duplicated code. It removes the unicode_widen() and unicode_putchar() functions.

 * Don't overallocate by default (except for the "raw-unicode-escape"
   codec); enable overallocation on the first decode error (as is done
   currently)
 * _PyUnicodeWriter_Prepare() only overallocates by 25%, instead of 100%
   for unicode_decode_call_errorhandler()
 * Use _PyUnicodeWriter_Prepare() + PyUnicode_WRITE() (two macros)
   instead of unicode_putchar() (a function)
 * The _PyUnicodeWriter structure stores many useful fields, so we don't
   have to pass multiple parameters to functions, only the writer
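
As a rough illustration of the allocation policy described in the list above, here is a toy model in Python. It is only a sketch and not the CPython implementation: SketchWriter, its methods, and decode_ascii_backslashreplace() are hypothetical names invented for this example, and the real API lives in C. The model assumes the output is first sized exactly, and that 25% headroom is added only once a decode error has made the final length unpredictable.

```python
class SketchWriter:
    """Toy growable output buffer (hypothetical; not _PyUnicodeWriter)."""

    def __init__(self):
        self.buffer = []           # allocated slots; None means unused
        self.pos = 0               # number of characters written so far
        self.overallocate = False  # off by default, as in the list above
        self.reallocations = 0     # how many times the buffer was grown

    def prepare(self, length):
        """Make sure there is room for `length` more characters."""
        needed = self.pos + length
        if needed <= len(self.buffer):
            return
        new_size = needed
        if self.overallocate:
            new_size += new_size // 4  # assumed policy: 25% headroom
        self.buffer.extend([None] * (new_size - len(self.buffer)))
        self.reallocations += 1

    def write_str(self, text):
        """Reserve space once, then store each character."""
        self.prepare(len(text))
        for ch in text:
            self.buffer[self.pos] = ch
            self.pos += 1

    def finish(self):
        return "".join(self.buffer[:self.pos])


def decode_ascii_backslashreplace(data):
    """Decode ASCII bytes; invalid bytes become a 4-character \\xNN escape."""
    writer = SketchWriter()
    writer.prepare(len(data))  # exact size: one character per valid byte
    for byte in data:
        if byte < 0x80:
            writer.write_str(chr(byte))
        else:
            # First decode error: the output may now outgrow the input,
            # so switch overallocation on from here.
            writer.overallocate = True
            writer.write_str("\\x%02x" % byte)
    return writer.finish(), writer.reallocations
```

Only the writer object is passed around here, which mirrors the last item of the list: the buffer, the write position, and the overallocation flag travel together instead of as separate parameters. For valid input the buffer is allocated exactly once; after an error, the headroom lets some later writes proceed without another reallocation.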

I wrote the patch to factor out common code, but it might also make decoding faster.

----------
files: codecs_writer.patch
keywords: patch
messages: 173695
nosy: haypo
priority: normal
severity: normal
status: open
title: Use _PyUnicodeWriter API in text decoders
type: performance
versions: Python 3.4
Added file: http://bugs.python.org/file27697/codecs_writer.patch

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue16311>
_______________________________________

