[Python-Dev] cpython: PyUnicode_Join() checks output length in debug mode
Georg Brandl
g.brandl at gmx.net
Tue Oct 4 23:41:54 CEST 2011
On 10/03/11 23:35, victor.stinner wrote:
> http://hg.python.org/cpython/rev/bfd8b5d35f9c
> changeset: 72623:bfd8b5d35f9c
> user: Victor Stinner <victor.stinner at haypocalc.com>
> date: Mon Oct 03 23:36:02 2011 +0200
> summary:
> PyUnicode_Join() checks output length in debug mode
>
> PyUnicode_CopyCharacters() may copy fewer characters than the requested size if
> the input string is smaller than the argument. (This is very unlikely, but who
> knows!?)
>
> Also avoid calling PyUnicode_CopyCharacters() if the string is empty.
>
> files:
> Objects/unicodeobject.c | 34 +++++++++++++++++++---------
> 1 files changed, 23 insertions(+), 11 deletions(-)
>
>
> diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
> --- a/Objects/unicodeobject.c
> +++ b/Objects/unicodeobject.c
> @@ -8890,20 +8890,32 @@
>
> /* Catenate everything. */
> for (i = 0, res_offset = 0; i < seqlen; ++i) {
> - Py_ssize_t itemlen;
> + Py_ssize_t itemlen, copied;
> item = items[i];
> + /* Copy item, and maybe the separator. */
> + if (i && seplen != 0) {
> + copied = PyUnicode_CopyCharacters(res, res_offset,
> + sep, 0, seplen);
> + if (copied < 0)
> + goto onError;
> +#ifdef Py_DEBUG
> + res_offset += copied;
> +#else
> + res_offset += seplen;
> +#endif
> + }
> itemlen = PyUnicode_GET_LENGTH(item);
> - /* Copy item, and maybe the separator. */
> - if (i) {
> - if (PyUnicode_CopyCharacters(res, res_offset,
> - sep, 0, seplen) < 0)
> + if (itemlen != 0) {
> + copied = PyUnicode_CopyCharacters(res, res_offset,
> + item, 0, itemlen);
> + if (copied < 0)
> goto onError;
> - res_offset += seplen;
> - }
> - if (PyUnicode_CopyCharacters(res, res_offset,
> - item, 0, itemlen) < 0)
> - goto onError;
> - res_offset += itemlen;
> +#ifdef Py_DEBUG
> + res_offset += copied;
> +#else
> + res_offset += itemlen;
> +#endif
> + }
> }
> assert(res_offset == PyUnicode_GET_LENGTH(res));
I don't understand this change. Why would you not always add "copied" once you
already have it? It seems to be the more correct version anyway.
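
To make the question concrete, here is a minimal standalone sketch of the loop
with the offset always advanced by the returned count, in both debug and
release builds. It works on plain C strings rather than the real unicode
objects; copy_chars() and join_chars() are hypothetical stand-ins invented for
illustration, not CPython functions, and the assumed behaviour of the copy
helper (return the number of characters actually copied, or -1 on error) is
taken only from the commit message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PyUnicode_CopyCharacters(): copies at most
   how_many characters from src into dst and returns the number actually
   copied, which may be less than requested if src is shorter.
   Returns -1 on error. */
static ptrdiff_t
copy_chars(char *dst, size_t dst_size, size_t to_start,
           const char *src, size_t from_start, size_t how_many)
{
    size_t src_len = strlen(src);

    if (from_start > src_len || to_start > dst_size)
        return -1;
    if (how_many > src_len - from_start)
        how_many = src_len - from_start;    /* source shorter than requested */
    if (how_many > dst_size - to_start)
        return -1;                          /* would overflow the destination */
    memcpy(dst + to_start, src + from_start, how_many);
    return (ptrdiff_t)how_many;
}

/* Hypothetical join: writes items separated by sep into dst.
   Returns the resulting length, or -1 on error. */
static ptrdiff_t
join_chars(char *dst, size_t dst_size, const char *sep,
           const char *const *items, size_t seqlen)
{
    size_t res_offset = 0;
    size_t seplen = strlen(sep);
    size_t i;

    for (i = 0; i < seqlen; ++i) {
        ptrdiff_t copied;
        size_t itemlen;

        /* Copy the separator before every item except the first. */
        if (i && seplen != 0) {
            copied = copy_chars(dst, dst_size, res_offset, sep, 0, seplen);
            if (copied < 0)
                return -1;
            res_offset += (size_t)copied;   /* always trust the real count */
        }
        itemlen = strlen(items[i]);
        if (itemlen != 0) {
            copied = copy_chars(dst, dst_size, res_offset,
                                items[i], 0, itemlen);
            if (copied < 0)
                return -1;
            res_offset += (size_t)copied;   /* no #ifdef needed */
        }
    }
    if (res_offset >= dst_size)
        return -1;                          /* no room for the terminator */
    dst[res_offset] = '\0';
    return (ptrdiff_t)res_offset;
}
```

With this shape the offset can never drift away from what was really written,
and a final assert on the total length still catches a short copy in debug
builds.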
Georg