Mailman 3 Modifying the PyUnicode_FromUnicode result - Python-Dev - python.org

Modifying the PyUnicode_FromUnicode result

Martin v. Loewis

21 Apr 2001

2:51 a.m.

Currently, a number of routines assume that the result of PyUnicode_FromUnicode can be modified, i.e. they mutate the resulting unicode object. Look at unicodeobject.c:fixup for an example, and assume that fixfct is fixtitle (*).

This is different from PyString_FromStringAndSize, whose result can only be modified if the str argument is NULL.

These routines broke after I applied my caching patch, since PyUnicode_FromUnicode may now return an existing string.

Is that difference intentional? My feeling is that it is an error to modify a unicode object, unless it is known not to be initialized.

Regards,
Martin

P.S. This was actually the first failure case when running test_unicodedata under my patch.
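The failure described above can be reproduced in miniature. The following is only a toy model in plain C, not CPython code: ToyStr, toy_from_data and toy_upper_in_place are hypothetical names, and the policy of caching one-byte strings is an assumption made for illustration (the post does not say what the caching patch caches). It shows how a helper that mutates the factory's result starts corrupting shared objects once the factory is allowed to return an existing one.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an immutable string object (NOT the CPython struct). */
typedef struct {
    size_t length;
    char *data;
} ToyStr;

/* Hypothetical cache: one shared object per one-byte string. */
static ToyStr *toy_cache[256];

/* Always allocates a new object; copies from u when u is non-NULL. */
static ToyStr *toy_alloc(const char *u, size_t length)
{
    ToyStr *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->data = calloc(length + 1, 1);   /* zero-filled, NUL-terminated */
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    if (u != NULL)
        memcpy(s->data, u, length);
    s->length = length;
    return s;
}

/* Caching factory: may hand back an existing, shared object. */
ToyStr *toy_from_data(const char *u, size_t length)
{
    if (u != NULL && length == 1) {
        unsigned char c = (unsigned char)u[0];
        if (toy_cache[c] == NULL)
            toy_cache[c] = toy_alloc(u, 1);
        return toy_cache[c];
    }
    return toy_alloc(u, length);
}

/* Mutates its argument in place, which is only safe on a fresh object. */
void toy_upper_in_place(ToyStr *s)
{
    size_t i;
    assert(s != NULL);
    for (i = 0; i < s->length; i++)
        s->data[i] = (char)toupper((unsigned char)s->data[i]);
}
```

Calling toy_from_data("a", 1) twice yields the same pointer in this model, so uppercasing one result in place also changes every other holder of that object, including what the cache returns for "a" from then on.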

M.-A. Lemburg

21 Apr

3:43 a.m.

"Martin v. Loewis" wrote:

> Currently, a number of routines assume that the result of PyUnicode_FromUnicode can be modified, i.e. they mutate the resulting unicode object. Look at unicodeobject.c:fixup for an example, and assume that fixfct is fixtitle (*).

This is true for the APIs in unicodeobject.c: as long as the newly created object hasn't "left" the Unicode implementation, the APIs in there are free to modify the otherwise immutable object.

> This is different from PyString_FromStringAndSize, whose result can only be modified if the str argument is NULL.
>
> These routines broke after I applied my caching patch, since now PyUnicode_FromUnicode may return an existing string.
>
> Is that difference intentional? My feeling is that it is an error to modify a unicode object, unless it is known not to be initialized.

It is an error, but only for code outside the implementation, i.e. programs using the public API may only modify the contents when calling PyUnicode_FromUnicode() with NULL as the u argument.

Sorry for not remembering this when reviewing your patch on SF.

--
Marc-Andre Lemburg
Company & Consulting: http://www.egenix.com/
Python Pages: http://www.lemburg.com/python/

Martin v. Loewis

5:07 a.m.

> This is true for the APIs in unicodeobject.c: as long as the newly created object hasn't "left" the Unicode implementation, the APIs in there are free to modify the otherwise immutable object.

That means that PyUnicode_FromUnicode does give a guarantee to return a fresh object, right? Then I cannot understand why it only gives this guarantee to callers inside unicodeobject.c, and not to other callers...

Regards,
Martin

M.-A. Lemburg

6:15 a.m.

"Martin v. Loewis" wrote:

...
> > This is true for the APIs in unicodeobject.c: as long as the newly created object hasn't "left" the Unicode implementation, the APIs in there are free to modify the otherwise immutable object.
>
> That means that PyUnicode_FromUnicode does give a guarantee to return a fresh object, right?

Let's put it this way: the internals in unicodeobject.c are allowed to modify the contents of the object even if it was prefilled with data that came from an initializer. External callers are not allowed to do this, though, unless u is set to NULL (just like in the corresponding string call).

> Then I cannot understand why it only gives this guarantee to callers inside unicodeobject.c, and not to other callers...

Because I want to reserve the right to change the semantics *inside* unicodeobject.c at some later point. Note that currently no caching of Unicode objects takes place, but this could change in the future, and indeed your patch is a start in this direction.

--
Marc-Andre Lemburg

Martin v. Loewis

6:25 a.m.

> Because I want to reserve the right to change the semantics *inside* unicodeobject.c at some later point. Note that currently no caching of Unicode objects takes place, but this could change in the future, and indeed your patch is a start in this direction.

So would you accept a patch that corrects all calls to PyUnicode_FromUnicode which modify the result they get, without having passed a NULL str argument?

Regards,
Martin
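A sketch of what such a correction could look like, again as a toy model in plain C rather than the actual patch, which this thread does not show. ToyStr, toy_new and toy_upper_copy are hypothetical names, and the one-byte cache is an assumption for illustration. The idea is the contract discussed above: ask the factory for an object with a NULL initializer, which is always fresh and writable, fill it, and only then hand it out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an immutable string object (NOT the CPython struct). */
typedef struct {
    size_t length;
    char *data;
} ToyStr;

/* Hypothetical cache of shared one-byte strings. */
static ToyStr *toy_cache[256];

/* A NULL initializer always yields a fresh, writable object;
   a non-NULL initializer may yield a shared, cached object. */
ToyStr *toy_new(const char *u, size_t length)
{
    ToyStr *s;
    if (u != NULL && length == 1 && toy_cache[(unsigned char)u[0]] != NULL)
        return toy_cache[(unsigned char)u[0]];
    s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->data = calloc(length + 1, 1);   /* zero-filled, NUL-terminated */
    if (s->data == NULL) {
        free(s);
        return NULL;
    }
    s->length = length;
    if (u != NULL) {
        memcpy(s->data, u, length);
        if (length == 1)
            toy_cache[(unsigned char)u[0]] = s;   /* shared from now on */
    }
    return s;
}

/* Corrected pattern: request an unfilled object, write into it, return it.
   The source object is never modified. */
ToyStr *toy_upper_copy(const ToyStr *src)
{
    ToyStr *dst;
    size_t i;
    assert(src != NULL);
    dst = toy_new(NULL, src->length);   /* never a cached object */
    if (dst == NULL)
        return NULL;
    for (i = 0; i < src->length; i++)
        dst->data[i] = (char)toupper((unsigned char)src->data[i]);
    return dst;
}
```

With this shape, toy_upper_copy(toy_new("a", 1)) returns a distinct object holding "A", while the cached "a" stays intact for every other caller.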

M.-A. Lemburg

6:37 a.m.

"Martin v. Loewis" wrote:

...
> > Because I want to reserve the right to change the semantics *inside* unicodeobject.c at some later point. Note that currently no caching of Unicode objects takes place, but this could change in the future, and indeed your patch is a start in this direction.
>
> So would you accept a patch that corrects all calls to PyUnicode_FromUnicode which modify the result they get, without having passed a NULL str argument?

Yes :)

--
Marc-Andre Lemburg
