Re: [Python-Dev] cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage()

Antoine Pitrou

9 Dec 2011

6:05 a.m.

On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner <python-checkins@python.org> wrote:

...

+.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
+
+   Get a new copy of a Unicode object.
+
+   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable object?

"Martin v. Löwis"

9 Dec

2:14 p.m.

New subject: cpython: Document PyUnicode_Copy() and PyUnicode_EncodeCodePage()

Am 09.12.2011 01:35, schrieb Antoine Pitrou:

...

On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner <python-checkins@python.org> wrote:

...
+.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
+
+   Get a new copy of a Unicode object.
+
+   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable object?

It can convert a unicode subtype object into an exact unicode object. I'd rename it to _PyUnicode_AsExactUnicode, and undocument it.

Regards,
Martin
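For readers following the thread, the "subtype versus exact type" distinction mentioned here can be observed from Python. This is only a rough Python-level sketch, not a call to the C function under discussion. The class name `Tagged` is made up for illustration, and the sketch assumes CPython's behaviour that `str()` applied to a `str` subclass instance (with no `__str__` override) returns an exact `str`.

```python
class Tagged(str):
    """Hypothetical str subclass, used only for this illustration."""


original = Tagged("hello")

# Assumption: in CPython, str() of a str-subclass instance that does not
# override __str__ returns an object whose type is exactly str.
exact = str(original)

print(type(original) is str)  # the subclass instance is not an exact str
print(type(exact) is str)     # the converted object is
print(exact == original)      # the character data compares equal
```

The two objects compare equal; only their types differ, which is the property an "as exact unicode" helper would be about.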

Nick Coghlan

2:42 p.m.

On Fri, Dec 9, 2011 at 6:44 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:

...

Am 09.12.2011 01:35, schrieb Antoine Pitrou:

...
On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner <python-checkins@python.org> wrote:

...
+.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
+
+   Get a new copy of a Unicode object.
+
+   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable object?

It can convert a unicode subtype object into an exact unicode object.

I'd rename it to _PyUnicode_AsExactUnicode, and undocument it.

Isn't it basically just exposing a C level version of the unicode() builtin's behaviour? While I agree the name could be better (and PyUnicode_AsExactUnicode would certainly work), why make it private?

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Barry Warsaw

9:12 p.m.

On Dec 09, 2011, at 07:12 PM, Nick Coghlan wrote:

...

Isn't it basically just exposing a C level version of the unicode() builtin's behaviour? While I agree the name could be better (and PyUnicode_AsExactUnicode would certainly work), why make it private?

Don't we already have that in PyObject_Str(), or in Python 2, PyObject_Unicode()? -Barry

"Martin v. Löwis"

12 Dec

3:44 a.m.

Am 09.12.2011 10:12, schrieb Nick Coghlan:

...

On Fri, Dec 9, 2011 at 6:44 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:

...
Am 09.12.2011 01:35, schrieb Antoine Pitrou:

...
On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner <python-checkins@python.org> wrote:

...
+.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
+
+   Get a new copy of a Unicode object.
+
+   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable object?

It can convert a unicode subtype object into an exact unicode object.

I'd rename it to _PyUnicode_AsExactUnicode, and undocument it.

Isn't it basically just exposing a C level version of the unicode() builtin's behaviour?

No. To call the unicode() builtin, do PyObject_CallFunction(&PyUnicode_Type, "O", param) or some such. PyUnicode_Copy doesn't correspond to any Python-level API.
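The difference being pointed at can be sketched from Python: calling the type runs the object's `__str__`, whereas a plain copy would take the stored characters as they are. The class `Shouting` is hypothetical, and `"".join(value)` is used only as a stand-in for "copy the characters without consulting `__str__`"; it is not what the C function does internally.

```python
class Shouting(str):
    """Hypothetical str subclass that overrides __str__."""

    def __str__(self):
        return self.upper() + "!"


value = Shouting("abc")

# Calling the type dispatches to Shouting.__str__.
via_builtin = str(value)

# Stand-in for a raw copy: build an exact str from the stored characters.
raw_chars = "".join(value)

print(via_builtin)
print(raw_chars)
print(type(raw_chars) is str)
```

The two results differ as soon as a subclass customises `__str__`, which is why a copy operation and a call to the builtin are not interchangeable.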

...

While I agree the name could be better (and PyUnicode_AsExactUnicode would certainly work), why make it private?

I suggest being minimalistic in extensions to the API. There should be a demonstrated need for an API before adding it, which I don't see in this case. In general, it will be difficult to find a demonstrable need for new APIs, since the majority (more than 99%) of API use cases are already covered by the abstract object API (i.e. what ceval uses). The unicode type in particular has a bad tradition of adding tons of functions to the C API, only for us to find out a few releases later that the API is obsolete (e.g. needs additional/different parameters), so we carry unused functions around just because some extension module may use them.

Regards,
Martin

Victor Stinner

10 Dec

12:21 a.m.

On 09/12/2011 01:35, Antoine Pitrou wrote:

...

On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner<python-checkins@python.org> wrote:

...
+.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
+
+   Get a new copy of a Unicode object.
+
+   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable object?

PyUnicode_Copy() can be used to get a copy of a string that is then modified to build a new string of the same length. It is used, for example, by str.upper(), str.title(), ... (fixup()). It is also used by str.__getnewargs__(). I am not sure that the result of str.__getnewargs__() must be a copy of the string (s.__getnewargs__()[0] is not s). As mentioned by Martin, PyUnicode_Copy() is also used to get "an exact" Unicode object when you have a subtype. Maybe we can make the function private.

Victor
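The `__getnewargs__()` case can be inspected from Python. The sketch below uses a hypothetical subclass `Tagged` and checks only the type and value of the returned argument. Whether the returned string is the same object as the original for an exact `str` is an implementation detail, so the sketch does not rely on it.

```python
class Tagged(str):
    """Hypothetical str subclass, used only for this illustration."""


s = Tagged("hello")

# str.__getnewargs__() returns a one-element tuple holding the value.
args = s.__getnewargs__()

print(type(args[0]) is str)  # the argument is an exact str, not a Tagged
print(args[0] == s)          # with the same character data

# Rebuild an equivalent subclass instance from those arguments, roughly
# the way pickling protocols use __getnewargs__.
rebuilt = Tagged.__new__(Tagged, *args)
print(type(rebuilt) is Tagged)
```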

Antoine Pitrou

1:02 a.m.

On Fri, 09 Dec 2011 19:51:14 +0100 Victor Stinner <victor.stinner@haypocalc.com> wrote:

...

On 09/12/2011 01:35, Antoine Pitrou wrote:

...
On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner<python-checkins@python.org> wrote:

...
+.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
+
+   Get a new copy of a Unicode object.
+
+   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable object?

PyUnicode_Copy() can be used to modify a string to create a new string with the same length. It is used for example by str.upper(), str.title(), ... (fixup()).

Then the doc should mention that the returned string can be modified. Otherwise it's a bit obscure why the function exists. Regards Antoine.

"Martin v. Löwis"

12 Dec

4:14 a.m.

Am 09.12.2011 20:32, schrieb Antoine Pitrou:

...

On Fri, 09 Dec 2011 19:51:14 +0100 Victor Stinner <victor.stinner@haypocalc.com> wrote:

...
On 09/12/2011 01:35, Antoine Pitrou wrote:

...
On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner<python-checkins@python.org> wrote:

...
+.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
+
+   Get a new copy of a Unicode object.
+
+   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable object?

PyUnicode_Copy() can be used to modify a string to create a new string with the same length. It is used for example by str.upper(), str.title(), ... (fixup()).

Then the doc should mention that the returned string can be modified. Otherwise it's a bit obscure why the function exists.

I'm skeptical about this modification part. If you make a copy, it's not clear at all that the new characters that you put in will fit in range with the width of the unicode string. Even decreasing the ordinal of a character may be incorrect as the result may not be canonical anymore. Regards, Martin
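The "width" concern can be made visible with a small measurement. This sketch assumes CPython 3.3 or later, where a string's storage per character depends on the largest code point it contains. `sys.getsizeof` is implementation-specific, and the helper name `bytes_per_char` is made up for illustration.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character by comparing two string lengths.

    Subtracting the sizes cancels the fixed object header, leaving only
    the cost of the n extra characters.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


narrow = bytes_per_char("a")          # ASCII
medium = bytes_per_char("\u20ac")     # EURO SIGN, outside Latin-1
wide = bytes_per_char("\U0001F600")   # outside the Basic Multilingual Plane

print(narrow, medium, wide)
```

A copy inherits the width of its source, so writing a character that needs a wider slot into that copy would not fit. Conversely, lowering the largest character may leave the copy wider than a freshly built string with the same contents.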

Antoine Pitrou

4:16 a.m.

Le dimanche 11 décembre 2011 à 23:44 +0100, "Martin v. Löwis" a écrit :

...

Am 09.12.2011 20:32, schrieb Antoine Pitrou:

...
On Fri, 09 Dec 2011 19:51:14 +0100 Victor Stinner <victor.stinner@haypocalc.com> wrote:

...
On 09/12/2011 01:35, Antoine Pitrou wrote:

...
On Fri, 09 Dec 2011 00:16:02 +0100 victor.stinner<python-checkins@python.org> wrote:

...
+.. c:function:: PyObject* PyUnicode_Copy(PyObject *unicode)
+
+   Get a new copy of a Unicode object.
+
+   .. versionadded:: 3.3

I'm not sure I understand. Why would you make a copy of an immutable object?

PyUnicode_Copy() can be used to modify a string to create a new string with the same length. It is used for example by str.upper(), str.title(), ... (fixup()).

Then the doc should mention that the returned string can be modified. Otherwise it's a bit obscure why the function exists.

I'm skeptical about this modification part. If you make a copy, it's not clear at all that the new characters that you put in will fit in range with the width of the unicode string. Even decreasing the ordinal of a character may be incorrect as the result may not be canonical anymore.

Ah, good point. And perhaps a good reason to make the API private. Regards Antoine.

Victor Stinner

6:24 a.m.

Le vendredi 9 décembre 2011 20:32:16 Antoine Pitrou a écrit :

...

... it's a bit obscure why the function exists.

Yeah ok, I marked the function as private: renamed to _PyUnicode_Copy() and I undocumented it. Victor
