[issue10435] Document unicode C-API in reST

Alexander Belopolsky report at bugs.python.org
Mon Nov 22 20:00:59 CET 2010


Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

On Wed, Nov 17, 2010 at 5:20 PM, Marc-Andre Lemburg
<report at bugs.python.org> wrote:
..
> -/* Encodes a Unicode object and returns the result as Python string
> +/* Encodes a Unicode object and returns the result as Python bytes
>    object. */
>
>
> PyUnicode_AsEncodedObject() encodes the Unicode object to
> whatever the codec returns, so the "bytes" is wrong in the
> above line.
>

The above line describes PyUnicode_AsEncodedString(), not
PyUnicode_AsEncodedObject().  The former performs PyBytes_Check(v)
after calling v = PyCodec_Encode(..).  As far as I can tell, this
check is the only difference between the two, and so the only thing
that keeps PyUnicode_AsEncodedObject() from being redundant.
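
A rough Python-level sketch of that difference, assuming that
codecs.encode() corresponds to the unchecked path (it returns whatever
the codec returns) and that the checked path adds nothing but a bytes
type check.  The helper names encode_object and encode_to_bytes are
hypothetical stand-ins for the two C functions; this illustrates the
behaviour, it is not the C implementation.

```python
import codecs


def encode_object(text, encoding, errors="strict"):
    """Return whatever the codec's encoder produces (no type check)."""
    return codecs.encode(text, encoding, errors)


def encode_to_bytes(text, encoding, errors="strict"):
    """Same codec call, followed by a bytes check on the result."""
    result = codecs.encode(text, encoding, errors)
    if not isinstance(result, bytes):
        raise TypeError(
            "encoder did not return a bytes object (type=%s)"
            % type(result).__name__
        )
    return result


# utf-8 maps str to bytes, so the checked helper accepts the result.
utf8_result = encode_to_bytes("abc", "utf-8")

# rot_13 maps str to str, so only the unchecked helper succeeds.
rot13_result = encode_object("abc", "rot_13")
try:
    encode_to_bytes("abc", "rot_13")
    checked_failed = False
except TypeError:
    checked_failed = True
```

rot_13 is used here only because recent Python versions ship it as a
str-to-str codec; any codec whose encoder returns a non-bytes object
would take the same path.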

..
> +.. c:function:: PyObject* PyUnicode_AsDecodedObject(PyObject *unicode, const char *encoding, const char *errors)
>
> +   Create a Unicode object by decoding the encoded Unicode object
> +   *unicode*.
>
> The function does not guarantee that a Unicode object will be
> returned. It merely passes a Unicode object to a codec's
> decode function and returns whatever the codec returns.
>

Good point.  I am changing "Unicode object" to "Python object".
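
To show why "Python object" is the safer wording, here is a small
sketch that uses codecs.decode() as a Python-level stand-in for
handing an object to a codec's decode function.  The codec name
demo_words and its helper functions are made up for this example and
are not part of Python.

```python
import codecs


def _decode_words(text, errors="strict"):
    # Hypothetical decoder: returns a list of words, not a str.
    return text.split(), len(text)


def _encode_words(words, errors="strict"):
    # Hypothetical encoder: joins the words back into one str.
    return " ".join(words), len(words)


def _search(name):
    # Codec search function; only answers for the made-up codec name.
    if name == "demo_words":
        return codecs.CodecInfo(
            encode=_encode_words,
            decode=_decode_words,
            name="demo_words",
        )
    return None


codecs.register(_search)

# The str is passed to the codec's decoder and its result is returned
# unchanged, whatever type it happens to be.
decoded = codecs.decode("a b c", "demo_words")
```

Nothing in the lookup or the call constrains the result type, so a
caller that needs a str has to check for it explicitly.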

..
> +   Note that Python codecs do not accept Unicode objects for decoding,
> +   so this method is only useful with user or 3rd party codecs.
>
> Please strike the last sentence. The codecs that were wrongly removed
> from Python3 will get added back and provide such functionality.
>

Would it be acceptable to keep this note, but add "as of version 3.2"
or something similar?  I don't think there is any chance that these
codecs will be added in 3.2, given the current schedule.

..
> This should read:
>
>   Decodes a Unicode object by passing the given Unicode object
>   *unicode* to the codec for *encoding*.
>   *encoding* and *errors* have the same meaning as the
>   parameters of the same name in the :func:`unicode` built-in
>   function.  The codec to be used is looked up using the Python codec
>   registry.  Return *NULL* if an exception was raised by the codec.
>

Is the following better?

"""
    Decodes a Unicode object by passing the given Unicode object
    *unicode* to the codec for *encoding*.  *encoding* and *errors*
    have the same meaning as the parameters of the same name in the
    :func:`unicode` built-in function. The codec to be used is
    looked up using the Python codec registry. Return *NULL* if an
    exception was raised by the codec.

    As of Python 3.2, this method is only useful with user or 3rd
    party codecs that encode strings into something other than bytes.
    For encoding to bytes, use :c:func:`PyUnicode_AsEncodedString`
    instead.
"""
..
>
> +.. c:function:: void PyUnicode_Append(PyObject **pleft, PyObject *right)
..
> +
> +.. c:function:: void PyUnicode_AppendAndDel(PyObject **pleft, PyObject *right)
..
>
> Please don't document these two obscure APIs. Instead we should
> make them private functions by prepending them with an underscore.
> If you look at the implementations of those two APIs, they
> are little more than macros around PyUnicode_Concat().
>

I don't agree that they are obscure.  Python uses them in multiple
places and developers seem to know about them.  See patches submitted
to issue4113 and issue7584.

> 3rd party extensions should use PyUnicode_Concat() to achieve
> the same effect.
>

Hmm.  I would not be surprised if current 3rd party extensions used
PyUnicode_AppendAndDel() more often than PyUnicode_Concat().  (I know
that I learned about PyUnicode_AppendAndDel()  before
PyUnicode_Concat().)

Is there anything that makes PyUnicode_AppendAndDel() undesirable?  I
don't mind adding a recommendation to use PyUnicode_Concat() if there
is a practical reason for it, or even a warning that
PyUnicode_AppendAndDel() may be deprecated in the future.  Renaming
it to _PyUnicode_AppendAndDel(), however, seems premature.

..
>
> I don't think it's a good idea to make this a public API.
> 3rd party extensions should not need to make use of such
> APIs.
>
> Instead, we should make this a private API.

I agree, but isn't it prudent to document it as deprecated for 3rd
party use first?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10435>
_______________________________________

