[issue15984] Wrong documentation for PyUnicode_FromObject()
New submission from Serhiy Storchaka: The documentation says that PyUnicode_FromObject() is a shortcut for PyUnicode_FromEncodedObject(). But PyUnicode_FromObject() does not call PyUnicode_FromEncodedObject(), either directly or indirectly. PyUnicode_FromObject() works only with unicode and unicode subclass objects, whereas PyUnicode_FromEncodedObject() does not work with unicode objects. ---------- assignee: docs@python components: Documentation messages: 170821 nosy: docs@python, storchaka priority: normal severity: normal status: open title: Wrong documentation for PyUnicode_FromObject() versions: Python 3.3 _______________________________________ Python tracker <report@bugs.python.org> <http://bugs.python.org/issue15984> _______________________________________
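The difference reported above can be checked from Python through ctypes. This is only a minimal sketch, assuming CPython 3 with the C API reachable via `ctypes.pythonapi`; the wrapper names `from_object`, `from_encoded_object` and `raises_type_error` are hypothetical helpers introduced for illustration and are not part of the issue or of CPython.

```python
import ctypes

# Bind the two C API functions discussed in this issue (CPython only).
_from_object = ctypes.pythonapi.PyUnicode_FromObject
_from_object.argtypes = [ctypes.py_object]
_from_object.restype = ctypes.py_object

_from_encoded = ctypes.pythonapi.PyUnicode_FromEncodedObject
_from_encoded.argtypes = [ctypes.py_object, ctypes.c_char_p, ctypes.c_char_p]
_from_encoded.restype = ctypes.py_object


def from_object(obj):
    """Hypothetical wrapper around PyUnicode_FromObject()."""
    return _from_object(obj)


def from_encoded_object(obj, encoding="utf-8", errors="strict"):
    """Hypothetical wrapper around PyUnicode_FromEncodedObject()."""
    return _from_encoded(obj, encoding.encode("ascii"), errors.encode("ascii"))


def raises_type_error(func, *args):
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False


# str is accepted by FromObject but rejected by FromEncodedObject;
# bytes is decoded by FromEncodedObject but rejected by FromObject.
print(from_object("abc"))
print(from_encoded_object(b"abc"))
print(raises_type_error(from_encoded_object, "abc"))
print(raises_type_error(from_object, b"abc"))
```

Under these assumptions the two functions accept disjoint kinds of input, which is why one cannot be a shortcut for the other.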
Changes by Serhiy Storchaka <storchaka@gmail.com>: ---------- stage: -> needs patch
Serhiy Storchaka added the comment: This is a bug left over from the 2 -> 3 transition. ---------- type: -> enhancement versions: +Python 3.2, Python 3.4
Changes by Serhiy Storchaka <storchaka@gmail.com>: ---------- keywords: +easy
Kyle Roberts added the comment: I made a change to the documentation to reflect PyUnicode_FromObject()'s change in implementation details. Let me know if the wording is off or more information is needed. Thanks! ---------- keywords: +patch nosy: +kyle.roberts Added file: http://bugs.python.org/file29702/from_object.patch
Brian Curtin added the comment: In the "Otherwise it coerces" sentence, obj should probably be ``obj``. ---------- nosy: +brian.curtin
Changes by Terry J. Reedy <tjreedy@udel.edu>: ---------- versions: -Python 3.2
R. David Murray added the comment: So (speaking from C API ignorance here), if you pass it a unicode subclass you get back an instance of the base unicode type? Is that what coercion means here? ---------- nosy: +r.david.murray
Kyle Roberts added the comment: Thanks for the quick responses. Brian: Nice catch, I'll add the ``obj`` shortly. R. David: You're correct, that's exactly what happens, and what coercion means here. The language is almost the same as PyUnicode_FromEncodedObject()'s documentation, but if it's unclear I don't mind changing both. The code's comment offers another way of describing the "type modification": /* For a Unicode subtype that's not a Unicode object, return a true Unicode object with the same data. */
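The subclass behaviour described in that code comment can be observed from Python. The following is a rough sketch, again assuming CPython 3 and `ctypes.pythonapi`; `MyStr` is a hypothetical subclass that exists only for this illustration.

```python
import ctypes

# Bind PyUnicode_FromObject() from the running interpreter (CPython only).
from_object = ctypes.pythonapi.PyUnicode_FromObject
from_object.argtypes = [ctypes.py_object]
from_object.restype = ctypes.py_object


class MyStr(str):
    """Hypothetical str subclass used only for this illustration."""


original = MyStr("spam")
converted = from_object(original)

# The subclass instance comes back as an exact str with the same data.
print(type(original).__name__, type(converted).__name__)
print(converted == original)

# An exact str is returned as-is (a new reference to the same object).
exact = "eggs"
print(from_object(exact) is exact)
```

So the operation is a conversion from the subclass to the base type, leaving exact str objects untouched.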
Serhiy Storchaka added the comment: Perhaps we should correct the documentation for PyUnicode_FromEncodedObject() too. "Coercing" doesn't look like the right term for decoding. ---------- title: Wrong documentation for PyUnicode_FromObject() -> Wrong documentation for PyUnicode_FromObject() and PyUnicode_FromEncodedObject()
Kyle Roberts added the comment: I've uploaded a new patch with the obj argument properly marked up. Brian, I think *obj* is what we want based on other examples in that file. It looks like italics are typically used when discussing parameters, since each parameter is italicized in the signature. R. David and Serhiy, I thought about the use of "coercion" some more and I think the current wording is fine. "Type coercion/conversion" is a commonly used phrase in some languages, so I think the phrase is applicable here as well (for a Python example: http://docs.python.org/release/2.5.2/ref/coercion-rules.html). Let me know if you'd still like to see it changed. Thanks. ---------- Added file: http://bugs.python.org/file30086/from_object_v2.patch
Changes by STINNER Victor <victor.stinner@gmail.com>: ---------- nosy: +haypo
R. David Murray added the comment: Well, while 'coercion' does refer to changing from one type to another, and technically we are doing that here, in OO we generally think of subclasses as more-or-less being of the same type as the superclass. So I think it would be clearer to spell out that we are changing the object type to be that of the superclass.
Serhiy Storchaka added the comment: In other languages the word "coercion" is usually used for implicit conversion, e.g. int->long, int->float, float->complex, or str->unicode in Python 2 (that's what PyUnicode_FromObject() does). But the last conversion is not supported in Python 3. The term "coercion" has also been used in Python 2 in the narrow sense (see the __coerce__() method), and in this sense Python 3 does not support "coercion". Therefore, I believe that it is better to avoid the use of this term.
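The distinction drawn above can be illustrated in plain Python 3. This is an illustrative sketch only: numeric operands are still converted implicitly, while bytes are never implicitly converted to str. The helper `concat_fails` is a hypothetical name used just for this example.

```python
# Implicit numeric conversions still happen in Python 3.
int_plus_float = 1 + 2.0        # int is converted to float
float_plus_complex = 1.5 + 2j   # float is converted to complex

print(type(int_plus_float).__name__)
print(type(float_plus_complex).__name__)


def concat_fails(left, right):
    """Return True if left + right raises TypeError."""
    try:
        left + right
    except TypeError:
        return True
    return False


# Unlike Python 2's str -> unicode, Python 3 has no implicit bytes -> str step.
print(concat_fails("text", b"bytes"))
```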
Martin Panter added the comment: Here is a modified patch that avoids “coercion” and is hopefully more explicit. I also fixed the comment in Include/unicodeobject.h. ---------- nosy: +martin.panter stage: needs patch -> patch review versions: +Python 3.5, Python 3.6 -Python 3.3, Python 3.4 Added file: http://bugs.python.org/file42418/from_object_v3.patch
Changes by Martin Panter <vadmium+py@gmail.com>: Removed file: http://bugs.python.org/file42418/from_object_v3.patch
Changes by Martin Panter <vadmium+py@gmail.com>: Added file: http://bugs.python.org/file42419/from_object_v3.patch
Serhiy Storchaka added the comment: Added a comment on Rietveld.
Changes by supriyanto maftuh,st <supriyantomaftuh@gmail.com>: ---------- components: +Build, Tests, Unicode, Windows, XML hgrepos: +341 nosy: +ezio.melotti, paul.moore, steve.dower, supriyanto maftuh, tim.golden, zach.ware
Changes by Brian Curtin <brian@python.org>: ---------- nosy: -brian.curtin
Changes by Berker Peksag <berker.peksag@gmail.com>: ---------- components: -Build, Tests, Unicode, Windows, XML
Changes by Zachary Ware <zachary.ware@gmail.com>: ---------- hgrepos: -341
Martin Panter added the comment: Here is a new version where I use the phrase “true Unicode object”. ---------- Added file: http://bugs.python.org/file42452/from_object_v4.patch
Serhiy Storchaka added the comment: LGTM. Thank you Martin for this improvement. ---------- assignee: docs@python -> martin.panter stage: patch review -> commit review
STINNER Victor added the comment: from_object_v4.patch LGTM, nice enhancement.
Roundup Robot added the comment: New changeset af655e73f7bd by Martin Panter in branch '3.5': Issue #15984: Correct PyUnicode_FromObject() and _FromEncodedObject() docs https://hg.python.org/cpython/rev/af655e73f7bd New changeset 570ada02d0f0 by Martin Panter in branch 'default': Issue #15984: Merge PyUnicode doc from 3.5 https://hg.python.org/cpython/rev/570ada02d0f0 ---------- nosy: +python-dev
Martin Panter added the comment: I also tweaked the PyUnicode_FromEncodedObject() documentation to avoid the word “coerce” and to fix up outdated stuff. ---------- resolution: -> fixed stage: commit review -> resolved status: open -> closed
participants (11)
- Berker Peksag
- Brian Curtin
- Kyle Roberts
- Martin Panter
- R. David Murray
- Roundup Robot
- Serhiy Storchaka
- STINNER Victor
- supriyanto maftuh,st
- Terry J. Reedy
- Zachary Ware