[issue10942] xml.etree.ElementTree.tostring returns type bytes, expected type str
New submission from J_Tom_Moon_79 <jtm.moon.forum.user+python@gmail.com>:

The method xml.etree.ElementTree.tostring returns type bytes. The documentation reads
"""Returns an encoded string containing the XML data."""
(from http://docs.python.org/py3k/library/xml.etree.elementtree.html#xml.etree.Ele... as of 2011-01-18)

=======================================================
Here is a test program:
-------------------------------------------------------
#!/usr/bin/python
# created for python 3.1
import sys
print(sys.version)  # for help verifying version tested
from xml.etree import ElementTree
sampleinput = """<?xml version="1.0"?><Hello></Hello>"""
xmlobj = ElementTree.fromstring(sampleinput)
type(xmlobj)
xmlstr = ElementTree.tostring(xmlobj,'utf-8')
print("xmlstr value is '", xmlstr, "'", sep="")
print("xmlstr type is '", type(xmlstr), "'", sep="")
-------------------------------------------------------
test program output:
-------------------------------------------------------
3.1.3 (r313:86834, Nov 27 2010, 18:30:53) [MSC v.1500 32 bit (Intel)]
xmlstr value is 'b'<Hello />''
xmlstr type is '<class 'bytes'>'
=======================================================

The cheap "fix" for this bug may simply be a change in documentation. However, a method called "tostring" really should return something nearer to the built-in str.

----------
assignee: docs@python
components: Documentation, XML
messages: 126506
nosy: JTMoon79, docs@python
priority: normal
severity: normal
status: open
title: xml.etree.ElementTree.tostring returns type bytes, expected type str
type: behavior
versions: Python 3.1

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue10942>
_______________________________________
J_Tom_Moon_79 <jtm.moon.forum.user+python@gmail.com> added the comment:

Some other bugs affecting the tostring method (for consideration by the reviewer):
http://bugs.python.org/issue6233#msg89718
http://bugs.python.org/msg101037
http://bugs.python.org/issue9692

----------
R. David Murray <rdmurray@bitdance.com> added the comment:

This is indeed a doc problem, although there was some discussion of working toward a method rename. See issue 8047 (but be prepared to read a novel to understand why tostring returns bytes...)

The doc for 3.2 is slightly clearer, but both 3.1 and 3.2 could be made clearer by referring to an 'encoded byte string' rather than just an 'encoded string'. (An encoded string has to be a byte string, but that isn't obvious unless you've dealt with encode/decode a bunch.)

Technically this could be closed as a duplicate of issue 8047, since that issue proposes that the API fix (which would include the doc change) be backported to 3.1. But no one has proposed a patch there, so at a minimum the 3.1 docs should be clarified.

----------
nosy: +r.david.murray
stage: -> needs patch
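To illustrate the parenthetical remark about encode/decode, here is a minimal sketch using only built-in types. The variable names are illustrative and do not come from the thread.

```python
# A str holds Unicode text; encoding it always produces bytes.
text = "<Hello />"
encoded = text.encode("utf-8")     # str -> bytes

# Decoding with the same codec turns the bytes back into str.
decoded = encoded.decode("utf-8")  # bytes -> str

print(type(text).__name__, type(encoded).__name__, type(decoded).__name__)
```

In this sense an "encoded string" is necessarily a bytes object in Python 3, which is the wording the comment suggests for the docs.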
Gunnar Eikman <gunnar.eikman@gmail.com> added the comment:

I moved a working script from Ubuntu (Python 3.1.2) to Windows (Python 3.2.3) today. Had to revise script.

The tostring method returns a string on Linux (contradicts this issue), but bytes on Windows (as described in this issue)... I used tostring with a single argument "tostring(theXml)".

Is there an explanation for this? I am not an advanced Python hacker...

Be careful when moving from one environment to another!

----------
nosy: +Gunnar.Eikman
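The thread does not answer this question. One possible reading, offered here only as an assumption, is that the difference comes from the Python version (3.1.2 versus 3.2.3) rather than from the operating system. Either way, a script that must run in both environments could normalize the result. The helper below is a hypothetical sketch, not part of ElementTree; the name `tostring_text` and its default encoding are illustrative choices.

```python
from xml.etree import ElementTree


def tostring_text(element, encoding="utf-8"):
    """Hypothetical helper: serialize an element and always return str."""
    data = ElementTree.tostring(element, encoding)
    # Decode only when the library handed back bytes.
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


element = ElementTree.fromstring("<Hello></Hello>")
result = tostring_text(element)
print(type(result).__name__)
```

The `isinstance` check means the helper does not depend on which type a given interpreter returns.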
Serhiy Storchaka added the comment:

The documentation now explains the resulting type of tostring().

https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.Eleme...
"""
Generates a string representation of an XML element, including all subelements. element is an Element instance. encoding [1] is the output encoding (default is US-ASCII). Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated). method is either "xml", "html" or "text" (default is "xml").
"""

It looks like this issue can be closed.

----------
nosy: +serhiy.storchaka
status: open -> pending
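The documented behaviour quoted above can be checked with a short sketch, assuming a Python 3 version whose tostring() accepts encoding="unicode" as the quoted docs describe. The variable names are illustrative.

```python
from xml.etree import ElementTree

element = ElementTree.fromstring("<Hello></Hello>")

# Default encoding (US-ASCII): the result is a bytestring.
as_bytes = ElementTree.tostring(element)

# encoding="unicode": the result is a str.
as_text = ElementTree.tostring(element, encoding="unicode")

print(type(as_bytes).__name__, type(as_text).__name__)
```

Passing encoding="unicode" gives the str result the original report expected from a function named tostring.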
Changes by Serhiy Storchaka <storchaka@gmail.com>:

----------
resolution: -> out of date
stage: needs patch -> resolved
status: pending -> closed
participants (4)
- Gunnar Eikman
- J_Tom_Moon_79
- R. David Murray
- Serhiy Storchaka