ElementTree.XML(string XML) and ElementTree.fromstring(string XML) not working

Kee Nethery kee at kagi.com
Fri Jun 26 15:56:24 EDT 2009


First, thanks to everyone who responded. Figured I'd test all the  
suggestions and provide a response to the list. Here goes ...

On Jun 25, 2009, at 7:38 PM, Nobody wrote:

> Why do you need an ElementTree rather than an Element? XML(string) returns
> the root element, as if you had used et.parse(f).getroot(). You can turn
> this into an ElementTree with e.g. et.ElementTree(XML(string)).

I tried this:
    et.ElementTree(XML(theXmlData))
and it did not work.

I had to modify it to this to get it to work:
    et.ElementTree(et.XML(theXmlData))
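
For anyone following along, the first form most likely failed because only the module alias was in scope, so the bare name XML was undefined. A minimal sketch, assuming the module was imported as "et" and using a made-up sample document in place of the posted form data:

```python
import xml.etree.ElementTree as et

# Hypothetical sample data standing in for the posted form field.
theXmlData = "<order><purchase id='1'/></order>"

# With only the alias imported, the bare name XML is not defined.
try:
    XML(theXmlData)
    bare_name_worked = True
except NameError:
    bare_name_worked = False

# Qualifying the call through the alias works: et.XML() returns the root
# Element, and et.ElementTree() wraps that Element in a tree object.
theXmlDataTree = et.ElementTree(et.XML(theXmlData))
root_tag = theXmlDataTree.getroot().tag
```

The alternative would be "from xml.etree.ElementTree import XML", after which the bare name resolves.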


>> formPostData = cgi.FieldStorage()
>> theXmlData = formPostData['theXml'].value
>> theXmlDataTree = et.parse(makeThisUnicodeStringLookLikeAFileSoParseWillDealWithIt(theXmlData))
>
> If you want to treat a string as a file, use StringIO.

I tried this:
    import StringIO
    theXmlDataTree = et.parse(StringIO.StringIO(theXmlData))
    orderXml = theXmlDataTree.findall('purchase')

and it did work. StringIO converts the string into what looks like a  
file so parse can process it as a file. Cool.
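
A self-contained sketch of the same idea, written for Python 3, where the StringIO class lives in the io module rather than in a separate StringIO module. The sample XML and the "sku" attribute are made up for illustration:

```python
import io
import xml.etree.ElementTree as et

# Hypothetical sample data; the real string came from the CGI form.
theXmlData = "<order><purchase sku='A'/><purchase sku='B'/></order>"

# io.StringIO wraps the text in an in-memory object with a read() method,
# which is all that parse() needs from a "file".
theXmlDataTree = et.parse(io.StringIO(theXmlData))

# findall() on the tree searches the direct children of the root element.
orderXml = theXmlDataTree.findall('purchase')
skus = [purchase.get('sku') for purchase in orderXml]
```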

On Jun 25, 2009, at 7:47 PM, unayok wrote:

> I'm not sure what you're expecting.  It looks to me like things are
> working okay:
>
> My test script:
>
> [snip]

I agree your code works.

I tried:
    theXmlDataTree = et.fromstring(theXmlData)
    orderXml = theXmlDataTree.findall('purchase')

When I modified mine to programmatically look inside using the "for  
element in theXmlDataTree" I was able to see the contents. The  
debugger I am using does not offer me a window into the ElementTree  
data and that was part of the problem. So yes, et.fromstring is  
working correctly. It helps to have someone show me the missing step  
needed to confirm the code works and the IDE does not.
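
For reference, a small sketch of checking the parsed data without relying on a debugger view. The sample document is an assumption, and the code targets Python 3:

```python
import xml.etree.ElementTree as et

# Hypothetical sample data.
theXmlData = "<order><purchase sku='A'>widget</purchase></order>"
theXmlDataTree = et.fromstring(theXmlData)

# Iterating over an Element yields its direct children.
children = [(element.tag, element.attrib, element.text)
            for element in theXmlDataTree]

# tostring() serializes the element back to markup, which is handy for a
# quick look at everything the parser actually captured.
dump = et.tostring(theXmlDataTree, encoding='unicode')

print(children)
print(dump)
```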



On Jun 25, 2009, at 8:04 PM, Carl Banks wrote:
> I believe you are misunderstanding something.  et.XML and
> et.fromstring return Elements, whereas et.parse returns an
> ElementTree.  These are two different things; however, both of them
> "contain all the XML".  In fact, an ElementTree (which is returned by
> et.parse) is just a container for the root Element (returned by
> et.fromstring)--and it adds no important functionality to the root
> Element as far as I can tell.

Thank you for explaining the difference. I absolutely was  
misunderstanding this.

> Given an Element (as returned by et.XML or et.fromstring) you can pass
> it to the ElementTree constructor to get an ElementTree instance.  The
> following line should give you something you can "play with":
>
> theXmlDataTree = et.ElementTree(et.fromstring(theXmlData))

Yes this works.
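
To make the distinction concrete, here is a short sketch (Python 3, made-up sample document) showing which call returns which type and that wrapping an Element does not copy it:

```python
import io
import xml.etree.ElementTree as et

# Hypothetical sample data.
theXmlData = "<order><purchase/></order>"

root = et.fromstring(theXmlData)                     # an Element
tree_from_file = et.parse(io.StringIO(theXmlData))   # an ElementTree
tree_from_root = et.ElementTree(root)                # an ElementTree around root

is_element = isinstance(root, et.Element)
is_tree = isinstance(tree_from_file, et.ElementTree)

# The wrapper holds a reference to the very same root Element.
same_root = tree_from_root.getroot() is root
```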



On Jun 25, 2009, at 11:39 PM, Stefan Behnel wrote:

> If you want to parse a document from a file or file-like object, use
> parse(). Three use cases, three functions. The fourth use case of parsing a
> document from a string does not have its own function, because it is
> trivial to write
>
> 	tree = parse(BytesIO(some_byte_string))

:-) Trivial for someone familiar with the language. For a newbie like  
me, that step was non-obvious.

> If what you meant is actually parsing from a byte string, this is easily
> done using BytesIO(), or StringIO() in Py2.x (x<6).

Yes, thanks! Looks like BytesIO is an enhancement in newer versions (2.6
and 3.x). Looks like StringIO does what I need, since all I'm doing is
pulling the unicode string into et.parse. Am guessing that either would
work equally well.
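
A minimal sketch of the byte-string case Stefan describes, written for Python 3, where BytesIO lives in the io module. The document below is made up; it carries an encoding declaration and one non-ASCII character so the decoding step is visible:

```python
import io
import xml.etree.ElementTree as et

# Hypothetical UTF-8 encoded document with an encoding declaration.
some_byte_string = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<order><name>Zo\u00eb</name></order>'
).encode('utf-8')

# BytesIO presents the bytes as a binary file; the parser reads the
# declaration and decodes the content itself.
tree = et.parse(io.BytesIO(some_byte_string))
name = tree.getroot().findtext('name')
```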


>> theXmlDataTree = et.parse(makeThisUnicodeStringLookLikeAFileSoParseWillDealWithIt(theXmlData))
>
> This will not work because ET cannot parse from unicode strings (unless
> they only contain plain ASCII characters and you happen to be using Python
> 2.x). lxml can parse from unicode strings, but it requires that the XML
> must not have an encoding declaration (which would render it non
> well-formed). This is convenient for parsing HTML, it's less convenient for
> XML usually.

Right. For my example, since et.parse itself takes no encoding argument,
I believe I can encode the unicode string to UTF-8 bytes before handing
it over:
    theXmlDataTree = et.parse(StringIO.StringIO(theXmlData.encode('utf-8')))


Again, as a newbie, thanks to everyone who took the time to respond.  
Very helpful.
Kee


