Hello list,
I'm using lxml to parse both html and xhtml files and it works quite well.
But I've noticed a difference in the output of html.tostring() depending on the method argument.
I have a meta tag that looks like this:
<meta name="description" content="Test ö" />
I parse it with the HTML parser and then try to output it using tostring():
html.tostring(tag, encoding="unicode", method='xml')
which results in:
<meta name="description" content="Test ö" />
but using the html method I get this:
html.tostring(tag, encoding="unicode", method='html')
which results in:
<meta name="description" content="Test ö" >
Why does the xml method start converting characters to entities and numeric codes? How can I make method="xml" output my Unicode characters unchanged?
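In case it helps, here is a minimal sketch that should reproduce the comparison. It assumes Python 3, and the surrounding html/head/body wrapper and the variable names are made up for illustration; only the meta tag comes from my real files. The exact escaping you see may depend on the lxml and libxml2 versions installed, so I have left the expected output out.

```python
from lxml import html

# Hypothetical wrapper document; only the <meta> tag is from the real files.
source = (
    '<html><head>'
    '<meta name="description" content="Test \u00f6" />'
    '</head><body></body></html>'
)

# Parse with the HTML parser and pick out the meta element.
doc = html.document_fromstring(source)
tag = doc.find(".//meta")

# Serialize the same element with both methods for a side-by-side comparison.
as_xml = html.tostring(tag, encoding="unicode", method="xml")
as_html = html.tostring(tag, encoding="unicode", method="html")

print(as_xml)   # self-closing form; check how the non-ASCII character comes out
print(as_html)  # HTML form, without the trailing slash
```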
Cheers,
Henrik
BTW, I can't use method="html" because the tag loses its closing slash (/).