Nov. 29, 2007
6:41 p.m.
Frederik Elwert napisaĆ:
No, I think the better way would be to parse it, look for the encoding (either by looking at <tree>.docinfo.encoding or looking for the meta-Tag with find()), and then reparse the unaltered document, now using the "encoding" keyword. This is what Stefan suggests: http://article.gmane.org/gmane.comp.python.lxml.devel/3001/
Hi, thanks for suggestion. But how can I pass the "encoding" keyword? Neither etree.parse nor etree.HTMLParser supports it.