Parsing unicode (devanagari) text with xml.dom.minidom
stefan_ml at behnel.de
Sun Mar 8 10:38:40 CET 2009
Martin v. Löwis wrote:
>> Regarding minidom, you might be happier with the xml.etree package that
>> comes with Python 2.5 and later (it's also available for older versions).
>> It's a lot easier to use, more memory friendly and also much faster.
> OTOH, choice of XML library is completely irrelevant for the issue at
For the described problem, maybe. But certainly not for the application.
The background was parsing the XML dump of an entire web site, which I
would expect to be larger than what minidom is designed to handle
gracefully. Switching to cElementTree before much code has been written is
almost certainly a good idea here.
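
To illustrate the kind of switch I mean, here is a minimal sketch of incremental parsing with ElementTree's iterparse(), which avoids holding the whole dump's content in memory the way a minidom tree would. The sample document and the element names "site", "page" and "title" are assumptions made up for the example, not taken from the original poster's data. On Python 2.5 to 2.7 the faster module is xml.etree.cElementTree; on current Python 3 the plain xml.etree.ElementTree import below already uses the C accelerator when it is available.

```python
import io
import xml.etree.ElementTree as ET  # xml.etree.cElementTree on Python 2.5-2.7

# Hypothetical stand-in for the site dump; the element names are assumptions.
SAMPLE = u"""<?xml version="1.0" encoding="UTF-8"?>
<site>
  <page><title>नमस्ते</title></page>
  <page><title>हिन्दी</title></page>
</site>""".encode("utf-8")


def iter_titles(source):
    """Yield the title of each <page>, discarding its content once handled."""
    # "end" events fire when an element's closing tag has been read,
    # so the element's children are complete at that point.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "page":
            yield elem.findtext("title")
            # Drop the children and text of the finished page so they
            # can be garbage-collected before the next page is read.
            elem.clear()


# iterparse() accepts a file name or a binary file object; the parser
# decodes the bytes according to the XML declaration.
titles = list(iter_titles(io.BytesIO(SAMPLE)))
```

The titles come back as ordinary unicode strings, so Devanagari text needs no special treatment as long as the file's declared encoding matches its actual bytes.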