Ignoring XML Namespaces with ElementTree

Pete news at redlamb.net
Thu Dec 3 16:39:45 EST 2009


On Dec 3, 2:55 pm, Stefan Behnel <stefan... at behnel.de> wrote:
> Pete, 03.12.2009 19:21:
>
> > Is there any way to configure ElementTree to ignore the XML namespace?
> > For the past couple months, I've been using minidom to parse an XML
> > file that is generated by a unit within my organization that can't
> > stick with a standard. This hasn't been a problem until recently when
> > the script was provided a 30MB file that once parsed, increased the
> > python memory footprint by 1.0GB and now I'm running into Memory
> > Errors. Based on Google searches and testing it looks like ElementTree
> > is much more efficient with memory and I'd like to switch,
>
> Make sure you use cElementTree, then that's certainly the right choice to make.
>
> > however I'd
> > like to be able to ignore the namespaces. These XML files tend to
> > randomly switch the namespace for no reason and ignoring these
> > namespaces would help the script adapt to the changes. Any help on
> > this would be greatly appreciated. I'm having a hard time finding the
> > answer.
>
> ET uses namespace URIs as part of the tag name, so if you want to ignore
> namespaces, just strip the leading "{...}" (if any) from the tag and work
> with the rest (so-called "local name").
>
> > Additionally, anyone know how ElementTree handles XML elements that
> > include Unicode?
>
> It's an XML parser, so the answer is: without any difficulties.
>
> Stefan

Perfect... I can work with that. Thanks.
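
For anyone finding this thread later, Stefan's suggestion of stripping the
leading "{...}" from each tag might look roughly like the sketch below. The
helper names local_name and strip_namespaces and the two sample documents are
made up for illustration; they are not part of ElementTree. On current Python
versions, xml.etree.ElementTree picks up the C accelerator on its own, so no
separate cElementTree import is assumed here.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag without a leading '{namespace-uri}' part, if any."""
    if isinstance(tag, str) and tag.startswith("{"):
        return tag.split("}", 1)[1]
    return tag


def strip_namespaces(root):
    """Rewrite every element tag in the tree, in place, to its local name."""
    for elem in root.iter():
        elem.tag = local_name(elem.tag)
    return root


# Hypothetical inputs: same structure, but the namespace changes between files.
doc_a = '<r:report xmlns:r="http://example.com/v1"><r:item>one</r:item></r:report>'
doc_b = '<report xmlns="http://example.com/v2"><item>two</item></report>'

items = []
for text in (doc_a, doc_b):
    root = strip_namespaces(ET.fromstring(text))
    # After stripping, plain tag names match regardless of the original namespace.
    items.extend(elem.text for elem in root.findall("item"))

print(items)
```

This only rewrites element tags. Namespaced attribute names keep their
"{...}" prefix and would need the same treatment if the script depends on
them.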


