[BangPypers] parsing xml

Fri Jul 29 04:38:40 CEST 2011

On Fri, Jul 29, 2011 at 1:23 AM, Gora Mohanty <gora at mimirtech.com> wrote:

> On Thu, Jul 28, 2011 at 10:37 PM, Venkatraman S <venkat83 at gmail.com> wrote:
> > parsing using minidom is one of the slowest. if you just want to extract
> > the distance and assuming that it (the tag) will always be consistent,
> > then i would always suggest regexp. xml parsing is a pain.
> [...]
>
> Strongly disagree. IMHO, regexps are the wrong solution
> for parsing XML (or, any kind of well-structured text), as
> they end up becoming intolerably complex, and do not
> degrade gracefully for broken XML.
>
> Have not compared speeds myself, but there are blogs
> that go into that. In my experience, the cleanest, most
> efficient, and richest-in-features Python XML library is
> lxml. For people used to BeautifulSoup, lxml has a
> BeautifulSoup parser, and is significantly more efficient.
>
>
If it's a question of the fastest gun around, it must be cElementTree;
please refer to the benchmark table towards the bottom of the page below.
Caveat: the page belongs to effbot, who wrote the package.

http://effbot.org/zone/celementtree.htm
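
For anyone following the thread, here is a minimal sketch of pulling a
distance out of XML with cElementTree. The XML layout, the tag names
("route", "leg", "distance", "value") and the helper name
extract_distances are assumptions made up for illustration; the original
poster's document may look different.

```python
# Sketch only: the sample XML and its tag names are hypothetical.
try:
    # The C implementation discussed in this thread (Python 2.5 to 3.8).
    import xml.etree.cElementTree as ET
except ImportError:
    # Newer Pythons dropped the separate module; ElementTree itself
    # picks up the C accelerator when it is available.
    import xml.etree.ElementTree as ET

SAMPLE = """<route>
  <leg><distance><value>1200</value><text>1.2 km</text></distance></leg>
  <leg><distance><value>800</value><text>0.8 km</text></distance></leg>
</route>"""


def extract_distances(xml_text):
    """Return the integer <value> found under every <distance> element."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed XML
    return [int(d.findtext("value")) for d in root.iter("distance")]


distances = extract_distances(SAMPLE)
print(distances)
```

Unlike a regexp, this fails loudly with ET.ParseError when the XML is
broken, rather than silently matching the wrong text.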

> Regards,
> Gora

-- 
Ramdas S
+91 9342 583 065