[BangPypers] parsing xml

Fri Jul 29 07:26:15 CEST 2011

On Fri, Jul 29, 2011 at 10:47 AM, Anand Chitipothu <anandology at gmail.com> wrote:

> 2011/7/28 Venkatraman S <venkat83 at gmail.com>:
> > parsing using minidom is one of the slowest. if you just want to
> > extract the distance and assuming that it(the tag) will always be
> > consistent, then i would always suggest regexp. xml parsing is a pain.
>
> regexp is a bad solution to parse xml.
>
> minidom is the fastest solution if you consider the programmer time
> instead of developer time.  Minidom is available in standard library,
> you don't have to add another dependency and worry about PyPI
> downtimes and lxml compilations failures.
>
> I don't think there will be significant performance difference between
> regexp and minidom unless you are doing it a million times.
>
>
Well, I clearly stated my assumption: you treat the XML as a plain
'string' and do not want to retrieve anything else from it in a
'structured manner'. I am a speed maniac, so if that assumption holds,
I can vouch that regexp is the faster and neater solution. I ran some
speed experiments on this in the past (I do not have the results handy),
and that is what I found.
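
A comparison like this could be sketched roughly as follows. This is not
the original experiment: the sample XML, the <distance> tag layout and the
function names are all made up for illustration, and the regexp only works
under the assumption above (a plain, attribute-free tag that always looks
the same). No timings are quoted here; run it to see the numbers on your
own machine.

```python
import re
import timeit
from xml.dom import minidom

# Hypothetical sample document; the XML from the original question is
# not shown in this message, so this layout is an assumption.
SAMPLE = "<route><leg><distance>42.5</distance></leg></route>"

# Only valid while the tag has no attributes and never changes shape.
DISTANCE_RE = re.compile(r"<distance>([^<]*)</distance>")


def distance_regexp(xml_text):
    """Treat the XML as a plain string and pull out the first distance."""
    match = DISTANCE_RE.search(xml_text)
    return match.group(1) if match else None


def distance_minidom(xml_text):
    """Parse the whole document and read the first <distance> element."""
    dom = minidom.parseString(xml_text)
    nodes = dom.getElementsByTagName("distance")
    if not nodes or nodes[0].firstChild is None:
        return None
    return nodes[0].firstChild.data


if __name__ == "__main__":
    # Time 10000 extractions with each approach and print the result.
    for fn in (distance_regexp, distance_minidom):
        elapsed = timeit.timeit(lambda: fn(SAMPLE), number=10000)
        print(fn.__name__, fn(SAMPLE), "%.4f s" % elapsed)
```

Both functions return the text inside the first <distance> element, or
None when it is missing, so the timing compares like with like.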

XP asks you to implement the best solution with the least effort, and I
think in this case regexp is the winner. Opinions may vary, though.