[BangPypers] parsing xml

Fri Jul 29 08:04:17 CEST 2011

On Fri, Jul 29, 2011 at 11:15 AM, Anand Chitipothu <anandology at gmail.com> wrote:

> 2011/7/29 Venkatraman S <venkat83 at gmail.com>:
> > On Fri, Jul 29, 2011 at 10:47 AM, Anand Chitipothu <anandology at gmail.com> wrote:
> >
> >> 2011/7/28 Venkatraman S <venkat83 at gmail.com>:
> >> > parsing using minidom is one of the slowest. if you just want to extract
> >> > the distance and assuming that it (the tag) will always be consistent, then
> >> > i would always suggest regexp. xml parsing is a pain.
> >>
> >> regexp is a bad solution to parse xml.
> >>
> >> minidom is the fastest solution if you consider the programmer time
> >> instead of execution time.  Minidom is available in the standard library,
> >> you don't have to add another dependency and worry about PyPI
> >> downtimes and lxml compilation failures.
> >>
> >> I don't think there will be a significant performance difference between
> >> regexp and minidom unless you are doing it a million times.
> >>
> >>
> > Well, i have clearly mentioned my assumptions - i.e., when you treat the XML
> > as a 'string' and do not want to retrieve anything else in a 'structured
> > manner'. I am a speed-maniac and crave speed; so if the assumption is valid,
> > i can vouch for the fact that regexp would be a faster and neater solution.
> > I have done some speed experiments on this in the past (results of which i
> > do not have handy), and this is what i found.
> >
> > XP asks you to implement the best solution with the least effort and i think
> > in this case regexp is a winner. Thoughts can vary though.
>
> regexp can at best be a dirty hack, not the best solution for xml parsing.
>
>
read again: i am not actually working on 'xml' (see my assumption?).
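
To make the two positions concrete, here is a minimal sketch of both
approaches side by side. The sample document, the function names and the
exact shape of the <distance> tag are assumptions made up for illustration;
the real feed discussed in this thread is not shown here. The regexp version
only holds up under the stated assumption that the tag is always written the
same way (no attributes, no nesting, no CDATA).

```python
import re
from xml.dom import minidom

# Hypothetical sample input; the actual document is not part of the thread.
SAMPLE = "<route><leg><distance>12.5</distance></leg></route>"


def distance_with_minidom(xml_text):
    """Parse the whole document and read the first <distance> element."""
    dom = minidom.parseString(xml_text)
    node = dom.getElementsByTagName("distance")[0]
    return float(node.firstChild.data)


# Assumption: the tag never carries attributes and never nests.
DISTANCE_RE = re.compile(r"<distance>\s*([^<]+?)\s*</distance>")


def distance_with_regexp(xml_text):
    """Treat the input as a plain string and pull out the tag body."""
    match = DISTANCE_RE.search(xml_text)
    if match is None:
        raise ValueError("no <distance> tag found")
    return float(match.group(1))
```

Both functions give the same answer on SAMPLE. If the producer ever emits
something like <distance unit="km">, the minidom version keeps working while
the regexp version raises ValueError, which is roughly the trade-off being
argued over above.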