[BangPypers] parsing xml

Sun Jul 31 21:13:26 CEST 2011

Dhananjay Nene <dhananjay.nene at gmail.com> writes:

[...]

> re.search("<distance>\s*(\d+)\s*</distance>",data).group(1)
>
> would appear to be the most succinct and quite fast. Adjust for whitespace
> as and if necessary.

You'd have to adjust for whitespace (including newlines), mixed case, etc.
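A rough sketch of what that adjustment might look like. The sample data
and the find_distance helper are made up for illustration, since the
real document isn't shown in this thread. \s already matches newlines,
and re.IGNORECASE takes care of mixed-case tags:

```python
import re

# Hypothetical sample input: the tag is mixed case and the value is
# surrounded by newlines and indentation.
data = "<route>\n  <Distance>\n    1234\n  </Distance>\n</route>"

# \s* swallows any whitespace (newlines included) around the digits;
# re.IGNORECASE lets <distance>, <Distance> and <DISTANCE> all match.
DISTANCE_RE = re.compile(r"<distance>\s*(\d+)\s*</distance>", re.IGNORECASE)


def find_distance(text):
    """Return the first distance value as an int, or None if absent."""
    match = DISTANCE_RE.search(text)
    return int(match.group(1)) if match else None


distance = find_distance(data)
```

It still breaks on attributes, comments, CDATA and so on, which is the
usual problem with treating XML as a flat string.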

[...]

> As far as optimisation goes - I can see at least 3 options
>
> a. the minidom performance is acceptable - no further optimisation required
> b. minidom performance is not acceptable - try the regex one
> c. python library performance is not acceptable - switch to 'c'

I'd switch b and c. If ElementTree is not fast enough, I'd switch to
cElementTree and if that's not fast enough, I'd try some hand parsing.
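Something along these lines, as a sketch only. The sample data, the
nesting of the distance element and the find_distance helper are my
assumptions, not the real input. cElementTree exposes the same API as
ElementTree, so the switch is just the import (recent Pythons have
dropped the separate cElementTree module and use the C implementation
automatically, hence the fallback):

```python
try:
    # C implementation, available as a separate module on older Pythons.
    import xml.etree.cElementTree as ET
except ImportError:
    # Newer Pythons: the plain module already uses the C accelerator.
    import xml.etree.ElementTree as ET

# Hypothetical sample input; the real document is not shown in the thread.
data = "<route><leg><distance>\n  1234\n</distance></leg></route>"


def find_distance(xml_text):
    """Return the first <distance> value as an int, or None if absent."""
    root = ET.fromstring(xml_text)
    # ".//distance" finds the first matching descendant at any depth.
    node = root.find(".//distance")
    if node is None or node.text is None:
        return None
    return int(node.text.strip())


distance = find_distance(data)
```

The calling code doesn't change when you swap the import, which is why
I'd try this before reaching for regexes.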

> I can imagine people starting with a and then deciding to move along
> the path a->b->c if and as necessary.  I believe starting with b risks
> obfuscating code (imo regex is obfuscated compared to xml nodes -
> YMMV)

As someone who messed with Perl for a long time, I can attest to the
power and unmaintainability of regexes. I stay away from them unless I
really need them. But yes, people like Larry Wall seem to think in a
fundamentally different way so YMMV.

[...]

-- 
~noufal
http://nibrahim.net.in

I tripped over a hole that was sticking up out of the ground.