[BangPypers] parsing xml

Dhananjay Nene dhananjay.nene at gmail.com
Mon Aug 1 09:48:31 CEST 2011


On Mon, Aug 1, 2011 at 12:43 AM, Noufal Ibrahim <noufal at gmail.com> wrote:

> Dhananjay Nene <dhananjay.nene at gmail.com> writes:
>
>
> [...]
>
> > re.search(r"<distance>\s*(\d+)\s*</distance>", data).group(1)
> >
> > would appear to be the most succinct and quite fast. Adjust for whitespace
> > as and if necessary.
>
> Whitespace (including newlines), mixed cases etc.
>
Actually, newlines are already handled by the regex above, since \s matches
them (so I am no longer sure why I even mentioned it). XML that follows the
spec is case-sensitive, so element names will not appear in mixed case.
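
To illustrate the newline point, here is a minimal sketch. The sample document
is made up for this example (the real data is not shown in this thread), and
the names `data` and `distance` are only placeholders.

```python
import re

# Hypothetical sample; whitespace and newlines surround the number on purpose.
data = """<route>
  <distance>
    42
  </distance>
</route>"""

# \s matches spaces, tabs and newlines, so the layout above is not a problem.
match = re.search(r"<distance>\s*(\d+)\s*</distance>", data)

# re.search returns None when nothing matches, so guard before calling group().
distance = int(match.group(1)) if match else None
```

The guard matters because calling .group(1) directly on a failed search raises
AttributeError. Note that the pattern only accepts an unsigned integer; a
decimal or negative value would need a different character class.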


> [...]
>
> > As far as optimisation goes - I can see at least 3 options
> >
> > a. the minidom performance is acceptable - no further optimisation required
> > b. minidom performance is not acceptable - try the regex one
> > c. python library performance is not acceptable - switch to C
>
> I'd switch b and c. If ElementTree is not fast enough, I'd switch to
> cElementTree and if that's not fast enough, I'd try some hand parsing.
>
+1
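
For reference, a rough sketch of that ordering, assuming the same kind of
made-up <distance> document as before (the element names and the `data`
variable are illustrative, not taken from the original problem). It tries the
C implementation first and falls back to the pure Python module name.

```python
try:
    # C-accelerated module on older Pythons (removed in Python 3.9).
    import xml.etree.cElementTree as ET
except ImportError:
    # Since Python 3.3 this module picks up the C accelerator on its own.
    import xml.etree.ElementTree as ET

# Hypothetical sample document.
data = "<route><distance> 42 </distance></route>"

root = ET.fromstring(data)

# ".//distance" looks for the element at any depth below the root.
text = root.findtext(".//distance")
distance = int(text.strip()) if text is not None else None
```

Both modules expose the same API, so the rest of the code does not change when
the import does. Only if this is still too slow would I fall back to the regex
or other hand parsing.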

Dhananjay

