adding the XML to 2.0 to be a mistake?
Martin von Loewis
loewis at informatik.hu-berlin.de
Fri Jan 19 08:19:17 EST 2001
rjroy at takingcontrol.com (Robert Roy) writes:
> I agree with what you are saying. Another aspect that concerns me is
> that with the addition of the XML tools, xmllib is now deprecated. The
> recommended alternative, SAX, does not offer the level of control that
> xmllib does.
I don't know why it is deprecated. I'm in favour of un-deprecating it.
> Undeclared entities are a problem in SAX but can be handled cleanly
> using the unknown_entityref mechanism in xmllib.
They are not that much of a problem - there is the skippedEntity
routine.
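
As a rough sketch of what that looks like, assuming a current Python 3
with the stock expat-based SAX driver (the class name, helper function
and sample document below are made up for illustration):

```python
import xml.sax


class SkippedEntityCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records entity references the parser skipped."""

    def __init__(self):
        super().__init__()
        self.skipped = []

    def skippedEntity(self, name):
        # Called by the driver for a reference it could not expand.
        self.skipped.append(name)


def find_skipped_entities(xml_bytes):
    """Parse xml_bytes and return the names of all skipped entities."""
    handler = SkippedEntityCollector()
    xml.sax.parseString(xml_bytes, handler)
    return handler.skipped


# Assumption: the document names an external DTD subset that the parser
# does not read, so an undeclared reference is skipped rather than fatal.
SAMPLE = b'<!DOCTYPE doc SYSTEM "doc.dtd"><doc>a &undeclared; b</doc>'

if __name__ == "__main__":
    print(find_skipped_entities(SAMPLE))
```

Note that this only helps when the document has a DOCTYPE pointing at an
external subset; without one, an undeclared entity is a well-formedness
error and the parser stops instead of calling skippedEntity.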
> SAX, as it now stands, handles marked CDATA sections as part of the
> data stream. This may be fine for most uses but may be undesirable
> when you are using XML as a container wrapping sections of HTML, etc.
>
> SAX swallows comments. Maybe I need to pass the comments through, e.g.
> if doing a translation.
Agreed; there are possible applications for that.
> I believe that in light of its unique capabilities, xmllib deserves a
> permanent place in the Python library. To improve performance (in
> sgmllib too), Fredrik's sgmlop extension should become part of the
> standard distribution as well.
It seems to me that sgmlop has many problems: it doesn't work properly
(try running the PyXML test suites with only an sgmlop driver); and it
does not support Unicode.
> If at some point (py)expat evolves to the point where it can do
> everything that xmllib does (allow access to ALL entity refs,
> identification of marked sections, comments ...), then it could
> become the underlying library.
Today, pyexpat allows access to all entity refs, it reports the
start and the end of CDATA sections, and it reports comments. So while
I can follow the problems you have with SAX, I can't see why you
complain about pyexpat.
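
To illustrate the CDATA and comment reporting, here is a minimal sketch
using the handler attributes of xml.parsers.expat as found in a current
Python 3 (the collect_events helper and the sample input are
hypothetical, not part of any library):

```python
import xml.parsers.expat


def collect_events(xml_text):
    """Return a list of (kind, ...) tuples for comments, CDATA and text."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    # Merge adjacent character data so text arrives in one piece.
    parser.buffer_text = True
    parser.StartCdataSectionHandler = lambda: events.append(("cdata-start",))
    parser.EndCdataSectionHandler = lambda: events.append(("cdata-end",))
    parser.CommentHandler = lambda data: events.append(("comment", data))
    parser.CharacterDataHandler = lambda data: events.append(("text", data))
    parser.Parse(xml_text, True)
    return events


if __name__ == "__main__":
    sample = "<doc><!-- note --><![CDATA[<b>html</b>]]></doc>"
    for event in collect_events(sample):
        print(event)
```

The text inside the marked section arrives between the cdata-start and
cdata-end events, so an application can tell wrapped HTML apart from
ordinary character data and can pass comments through unchanged.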
Regards,
Martin