adding the XML to 2.0 to be a mistake?

Martin von Loewis loewis at informatik.hu-berlin.de
Fri Jan 19 08:19:17 EST 2001


rjroy at takingcontrol.com (Robert Roy) writes:

> I agree with what you are saying. Another aspect that concerns me is
> that with the addition of the XML tools, xmllib is now deprecated. The
> recommended alternative, SAX, does not offer the level of control that
> xmllib does.

I don't know why it is deprecated. I'm in favour of un-deprecating it.

> Undeclared entities are a problem in SAX but can be handled cleanly
> using the unknown_entityref mechanism in xmllib.

They are not that much of a problem - there is the skippedEntity
routine.
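
As a rough sketch of that routine in use (not something from the
original exchange): the handler below records every name passed to
skippedEntity. The class name EntityLogger, the function
find_skipped_entities, and the sample document are made up for
illustration. It assumes the expat-based SAX driver and a document whose
DOCTYPE names an external subset that the parser does not read; without
such a DOCTYPE, expat reports an undeclared reference as a
well-formedness error instead of skipping it.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class EntityLogger(ContentHandler):
    """Hypothetical handler: collects text and skipped entity names."""

    def __init__(self):
        super().__init__()
        self.skipped = []
        self.text = []

    def characters(self, content):
        self.text.append(content)

    def skippedEntity(self, name):
        # Called for a reference the parser could not resolve and skipped.
        self.skipped.append(name)


def find_skipped_entities(data):
    """Parse bytes and return (skipped entity names, collected text)."""
    handler = EntityLogger()
    parser = xml.sax.make_parser()
    # Do not fetch the external DTD; the entity therefore stays undeclared.
    parser.setFeature(feature_external_ges, False)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.skipped, "".join(handler.text)


# Illustrative input: "doc.dtd" is never opened, "foo" is never declared.
SAMPLE = b'<!DOCTYPE doc SYSTEM "doc.dtd"><doc>a &foo; b</doc>'

if __name__ == "__main__":
    print(find_skipped_entities(SAMPLE))
```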

> SAX, as it now stands, handles marked CDATA sections as part of the
> data stream. This may be fine for most uses but may be undesirable
> when you are using XML as a container wrapping sections of HTML, etc.
> 
> SAX swallows comments. Maybe I need to pass the comments through,
> e.g. when doing a translation.

Agreed; there are possible applications for that.

> I believe that in light of its unique capabilities, xmllib deserves a
> permanent place in the Python library. To improve performance (in
> sgmllib too), Fredrik's sgmlop extension should become part of the
> standard distribution as well. 

It seems to me that sgmlop has many problems: it doesn't work properly
(try running the PyXML test suites with only an sgmlop driver); and it
does not support Unicode.

> If at some point (py)expat evolves to the point where it can do
> everything that xmllib does (allow access to ALL entity refs,
> identification of marked sections, comments ...), then it could
> become the underlying library.

Today, pyexpat allows access to all entity refs, it reports the
start and the end of CDATA sections, and it reports comments. So while
I can follow the problems you have with SAX, I can't see why you
complain about pyexpat.
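
To make the CDATA and comment reporting concrete, here is a minimal
sketch using the handler attributes that pyexpat exposes
(StartCdataSectionHandler, EndCdataSectionHandler, CommentHandler,
CharacterDataHandler). The function name collect_events and the event
tuples are assumptions for the example, not part of pyexpat.

```python
import xml.parsers.expat


def collect_events(data):
    """Return a list of (kind, ...) tuples for comments, CDATA and text."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    # Deliver adjacent character data in one callback instead of in pieces.
    parser.buffer_text = True
    parser.StartCdataSectionHandler = lambda: events.append(("cdata-start",))
    parser.EndCdataSectionHandler = lambda: events.append(("cdata-end",))
    parser.CommentHandler = lambda text: events.append(("comment", text))
    parser.CharacterDataHandler = lambda text: events.append(("chars", text))
    parser.Parse(data, True)  # True marks this as the final chunk
    return events


if __name__ == "__main__":
    doc = b"<doc><!-- note --><![CDATA[<b>raw</b>]]></doc>"
    for event in collect_events(doc):
        print(event)
```

An application that wraps HTML inside XML could use the two CDATA
callbacks to decide whether the character data in between should be
passed through untouched.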

Regards,
Martin


