lxml 1.3 released
stefan.behnel-n05pAM at web.de
Sun Jun 24 13:07:34 CEST 2007
I'm proud to announce the release of lxml 1.3.
** What is lxml?
In short: lxml is the most feature-rich and easy-to-use library for working
with XML and HTML in the Python language.
lxml is a Pythonic binding for the libxml2 and libxslt libraries. It is unique
in that it combines the speed and feature completeness of these libraries with
the simplicity of a native Python API.
** This is a major new release with various new features and lots of fixes
compared to the 1.2 series. The complete changelog follows below.
Major objectives of this release were:
- API consolidation:
make everything work with everything
- improved namespace handling:
avoid redundant namespaces wherever possible
- simplicity and accessibility:
improved, restructured documentation and simpler XML/HTML generation
Future versions of lxml will continue this trend to make lxml the leading tool
for XML and HTML in the Python world.
* Module ``lxml.pyclasslookup`` implements an Element class lookup scheme that
can access the entire tree to determine a suitable Element class
* Parsers take a ``remove_comments`` keyword argument that skips over comments
* ``parse()`` function in ``objectify``, corresponding to ``XML()`` etc.
* ``Element.addnext(el)`` and ``Element.addprevious(el)`` methods to support
adding processing instructions and comments around the root node
* Extended type annotation in objectify: cleaner annotation namespace setup
plus new ``deannotate()`` function
* Support for custom Element class instantiation in lxml.sax: passing a
``makeelement()`` function to the ElementTreeContentHandler will reuse the
lookup context of that function
* '.' represents empty ObjectPath (identity)
* Removing Elements from a tree could make them lose their namespace
* ``ElementInclude`` didn't honour base URL of original document
* Replacing the children slice of an Element would cut off the tails of the
* ``Element.getiterator(tag)`` did not accept ``Comment`` and
``ProcessingInstruction`` as tags
* API functions now check incoming strings for XML conformity. Zero bytes or
low ASCII characters are no longer accepted.
* XSLT parsing failed to pass resolver context on to imported documents
* More ElementTree-compatible behaviour when deciding whether to write out an
XML declaration
* ``Element.attrib`` was missing ``clear()`` and ``pop()`` methods
* More robust error handling in ``iterparse()``
* Documents lost their top-level PIs and comments on serialisation
* lxml.sax failed on comments and PIs. Comments are now properly ignored and
PIs are copied.
* Raise AssertionError when passing strings containing '\0' bytes
* ``DTD`` validator class (like ``RelaxNG`` and ``XMLSchema``)
* HTML generator helpers by Fredrik Lundh in ``lxml.htmlbuilder``
* ``ElementMaker`` XML generator by Fredrik Lundh in ``lxml.builder.E``
* Support for pickling ``objectify.ObjectifiedElement`` objects to XML
* ``update()`` method on Element.attrib
* Optimised replacement for libxml2's _xmlReconsiliateNs(). This lets lxml
handle namespaces better when moving elements between documents.
* Possible memory leaks in namespace handling when moving elements between
* major restructuring in the documentation
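
A few of the changes listed above can be tried out with a short script. This is
only a minimal sketch, written against a current lxml installation rather than
1.3 itself, so details may differ on older versions. The sample documents, tag
names and variable names are made up for illustration.

```python
from io import StringIO

from lxml import etree
from lxml.builder import E

# Parsers accept remove_comments to skip over comments while parsing.
parser = etree.XMLParser(remove_comments=True)
root = etree.fromstring("<root><!-- note --><a>text</a></root>", parser)
without_comments = etree.tostring(root)

# addprevious()/addnext() can place comments and PIs around the root node.
root.addprevious(etree.Comment("generated example"))
root.addnext(etree.ProcessingInstruction("example-pi", "done"))
whole_document = etree.tostring(root.getroottree())

# Element.attrib supports update(), pop() and clear(), like a dictionary.
root.attrib.update({"id": "r1", "lang": "en"})
popped = root.attrib.pop("lang")
root.attrib.clear()

# ElementMaker (lxml.builder.E) generates XML/HTML trees from nested calls.
page = E.html(E.body(E.p("Hello", {"class": "greeting"})))
page_markup = etree.tostring(page)

# The DTD class validates in the same way as RelaxNG and XMLSchema.
dtd = etree.DTD(StringIO("<!ELEMENT root (a)><!ELEMENT a (#PCDATA)>"))
valid = dtd.validate(E.root(E.a("text")))
invalid = dtd.validate(E.root(E.b("text")))
```

Serialising ``root.getroottree()`` rather than ``root`` is what keeps the
top-level comment and processing instruction in the output.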