lxml 2.2 released

Stefan Behnel stefan_ml at behnel.de
Sat Mar 21 17:16:46 CET 2009

Hi all,

I'm proud to announce the release of lxml 2.2 final.


This is a major new, stable and mature release that takes over the stable
2.x release series. All previous 2.x releases are now officially out of
maintenance.

It includes a large number of bug fixes and improvements (see below for a
complete changelog) that make lxml 2.2 a lot more robust than the previous
2.1 and older releases. It is therefore generally worth upgrading (and it
should not be too hard to do that).

This release was built with Cython 0.11 final.

Have fun,
Stefan

What is lxml?

lxml is the most feature-rich and easy-to-use library for working with XML
and HTML in the Python language. It's also amongst the fastest and most
memory friendly XML tree libraries for Python.

lxml is a pythonic, mature binding for the libxml2 and libxslt libraries
that provides safe and convenient access to these libraries using the
ElementTree API. It extends the ElementTree API significantly to offer
support for XPath, RelaxNG, XML Schema, XSLT, C14N and much more.

Changelog since lxml 2.1

2.2 (2009-03-21)
Features added

    * Support for standalone flag in XML declaration through
      tree.docinfo.standalone and by passing standalone=True/False on

Bugs fixed

    * Crash when parsing an XML Schema with external imports from a

2.2beta4 (2009-02-27)
Features added

    * Support strings and instantiable Element classes as child arguments
      to the constructor of custom Element classes.
    * GZip compression support for serialisation to files and file-like

Bugs fixed

    * Deep-copying an ElementTree copied neither its sibling PIs and
      comments nor its internal/external DTD subsets.
    * Soupparser failed on broken attributes without values.
    * Crash in XSLT when overwriting an already defined attribute using
    * Crash bug in exception handling code under Python 3. This was due to
      a problem in Cython, not lxml itself.
    * lxml.html.FormElement._name() failed for non top-level forms.
    * TAG special attribute in constructor of custom Element classes was
      evaluated incorrectly.

Other changes

    * Official support for Python 3.0.1.
    * Element.findtext() now returns an empty string instead of None for
      Elements without text content.
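
As a rough illustration of the findtext() change above, the following sketch
compares an element without text, an element with text, and a path that does
not match anything. The document and its tag names are invented for this
example; only Element.findtext() itself comes from the changelog.

```python
from lxml import etree

# Hypothetical sample document, made up for this illustration.
root = etree.fromstring("<root><empty/><full>text</full></root>")

# The element exists but has no text content: an empty string comes back.
empty_text = root.findtext("empty")

# The element exists and has text content.
full_text = root.findtext("full")

# No element matches the path: the default (None) is returned.
missing_text = root.findtext("missing")

print(repr(empty_text), repr(full_text), repr(missing_text))
```

This makes it possible to tell "element present but empty" apart from
"element not found" without an extra find() call.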

2.2beta3 (2009-02-17)
Features added

    * XSLT.strparam() class method to wrap quoted string parameters that
      require escaping.
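
A minimal sketch of how XSLT.strparam() might be used, assuming a stylesheet
that simply echoes one parameter. The stylesheet, the parameter name "msg" and
the sample text are made up for this example.

```python
from lxml import etree

# Hypothetical stylesheet that writes the value of the "msg" parameter.
stylesheet = etree.XML('''\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="msg"/>
  <xsl:template match="/">
    <xsl:value-of select="$msg"/>
  </xsl:template>
</xsl:stylesheet>''')

transform = etree.XSLT(stylesheet)
doc = etree.XML("<root/>")

# A string containing both kinds of quotes cannot easily be written as a
# quoted XPath string literal, so it is wrapped with strparam() instead.
text = 'It\'s "quoted"'
result = transform(doc, msg=etree.XSLT.strparam(text))

output = str(result)
print(output)
```

Without the wrapper, keyword arguments passed to the transformer are
evaluated as XPath expressions, so plain strings need manual quoting.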

Bugs fixed

    * Memory leak in XPath evaluators.
    * Crash when parsing indented XML in one thread and merging it with
      other documents parsed in another thread.
    * Setting the base attribute in lxml.objectify from a unicode string
    * Fixes following changes in Python 3.0.1.
    * Minor fixes for Python 3.

Other changes

    * The global error log (which is copied into the exception log) is now
      local to a thread, which fixes some race conditions.
    * More robust error handling on serialisation.

2.2beta2 (2009-01-25)
Bugs fixed

    * Potential memory leak on exception handling. This was due to a
      problem in Cython, not lxml itself.
    * iter_links (and related link-rewriting functions) in lxml.html would
      interpret CSS like url("link") incorrectly (treating the quotation
      marks as part of the link).
    * Failing import on systems that have an io module.

2.2beta1 (2008-12-12)
Features added

    * Allow lxml.html.diff.htmldiff to accept Element objects, not just
      HTML strings.

Bugs fixed

    * Crash when using an XPath evaluator in multiple threads.
    * Fixed missing whitespace before Link:... in lxml.html.diff.

Other changes

    * Export lxml.html.parse.

2.2alpha1 (2008-11-23)
Features added

    * Support for XSLT result tree fragments in XPath/XSLT extension
    * QName objects have new properties namespace and localname.
    * New options for exclusive C14N and C14N without comments.
    * Instantiating a custom Element class creates a new Element.
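
The new QName properties from the list above can be tried out roughly like
this. The namespace URI and tag name are placeholders chosen for the example.

```python
from lxml import etree

# Placeholder namespace and tag name, used only for illustration.
qname = etree.QName("{http://example.com/ns}item")

namespace = qname.namespace   # the part inside the curly braces
localname = qname.localname   # the tag name without the namespace
qualified = qname.text        # the full "{namespace}localname" form

# A QName built from an unqualified name has no namespace.
plain = etree.QName("item")

print(namespace, localname, qualified, plain.namespace)
```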

Bugs fixed

    * XSLT didn't inherit the parse options of the input document.
    * 0-bytes could slip through the API when used inside of Unicode
    * With lxml.html.clean.autolink, links with balanced parentheses that
      end in a parenthesis will be linked in their entirety (typical with
      Wikipedia links).
