[ANN] lxml 1.0 released

Stefan Behnel stefan.behnel-n05pAM at web.de
Fri Jun 2 12:53:29 CEST 2006


Hello everyone,

I have the honour to announce the availability of lxml 1.0.

http://codespeak.net/lxml/

It can be downloaded from the Cheese Shop:
http://cheeseshop.python.org/pypi/lxml

"""
lxml is a Pythonic binding for the libxml2 and libxslt libraries. It provides
safe and convenient access to these libraries using the ElementTree API. It
extends the ElementTree API significantly to offer support for XPath, RelaxNG,
XML Schema, XSLT, C14N and much, much more.

Its goals are:

    * Pythonic API.
    * Documented.
      http://codespeak.net/lxml/#documentation
    * FAST!
      http://codespeak.net/lxml/performance.html
    * Use Python unicode strings in API.
    * Safe (no segfaults).
    * No manual memory management!
      (as opposed to the official libxml2 Python bindings)
"""

While the list of features added since the last beta version (1.0.beta) is
rather small, this version contains fixes for a large number of bugs found by
various users and testers. Thank you all for your help!

Stefan


Features added since 0.9.2:

    * Element.getiterator() and the findall() methods support finding
      arbitrary elements from a namespace (pattern {namespace}*)
    * Another speedup in tree iteration code
    * General speedup of Python Element object creation and deallocation
    * Writing C14N no longer serializes in memory (reduced memory footprint)
    * PyErrorLog for error logging through the Python logging module
    * element.getroottree() returns an ElementTree for the root node of the
      document that contains the element.
    * ElementTree.getpath(element) returns a simple, absolute XPath expression
      to find the element in the tree structure
    * Error logs have a last_error attribute for convenience
    * Comment texts can be changed through the API
    * Formatted output via pretty_print keyword to serialization functions
    * XSLT can block access to file system and network via XSLTAccessControl
    * ElementTree.write() no longer serializes in memory (reduced memory
      footprint)
    * Speedup of Element.findall(tag) and Element.getiterator(tag)
    * Support for writing the XML representation of Elements and ElementTrees
      to Python unicode strings via etree.tounicode()
    * Support for writing XSLT results to Python unicode strings via unicode()
    * Parsing a unicode string no longer copies the string (reduced memory
      footprint)
    * Parsing file-like objects now reads chunks rather than the whole file
      (reduced memory footprint)
    * Parsing StringIO objects from the start avoids copying the string
      (reduced memory footprint)
    * Read-only 'docinfo' attribute in ElementTree class holds DOCTYPE
      information, original encoding and XML version as seen by the parser
    * etree module can be compiled without libxslt by commenting out the line
      include "xslt.pxi" near the end of the etree.pyx source file
    * Better error messages in parser exceptions
    * Error reporting now also works in XSLT
    * Support for custom document loaders (URI resolvers) in parsers and XSLT,
      resolvers are registered at parser level
    * Implementation of exslt:regexp for XSLT based on the Python 're' module,
      enabled by default, can be switched off with 'regexp=False' keyword
      argument
    * Support for exslt extensions (libexslt) and libxslt extra functions
      (node-set, document, write, output)
    * Substantial speedup in XPath.evaluate()
    * HTMLParser for parsing (broken) HTML
    * XMLDTDID function parses XML into tuple (root node, ID dict) based on
      xml:id implementation of libxml2 (as opposed to ET compatible XMLID)
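
A few of the features above can be tried out with a short sketch like the
following. It is only an illustration, not taken from the lxml documentation:
the sample document, the namespace URI http://example.org/a and all variable
names are made up, and it assumes an lxml installation in which findall(),
getroottree(), getpath() and the pretty_print keyword behave as described in
the list.

```python
from lxml import etree

# Hypothetical sample document; namespace URI and tag names are made up.
XML = (
    '<root xmlns:a="http://example.org/a">'
    '<a:item>one</a:item>'
    '<a:other>two</a:other>'
    '<plain>three</plain>'
    '</root>'
)

root = etree.fromstring(XML)

# The pattern {namespace}* matches any element from that namespace.
in_ns = root.findall("{http://example.org/a}*")
ns_tags = [el.tag for el in in_ns]

# getroottree() returns the ElementTree of the containing document,
# getpath() builds a simple, absolute XPath expression for an element.
plain = root.find("plain")
tree = plain.getroottree()
path = tree.getpath(plain)

# Evaluating the generated path should select the same element again.
found = tree.xpath(path)

# pretty_print asks the serializer for indented output.
pretty = etree.tostring(root, pretty_print=True)
```

Since getpath() returns an ordinary XPath string, it can be stored or logged
and evaluated later against the same tree.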


Bugs fixed since 0.9.2:

    * Memory leak in Element.__setitem__
    * Memory leak in Element.attrib.items() and Element.attrib.values()
    * Memory leak in XPath extension functions
    * Memory leak in unicode related setup code
    * Element now raises ValueError on empty tag names
    * Namespace fixing after moving elements between documents could fail if
      the source document was freed too early
    * Setting namespace-less tag names on namespaced elements ('{ns}t' -> 't')
      didn't reset the namespace
    * Unknown constants from newer libxml2 versions could raise exceptions in
      the error handlers
    * lxml.etree compiles much faster
    * On libxml2 <= 2.6.22, parsing strings with encoding declaration could
      fail in certain cases
    * Document reference in ElementTree objects was not updated when the root
      element was moved to a different document
    * Running absolute XPath expressions on an Element now evaluates against
      the root tree
    * Evaluating absolute XPath expressions (/*) on an ElementTree could fail
    * Crashes when calling XSLT, RelaxNG, etc. with uninitialized ElementTree
      objects
    * Memory leak when using iconv encoders in tostring/write
    * Deep copying Elements and ElementTrees maintains the document
      information
    * Serialization functions raise LookupError for unknown encodings
    * Memory deallocation crash resulting from deep copying elements
    * Some ElementTree methods could crash if the root node was not
      initialized (neither file nor element passed to the constructor)
    * Element/SubElement failed to set attribute namespaces from passed attrib
      dictionary
    * tostring() now adds an XML declaration for non-ASCII encodings
    * tostring() failed to serialize encodings that contain 0-bytes
    * ElementTree.xpath() and XPathDocumentEvaluator were not using the
      ElementTree root node as reference point
    * Calling document('') in XSLT failed to return the stylesheet

