[XML-SIG] lxml 2.0alpha1 released
strangest at comcast.net
Sun Sep 2 19:00:40 CEST 2007
Stefan, congratulations. This is definitely useful.
Please talk a bit about the API and how it differs from
cElementTree, or link to some examples. For example, the node nesting and
the use of a 'tail' for trailing text. I wonder whether lxml offers a more
DOM-compliant node nesting, or whether it conforms to the
conventions/oddities of ElementTree.
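To make the question concrete, here is a minimal sketch of the ElementTree text/tail convention I mean. The sample markup is made up, and the fallback import assumes that lxml.etree accepts the same calls as the standard-library ElementTree for this simple case.

```python
try:
    from lxml import etree  # use lxml if it is installed
except ImportError:
    import xml.etree.ElementTree as etree  # same text/tail model

# Mixed content: text, then a child element, then more text.
root = etree.fromstring("<p>Hello <b>bold</b> world</p>")
b = root[0]

leading = root.text   # text before the first child of <p>
inner = b.text        # text inside <b>
trailing = b.tail     # text after </b>, stored on the <b> element itself

print(repr(leading), repr(inner), repr(trailing))
```

In a DOM tree the trailing " world" would be a separate text node that is a sibling of the b element; here it hangs off the b element as its tail.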
Also show us how it differs from BeautifulSoup, which has extremely
robust Unicode handling and can complete mangled XML/HTML tags, but may
benchmark a bit slower.
Thanks again, and good job!
> Hi all,
> I'm proudly announcing the first alpha release of lxml 2.0.
> ** What is lxml?
> In short: lxml is the most feature-rich and easy-to-use library for working
> with XML and HTML in the Python language.
> lxml is a Pythonic binding for the libxml2 and libxslt libraries. It is unique
> in that it combines the speed and feature completeness of these libraries with
> the simplicity of a native Python API.
> This release features a major cleanup both behind the scenes and at the
> surface, that improves the XML tool integration and makes the API clearer and
> more consistent in many places. The major new addition, however, is the
> lxml.html package, a new toolkit for HTML handling.
> The web site for the pre-2.0 series is online at
> The "what's new" page has a description of the major changes:
> and the ChangeLog has a more detailed list, see below.
> This being an alpha release means that not everything is stable, both in terms
> of crashes and the API. There will be a small number of alpha releases to make
> the advancements publicly available, before the beta releases focus on
> improving the stability.
> I warmly invite everyone to contribute to the final release by discussing the
> API changes and the new features on the mailing list. There is always space
> for improvements!
> There is currently a known problem with Microsoft's compilers, so Windows
> builds may not become available for 2.0alpha1. The next alpha will hopefully
> come with prebuilt binaries for that platform. Building with the more
> standards compliant MinGW compilers should work.
> Note that working on the code now requires Cython (version 0.9.6.5), an
> enhanced fork of Pyrex. lxml therefore no longer ships with a copy of Pyrex
> or Cython, but as usual, building from the distribution sources does not
> require Cython. It can be installed with "easy_install Cython" or from here:
> I hope that lxml 2.0 will become a straight continuation of the success story
> that lxml 1.x was already.
> Have fun,
> 2.0alpha1 (2007-09-02)
> Features added
> * Reimplemented objectify.E for better performance and improved
> integration with objectify. Provides extended type support based on
> registered PyTypes.
> * XSLT objects now support deep copying
> * New makeSubElement() C-API function that allows creating a new
> subelement straight with text, tail and attributes.
> * XPath extension functions can now access the current context node
> (context.context_node) and use a context dictionary
> (context.eval_context) from the context provided in their first parameter
> * HTML tag soup parser based on BeautifulSoup in lxml.html.ElementSoup
> * New module lxml.doctestcompare by Ian Bicking for writing simplified
> doctests based on XML/HTML output. Use by importing lxml.usedoctest or
> lxml.html.usedoctest from within a doctest.
> * New module lxml.cssselect by Ian Bicking for selecting Elements with
> CSS selectors.
> * New package lxml.html written by Ian Bicking for advanced HTML
> * Namespace class setup is now local to the ElementNamespaceClassLookup
> instance and no longer global.
> * Schematron validation (incomplete in libxml2)
> * Additional stringify argument to objectify.PyType() takes a conversion
> function to strings to support setting text values from arbitrary types.
> * Entity support through an Entity factory and element classes. XML
> parsers now have a resolve_entities keyword argument that can be set to
> False to keep entities in the document.
> * column field on error log entries to accompany the line field
> * Error specific messages in XPath parsing and evaluation
> NOTE: for evaluation errors, you will now get an XPathEvalError instead
> of an XPathSyntaxError. To catch both, you can except on XPathError.
> * The regular expression functions in XPath now support passing a node-set
> instead of a string
> * Extended type annotation in objectify: new xsiannotate() function
> * EXSLT RegExp support in standard XPath (not only XSLT)
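A small sketch of two of the XPath items above, as I understand them: EXSLT regular expressions in plain XPath (with an attribute node-set passed to the function), and catching XPathError to cover both syntax and evaluation errors. The sample document and the safe_xpath() helper are made up for illustration; the namespace URI is the standard EXSLT one.

```python
from lxml import etree

# Standard EXSLT namespace for the regular expression functions.
REGEXP_NS = "http://exslt.org/regular-expressions"

root = etree.fromstring(
    "<items><item name='alpha'/><item name='beta'/><item name='apple'/></items>"
)

# re:test() receives a node-set (@name) rather than a plain string.
matches = root.xpath(
    "//item[re:test(@name, '^a')]/@name",
    namespaces={"re": REGEXP_NS},
)


def safe_xpath(element, expression):
    """Hypothetical helper: return the XPath result, or None on any XPath error."""
    try:
        return element.xpath(expression)
    except etree.XPathError:
        # Base class of both XPathSyntaxError and XPathEvalError.
        return None


print(matches)
print(safe_xpath(root, "count(//item)"))
print(safe_xpath(root, "//item["))  # malformed expression
```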
> Bugs fixed
> * lxml.etree did not check tag/attribute names
> * The XML parser did not report undefined entities as error
> * The text in exceptions raised by XML parsers, validators and XPath
> evaluators now reports the first error that occurred instead of the last
> * Passing '' as XPath namespace prefix did not raise an error
> * Thread safety in XPath evaluators
> Other changes
> * objectify.PyType for None is now called "NoneType"
> * el.getiterator() renamed to el.iter(), following ElementTree 1.3 -
> original name is still available as alias
> * In the public C-API, findOrBuildNodeNs() was replaced by the more
> generic findOrBuildNodeNsPrefix()
> * Major refactoring in XPath/XSLT extension function code
> * Network access in parsers disabled by default
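For the getiterator() rename, a minimal sketch of the new spelling. The sample tree is made up, and the fallback import assumes the standard-library ElementTree behaves the same way for this call.

```python
try:
    from lxml import etree  # use lxml if it is installed
except ImportError:
    import xml.etree.ElementTree as etree  # iter() works the same way here

root = etree.fromstring("<root><a><b/></a><b/><c/></root>")

# iter() walks the element and all its descendants in document order.
all_tags = [el.tag for el in root.iter()]

# Passing a tag name restricts the iteration to matching elements.
b_count = len(list(root.iter("b")))

print(all_tags, b_count)
```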