I just released lxml 4.0.0 with several new features. Thanks to everyone who contributed.
The documentation is here: http://lxml.de/
This release was built using Cython 0.26.1.
If you are interested in commercial support or customisations for the lxml package, please contact me directly.
4.0.0 (2017-09-17)
==================

Features added
--------------
* The ElementPath implementation is now compiled using Cython, which speeds up the ``.find*()`` methods quite significantly.
* The modules ``lxml.builder``, ``lxml.html.diff`` and ``lxml.html.clean`` are also compiled using Cython in order to speed them up.
* ``xmlfile()`` supports async coroutines using ``async with`` and ``await``.
* ``iterwalk()`` has a new method ``skip_subtree()`` that prevents walking into the descendants of the current element.
* ``RelaxNG.from_rnc_string()`` accepts a ``base_url`` argument to allow relative resource lookups.
* The XSLT result object has a new method ``.write_output(file)`` that serialises output data into a file according to the ``xsl:output`` configuration.
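As a rough illustration of two of the features above, the following sketch walks a small tree with ``iterwalk()`` while skipping one subtree, and then writes an XSLT result to a file with ``.write_output()``. The sample document, the stylesheet and the file name are made up for this example; only the lxml calls themselves come from the release notes.

```python
import os
import tempfile

from lxml import etree

# Hypothetical sample document, used only for illustration.
root = etree.fromstring("<root><a><skipped/></a><b><kept/></b></root>")

# iterwalk(): collect start tags, but do not descend into <a>.
walker = etree.iterwalk(root, events=("start", "end"))
started = []
for event, element in walker:
    if event == "start":
        started.append(element.tag)
        if element.tag == "a":
            walker.skip_subtree()  # <skipped/> is never visited

# XSLT: let write_output() serialise according to xsl:output.
# The stylesheet is a made-up example that counts all elements.
stylesheet = etree.fromstring("""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:template match="/">
    <xsl:value-of select="count(//*)"/>
  </xsl:template>
</xsl:stylesheet>
""")
result = etree.XSLT(stylesheet)(root)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "result.txt")  # hypothetical file name
    result.write_output(path)
    with open(path, "rb") as f:
        written = f.read()
```

Calling ``skip_subtree()`` only makes sense right after a ``start`` event, which is why the sketch requests both event types and checks the event name before calling it.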
Bugs fixed
----------
* GH#251: HTML comments were handled incorrectly by the soupparser. Patch by mozbugbox.
* LP#1654544: The html5parser no longer passes the ``useChardet`` option if the input is a Unicode string, unless explicitly requested. When parsing files, the default is to enable it when a URL or file path is passed (because the file is then opened in binary mode), and to disable it when reading from a file(-like) object.
  Note: This is a backwards-incompatible change of the default configuration. If your code parses byte strings/streams and depends on character detection, please pass the option ``guess_charset=True`` explicitly, which already worked in older lxml versions.
* LP#1703810: ``etree.fromstring()`` failed to parse UTF-32 data with BOM.
* LP#1526522: Some RelaxNG errors were not reported in the error log.
* LP#1567526: Empty and plain text input raised a TypeError in soupparser.
* LP#1710429: Uninitialised variable usage in HTML diff.
* LP#1415643: The closing tags context manager in ``xmlfile()`` could continue to output end tags even after writing failed with an exception.
* LP#1465357: ``xmlfile.write()`` now accepts and ignores None as input argument.
* Compilation under Py3.7-pre failed due to a modified function signature.
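The ``xmlfile.write()`` change (LP#1465357) can be pictured with a small sketch like the one below. The element name and the text content are invented for the example, and the in-memory ``BytesIO`` target merely stands in for a real output file.

```python
from io import BytesIO

from lxml import etree

out = BytesIO()  # stand-in for a real output file
with etree.xmlfile(out) as xf:
    with xf.element("root"):  # hypothetical element name
        xf.write(None)    # accepted and ignored since lxml 4.0
        xf.write("text")  # regular text content is written as usual

serialised = out.getvalue()
```

This can be convenient when the content comes from a helper that may return ``None``, since the caller no longer needs to filter such values out first.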
Other changes
-------------
* The main module source files were renamed from ``lxml.*.pyx`` to plain ``*.pyx`` (e.g. ``etree.pyx``) to simplify their handling in the build process. Care was taken to keep the old header files as fallbacks for code that compiles against the public C-API of lxml, but it might still be worth validating that third-party code does not notice this change.