Well, perhaps I am going crazy (or making a silly mistake?), but I tested this on my Linux box with lxml versions from 3.0.0 through the latest sources on GitHub, with the same results. Here is the test script I used:
from lxml.html.diff import htmldiff
html = u"""<pre>test
test2
test3</pre>"""
result = htmldiff(html, html)
assert "\n" in result, "Newline not found in %s" % result
Internally, htmldiff calls tokenize() on both inputs, which appears to lose all notion of whitespace:
>>> tokenize("""<pre>test
... test2
... test3
... """)
[token(u'test', ['<pre>'], []), token(u'test2', [], []), token(u'test3', [], ['</pre>'])]
It then calls htmldiff_tokens, which produces output like this:
['<pre>', u'test ', u'test2 ', u'test3', '</pre>']
It then joins those tokens with an empty string, and since each token carries only a single trailing space, the original newlines are lost entirely.
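To make that last step concrete, here is a minimal sketch (plain Python, independent of lxml) of the final join over the token list shown above, demonstrating that the newlines cannot be recovered at this point:

```python
# The token list produced by htmldiff_tokens, as shown above: each text
# token carries at most a single trailing space, and the newlines from
# the original <pre> block are already gone.
tokens = ['<pre>', 'test ', 'test2 ', 'test3', '</pre>']

# The final step joins with an empty string, so the <pre> content comes
# out space-separated rather than newline-separated.
result = ''.join(tokens)
print(result)  # <pre>test test2 test3</pre>
assert '\n' not in result
```

This suggests the whitespace would need to be preserved earlier, at the tokenize() stage, for the diff output to keep significant newlines inside `<pre>`.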