[lxml-dev] another bug
Hi,

I have many sites under test. This one is a parking domain:

import lxml.html
hparser = lxml.html.HTMLParser(encoding='utf-8', remove_comments=True)
content="""<frameset>
<frame src="main.php" name="srcpg" id="srcpg" frameborder="0" scrolling="Auto" marginwidth="" marginheight="0">
</frameset>"""
etree_document = lxml.html.fromstring(content, parser=hparser)

TypeError                                 Traceback (most recent call last)
/home/sergio/<ipython console> in <module>()
/usr/lib/python2.6/site-packages/lxml/html/__init__.pyc in fromstring(html, base_url, parser, **kw)
    634             other_head.drop_tree()
    635         return doc
--> 636     if (len(body) == 1 and (not body.text or not body.text.strip())
    637         and (not body[-1].tail or not body[-1].tail.strip())):
    638         # The body has just one element, so it was probably a single
TypeError: object of type 'NoneType' has no len()

thanks,
--
Sérgio M. B.
Hi,

note that subjects like "another bug" are less likely to receive interest than something that describes the actual problem.

Sergio Monteiro Basto, 02.06.2010 17:34:
> Hi, I have many sites under test.
> This one is a parking domain:
>
> import lxml.html
> hparser = lxml.html.HTMLParser(encoding='utf-8', remove_comments=True)
> content="""<frameset>
> <frame src="main.php" name="srcpg" id="srcpg" frameborder="0" scrolling="Auto" marginwidth="" marginheight="0">
> </frameset>"""
> etree_document = lxml.html.fromstring(content, parser=hparser)
>
> TypeError                                 Traceback (most recent call last)
> /home/sergio/<ipython console> in <module>()
> /usr/lib/python2.6/site-packages/lxml/html/__init__.pyc in fromstring(html, base_url, parser, **kw)
>     634             other_head.drop_tree()
>     635         return doc
> --> 636     if (len(body) == 1 and (not body.text or not body.text.strip())
>     637         and (not body[-1].tail or not body[-1].tail.strip())):
>     638         # The body has just one element, so it was probably a single
> TypeError: object of type 'NoneType' has no len()

Yes, the exception is a bug. I'm not sure what the parser should return in this case. I'll have to look into this, maybe it's worth special-casing.

Could you file a bug report? Thanks!

Stefan
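
The failure mode in the traceback can be sketched in miniature with the standard library alone. This is only an illustration under an assumption: that the tree produced for a frameset page has no body element, so the body lookup yields None and len() fails. The tree is built by hand with xml.etree.ElementTree as a stand-in for the parsed document, and unguarded_child_count and pick_result are hypothetical helper names, not lxml functions. The None check shown is just one possible form of special-casing, not necessarily what lxml does.

```python
import xml.etree.ElementTree as ET


def build_frameset_tree():
    """Hand-built stand-in for the parsed page: html > frameset > frame, no body."""
    html = ET.Element("html")
    frameset = ET.SubElement(html, "frameset")
    ET.SubElement(frameset, "frame", {"src": "main.php", "name": "srcpg"})
    return html


def unguarded_child_count(doc):
    # Mirrors the failing pattern: len() is applied to the result of the
    # body lookup without checking whether a body was found at all.
    body = doc.find("body")
    return len(body)


def pick_result(doc):
    # Hypothetical guard: when there is no body, return the whole document.
    body = doc.find("body")
    if body is None:
        return doc
    return body


doc = build_frameset_tree()

error_message = None
try:
    unguarded_child_count(doc)
except TypeError as exc:
    # Same kind of error as in the report: len() called on None.
    error_message = str(exc)

result = pick_result(doc)
```

With the guard in place the caller gets the html root back, and the frameset is still reachable through result.find("frameset").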

Participants (2): Sergio Monteiro Basto, Stefan Behnel