Responding to a question in python-help about extracting links from web pages, I wrote a simple href printer:

    import htmllib, formatter

    class MyParser(htmllib.HTMLParser):
        def anchor_bgn(self, href, name, type):
            print href

    fmt = formatter.NullFormatter()
    parser = MyParser(fmt, verbose=1)
    parser.feed(open("tour01.html").read())
    parser.close()

When run using 2.2a4, it never prints anything. It outputs a list of hrefs when run with 2.1 or 1.6. Either there's a bug somewhere (in my code possibly, though it's pretty simple) or some semantics changed that I missed.

I thought maybe the method resolution order change affected things, but htmllib.HTMLParser only uses single inheritance. When displaying help about htmllib.HTMLParser, pydoc.help does emit the method resolution order, which it doesn't generally seem to do:

    class HTMLParser(sgmllib.SGMLParser)
     |  Method resolution order:
     |      HTMLParser
     |      sgmllib.SGMLParser
     |      markupbase.ParserBase
    ...

Skip
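For anyone trying the same thing on a current Python, where htmllib is no longer in the standard library, here is a minimal sketch of an href printer built on html.parser.HTMLParser instead. It is a swapped-in technique, not the code from this message and not a diagnosis of the 2.2a4 behaviour; the names HrefCollector, extract_hrefs and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: remember the href of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling us.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                self.hrefs.append(value)


def extract_hrefs(html_text):
    """Feed a complete HTML string to the parser and return the hrefs found."""
    parser = HrefCollector()
    parser.feed(html_text)
    parser.close()
    return parser.hrefs


# Made-up sample input; the original message read "tour01.html" from disk.
sample = (
    '<p><a href="tour02.html">next</a> '
    '<a name="top">no href here</a> '
    '<A HREF="index.html">home</A></p>'
)

for href in extract_hrefs(sample):
    print(href)
```

Anchors without an href attribute are skipped, so only real links are printed.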
Skip Montanaro writes:
> When run using 2.2a4, it never prints anything. It outputs a list of hrefs when run with 2.1 or 1.6. Either there's a bug somewhere (in my code possibly, though it's pretty simple) or some semantics changed that I
Sounds like a bug to me. Please file a bug report on SF including your code, and assign to me. Thanks!

  -Fred

--
Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Zope Corporation
[Skip Montanaro]
> Responding to a question in python-help about extracting links from web pages, I wrote a simple href printer:
>
>     import htmllib, formatter
>
>     class MyParser(htmllib.HTMLParser):
>         def anchor_bgn(self, href, name, type):
>             print href
>
>     fmt = formatter.NullFormatter()
>     parser = MyParser(fmt, verbose=1)
>     parser.feed(open("tour01.html").read())
>     parser.close()
>
> When run using 2.2a4, it never prints anything. It outputs a list of hrefs when run with 2.1 or 1.6. Either there's a bug somewhere (in my code possibly, though it's pretty simple) or some semantics changed that I missed.
Sorry, I don't know anything about that, and I don't know that code. Open a bug report! Sure doesn't sound right to me.
> I thought maybe the method resolution order change affected things,
The MRO hasn't changed for classic classes. Only for new classes (so if you don't derive from object, nothing about MRO changed).
> but htmllib.HTMLParser only uses single inheritance. When displaying help about htmllib.HTMLParser, pydoc.help does emit the method resolution order, which it doesn't generally seem to do:
I recently changed pydoc to display MRO if and only if there are more than two classes an attribute *could* come from (if there are no more than two classes involved, there's no possibility of confusion; but if there are more than two, confusion is possible).
>     class HTMLParser(sgmllib.SGMLParser)
>      |  Method resolution order:
>      |      HTMLParser
>      |      sgmllib.SGMLParser
>      |      markupbase.ParserBase
>     ...
It listed MRO simply because more than 2 classes are possible attribute sources.
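To make the "more than two classes" rule concrete, here is a small sketch using stand-in classes that mirror the three-level chain in the pydoc output above. The class bodies and the helpers mro_names and would_list_mro are hypothetical; the check is a paraphrase of the rule as described in this message, not a copy of pydoc's source. It assumes Python 3, where every class is new-style, so object also appears in the MRO; a classic class in 2.2 had no __mro__ attribute to inspect this way.

```python
import inspect


# Stand-ins for markupbase.ParserBase, sgmllib.SGMLParser, htmllib.HTMLParser.
class ParserBase:
    pass


class SGMLParser(ParserBase):
    pass


class HTMLParser(SGMLParser):
    pass


def mro_names(cls):
    """Return the class names in method resolution order."""
    return [c.__name__ for c in inspect.getmro(cls)]


def would_list_mro(cls):
    """Paraphrase of the rule above: list the MRO only when an attribute
    could come from more than two classes."""
    return len(inspect.getmro(cls)) > 2


print(mro_names(HTMLParser))
print(would_list_mro(HTMLParser))
```

Even with single inheritance, the chain is long enough that an attribute on HTMLParser has more than two possible sources, which is why the MRO gets listed.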
participants (3)
- Fred L. Drake, Jr.
- Skip Montanaro
- Tim Peters