[Tutor] Documentation

Steven D'Aprano steve at pearwood.info
Wed Jan 9 15:37:27 CET 2013


On 07/01/13 14:48, Ed Owens wrote:

[...]
> parser = HTMLParser(formatter.AbstractFormatter(
>     formatter.DumbWriter(cStringIO.StringIO())))

> HTMLParser is from htmllib.
>
> I'm having trouble finding clear documentation for what the functions
> that are on the 'parser =' line do and return. The three modules
> (htmllib, formatter, & cStringIO) are all imported, but I can't seem to
> find much info on how they work and what they do. What this actually
> does and what it produces is completely obscure to me.
>
> Any help would be appreciated. Any links to clear documentation and
> examples?

I'll start with the easiest: cStringIO and its slower cousin StringIO
are modules for creating fake in-memory file-like objects. Basically, they
create an object that holds a string in memory but behaves as if it were
a file object.

http://docs.python.org/2/library/stringio.html
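
To make the "file-like" idea concrete, here is a minimal sketch. Note the
assumption: it uses io.StringIO (available in Python 2.6+ and Python 3)
rather than cStringIO, so that it runs on either version. The write, seek,
readlines and getvalue calls shown here work the same way on a
cStringIO.StringIO object.

```python
import io

# Create an in-memory "file"; nothing is written to disk.
buf = io.StringIO()

# Write to it exactly as you would to a file opened for writing.
# (The u"" prefix keeps this working on Python 2, where io.StringIO
# only accepts unicode strings.)
buf.write(u"first line\n")
buf.write(u"second line\n")

# getvalue() returns everything written so far as a single string.
contents = buf.getvalue()

# Rewind to the start and read it back line by line, like a real file.
buf.seek(0)
lines = buf.readlines()
```

This is why it can be handed to formatter.DumbWriter in the code above: the
writer only needs something with a write() method, and the text ends up in
memory instead of in a file on disk.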

The formatter module is used to produce an object that you can use for
creating formatted text. It's quite abstract, and to be honest I have
never used it and don't know how it works.

http://docs.python.org/2/library/formatter.html

The htmllib module is a library for parsing HTML code. Essentially, you
use it to read the content of webpages (.html files).

http://docs.python.org/2/library/htmllib.html
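
To give a rough feel for what an HTML parser does, here is a small sketch.
It is a stand-in, not the htmllib API: htmllib exists only in Python 2, so
this uses html.parser.HTMLParser from the Python 3 standard library instead.
The TextCollector class and the sample HTML string are made up for
illustration. Like the code in the question, it sends the parser's output to
an in-memory file-like object.

```python
import io
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: writes the text found between tags to `out`."""

    def __init__(self, out):
        HTMLParser.__init__(self)
        self.out = out  # any object with a write() method

    def handle_data(self, data):
        # Called by the parser for each run of text outside of tags.
        self.out.write(data)


out = io.StringIO()
parser = TextCollector(out)

# feed() hands HTML source to the parser; close() flushes anything buffered.
parser.feed("<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>")
parser.close()

# The tags are gone; only the text content was written to `out`.
text = out.getvalue()
```

The general shape is the same as in the question: the parser reads markup,
and whatever it extracts is written to the object it was given.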

Unfortunately, there is not a lot of documentation for the
htmllib.HTMLParser object, so I can't help you with that.



-- 
Steven

