[Tutor] Documentation

Alexander Mark rhettnaxel at gmail.com
Mon Jan 7 04:57:37 CET 2013


On Jan 6, 2013, at 22:48, Ed Owens <eowens0124 at gmx.com> wrote:

> I have been working my way through Chun's book Core Python Applications.
> 
> In chapter 9 he has a web crawler program that essentially copies all the files from a web site by finding and downloading the links on that domain.
> 
> One of the classes has a procedure definition, and I'm having trouble finding documentation for the functions.  The code is:
> 
>     def parse_links(self):
>         'Parse out the links found in downloaded HTML file'
>         f = open(self.file, 'r')
>         data = f.read()
>         f.close()
>         parser = HTMLParser(formatter.AbstractFormatter(
>             formatter.DumbWriter(cStringIO.StringIO())))
>         parser.feed(data)
>         parser.close()
>         return parser.anchorlist
> 
> HTMLParser is from htmllib.
> 
> I'm having trouble finding clear documentation for what the functions on the 'parser =' line do and return.  The three modules (htmllib, formatter, and cStringIO) are all imported, but I can't find much information on how they work or what they do.  What this code actually does and produces is completely obscure to me.
> 
> Any help would be appreciated. Any links to clear documentation and examples?

Hi Ed, maybe this helps:
http://docs.python.org/2/library/htmllib.html
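
Reading the 'parser =' line from the inside out: cStringIO.StringIO() creates an in-memory file-like object, formatter.DumbWriter(...) writes plain text output to it, formatter.AbstractFormatter(...) turns parser events into calls on that writer, and HTMLParser(...) is the parser that drives the formatter. The formatted text ends up in the StringIO object and is never read. The method only wants parser.anchorlist, the list of href values the parser collects from <a> tags as it goes.

Note that htmllib only exists in Python 2; it is gone in Python 3, and formatter was removed later as well. If you ever need the same thing on Python 3, here is a rough sketch using html.parser instead. This is not the book's code: AnchorCollector is a name I made up, parse_links takes the HTML text directly rather than reading self.file, and I am assuming you only want non-empty href values, which is what anchorlist gives you.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        # Same attribute name as htmllib's parser, for familiarity.
        self.anchorlist = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.anchorlist.append(value)


def parse_links(data):
    """Return the list of link targets found in the HTML text `data`."""
    parser = AnchorCollector()
    parser.feed(data)
    parser.close()
    return parser.anchorlist
```

No formatter or writer is needed here, because nothing is rendering the page text; the subclass just looks at each start tag as it goes by.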
A