[Tutor] can I walk or glob a website?

Albert-Jan Roskam fomcl at yahoo.com
Wed May 18 11:06:07 CEST 2011


Hello,

How can I walk (as in os.walk) or glob a website? I want to download all the 
PDFs from a website (using urllib.urlretrieve), extract certain figures from 
them (using pypdf; is it flexible enough for this?), and produce some 
statistics and graphs from those figures (using rpy and R). I have also 
forgotten what the process of 'automatically downloading' is called; it was 
something that sounds like 'whacking' (??).

Cheers!!
Albert-Jan
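
For illustration, here is a minimal sketch of the download step asked about
above. Automatically following links through a site is usually called
crawling or spidering. This is not the poster's code: it assumes Python 3,
where urlretrieve lives in urllib.request rather than urllib, and it assumes
the PDFs are reachable through ordinary <a href> links on pages served from
the same host. The names LinkCollector, extract_links, is_pdf, crawl_for_pdfs
and download_all are hypothetical helpers made up for this example.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute URLs for all links found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, href) for href in parser.hrefs]


def is_pdf(url):
    """True if the URL path ends in .pdf (case-insensitive)."""
    return urlparse(url).path.lower().endswith(".pdf")


def crawl_for_pdfs(start_url, max_pages=50):
    """Breadth-first walk of same-host pages; return the PDF URLs found."""
    host = urlparse(start_url).netloc
    queue, seen, pdfs = [start_url], {start_url}, []
    pages = 0
    while queue and pages < max_pages:
        page = queue.pop(0)
        pages += 1
        try:
            with urlopen(page) as response:
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip pages that fail to load
        for link in extract_links(html, page):
            link = link.split("#")[0]  # drop any #fragment
            if link in seen or urlparse(link).netloc != host:
                continue  # already handled, or off-site
            seen.add(link)
            if is_pdf(link):
                pdfs.append(link)
            else:
                queue.append(link)  # visit later; non-HTML pages yield no links
    return pdfs


def download_all(pdf_urls, target_dir="pdfs"):
    """Save each PDF into target_dir under its original file name."""
    os.makedirs(target_dir, exist_ok=True)
    for url in pdf_urls:
        filename = os.path.basename(urlparse(url).path)
        urlretrieve(url, os.path.join(target_dir, filename))
```

Usage would be along the lines of
download_all(crawl_for_pdfs("http://example.com/")), with the start URL
replaced by the real site. The max_pages limit is an arbitrary safety cap,
and a polite crawler would also honour robots.txt and pause between requests.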


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have the 
Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~