Using Python 2.1 to download asp www pages

Zugz zugz.public at DEL-ete-MEbtinternet.com
Sun Jan 6 21:17:33 CET 2002


Hi,

I've recently written some Python code to extract some details about posting
frequency etc from a board I use regularly.

I used IE5.5's Save As to give me some pages to work on offline.

I would now like to automate the whole process by downloading all the
relevant pages, or maybe even just by accessing them directly.

If I use urlopen on a regular .htm page, in this case from the collection of
links I call my www site, then things work as you would expect. You get the
html source:

>>> import urllib
>>> a=urllib.urlopen("http://www.zugz.btinternet.co.uk/NonSFBooksBookshops.htm")
>>> print a.read()

as you would hope.

However, if I access one of the pages of interest, which all have the same
form as below but with a varying page number at the end:

>>> a=urllib.urlopen("http://boards.gamers.com/messages/overview.asp?name=panther_xl&page=2")
>>> print a.read()

Then you do not get the page source but some HTML about the page being
moved.
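Since the pages of interest differ only in the page number, the list of
addresses to fetch could be generated rather than typed. The following is
only a sketch: the helper names page_url and page_urls are made up for
illustration, and it assumes the "name" and "page" parameters behave exactly
as in the single example address above.

```python
try:
    from urllib import quote            # Python 2.x
except ImportError:
    from urllib.parse import quote      # Python 3.x

# Address taken from the example above; only the page number varies.
BASE = "http://boards.gamers.com/messages/overview.asp"

def page_url(name, page):
    """Return the overview URL for one page (hypothetical helper)."""
    return "%s?name=%s&page=%d" % (BASE, quote(name), page)

def page_urls(name, first, last):
    """Return the URLs for pages first..last, inclusive."""
    return [page_url(name, n) for n in range(first, last + 1)]
```

Calling page_urls("panther_xl", 1, 5) would give five addresses that can be
passed to urlopen one after another.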

So is the "moved" response a consequence of it being an asp page, in which
case I am out of luck, or is there a simple way to achieve what I want anyway?
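One way to see what the server is really sending back might be to make the
request at a lower level and look at the status code and any Location header.
This is a sketch only: it assumes the "moved" page comes from an ordinary
HTTP redirect, which the output above does not confirm, and the names
is_redirect and raw_status are made up for illustration. The httplib module
(http.client in Python 3) does not follow redirects itself, so the raw reply
stays visible.

```python
try:
    import httplib                      # Python 2.x
except ImportError:
    import http.client as httplib       # Python 3.x

# Common HTTP redirect status codes.
REDIRECT_CODES = (301, 302, 303, 307, 308)

def is_redirect(status):
    """True if the status code is one of the common redirect codes."""
    return status in REDIRECT_CODES

def raw_status(host, path):
    """Send one GET without following redirects.

    Returns (status code, value of the Location header or None).
    """
    conn = httplib.HTTPConnection(host)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

# Example call (needs network access):
# status, location = raw_status(
#     "boards.gamers.com",
#     "/messages/overview.asp?name=panther_xl&page=2")
```

If is_redirect(status) turns out to be true, the Location value shows where
the server wants the client to go next, which may hint at why the page source
is not returned directly.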

Thanks in advance for any help you may be able to give.

Regards,
Zugz.






