Simulating WWW button press

Jan Dries jdries at mail.com
Mon Jan 1 20:15:21 EST 2001


Lutz Schroeer wrote:
> I try to write a program which explores a WWW-site. Unfortunately not all
> links on this site are done by a standard "<a href=""></a>" construct but
> with a button:
> [...]
> <INPUT NAME="zipallpage" TYPE="submit" VALUE="Get all files">
> [...]
> How do I get my Python program to simulate the button being pressed to
> retrieve the data which hides behind the button?

What's important is not the <INPUT> tag, but the <FORM> tag, because it
tells you what happens when the button is pressed.
For example:

<FORM METHOD="get"
ACTION="http://www.somehost.com/somepath/somepage.html">
<INPUT type="text" name="field_1">
<INPUT type="radio" checked name="choice_1" value="0">
<INPUT type="radio" name="choice_1" value="1">
<INPUT NAME="zipallpage" TYPE="submit" VALUE="Get all files">
</FORM>

Suppose you enter "Some input" into the text field named "field_1".
When you click the "Get all files" button, the browser does a GET to
the server, in pretty much the same way as it would for a link. In
fact, pressing the button is equivalent to clicking on the following
link (because the submit button has a NAME, its own name and value are
sent along with the other fields):

<a
href="http://www.somehost.com/somepath/somepage.html?field_1=Some+input&choice_1=0&zipallpage=Get+all+files">Get
all files</a>

In other words, the values of the input fields are appended to the URL
as name=value pairs, following a '?' and separated by '&'. Some escaping
is necessary (note the '+' instead of the ' ' in 'Some input'). The
function quote_plus() in the urllib library can be used to take care of
this.
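
As a rough sketch of the escaping step, the snippet below builds the GET URL for the example form. It assumes a current Python 3, where quote_plus() and its companion urlencode() live in urllib.parse; the list name `fields` and the helper `fetch()` are made up for illustration, and the pressed button's own name/value pair is included because a browser sends it too.

```python
from urllib.parse import quote_plus, urlencode
from urllib.request import urlopen

# Name/value pairs in document order. The pressed submit button
# contributes its own pair, just like the other fields.
fields = [
    ("field_1", "Some input"),
    ("choice_1", "0"),
    ("zipallpage", "Get all files"),
]

action = "http://www.somehost.com/somepath/somepage.html"

# quote_plus() escapes one value; urlencode() applies the same escaping
# to every name and value and joins the pairs with '&'.
escaped = quote_plus("Some input")
url = action + "?" + urlencode(fields)


def fetch(target):
    """Hypothetical helper: do the GET and return the raw response bytes."""
    with urlopen(target) as response:
        return response.read()


print(escaped)
print(url)
```

Calling fetch(url) would then retrieve whatever the server returns for the button press; it is left uncalled here because the host in the example is only a placeholder.
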

If METHOD in the FORM element has a value of "post" instead of "get",
things are a bit different. In that case the name=value string
(everything after the '?' above) is not appended to the URL; instead it
is sent to the server as the body of a POST request.
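
A minimal sketch of the POST case, again assuming Python 3's urllib: passing `data` to a Request makes urllib issue a POST with that data as the request body. The function names `build_post_request()` and `submit()` are hypothetical, and the URL is the placeholder from the example form.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_post_request(action, fields):
    """Return a Request that submits the fields as a form POST."""
    # The same escaped name=value string as for GET, but as bytes,
    # because it travels in the request body rather than in the URL.
    body = urlencode(fields).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(action, data=body, headers=headers)


def submit(request):
    """Send the request and return the raw response bytes."""
    with urlopen(request) as response:
        return response.read()


request = build_post_request(
    "http://www.somehost.com/somepath/somepage.html",
    [("field_1", "Some input"), ("choice_1", "0"),
     ("zipallpage", "Get all files")],
)

print(request.get_method())
print(request.full_url)
print(request.data)
```

Note that the URL stays free of any query string; only the body carries the form values. submit(request) is left uncalled because the host is a placeholder.
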

It's not clear from your posting exactly what you want to do, or how
you are trying to do it. In any case, if you want to simulate what a
browser does when a user clicks on a link or on a submit button, the
above info plus the docs for urllib and httplib should get you through
it.
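
For a program that explores a site, the METHOD, ACTION and field names have to be read out of each page first. The sketch below is one possible way to do that with the standard html.parser module in Python 3. The class `FormFinder` and the function `default_fields()` are invented for this example, and the handling is deliberately simplified: it only looks at INPUT elements (no SELECT or TEXTAREA) and does not resolve a relative ACTION against the page URL.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collect METHOD, ACTION and the named INPUT fields of each form."""

    def __init__(self):
        super().__init__()
        self.forms = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "form":
            self.forms.append({
                "method": (attrs.get("method") or "get").lower(),
                "action": attrs.get("action") or "",
                "inputs": [],
            })
        elif tag == "input" and self.forms and attrs.get("name"):
            self.forms[-1]["inputs"].append(attrs)


def default_fields(form, pressed):
    """Return the (name, value) pairs sent when button `pressed` is clicked."""
    pairs = []
    for attrs in form["inputs"]:
        kind = (attrs.get("type") or "text").lower()
        if kind in ("radio", "checkbox") and "checked" not in attrs:
            continue  # unchecked choices are not submitted
        if kind in ("submit", "image", "button", "reset"):
            # Only the submit button that was actually pressed is sent.
            if kind != "submit" or attrs["name"] != pressed:
                continue
        pairs.append((attrs["name"], attrs.get("value") or ""))
    return pairs


page = """<FORM METHOD="get"
ACTION="http://www.somehost.com/somepath/somepage.html">
<INPUT type="text" name="field_1">
<INPUT type="radio" checked name="choice_1" value="0">
<INPUT type="radio" name="choice_1" value="1">
<INPUT NAME="zipallpage" TYPE="submit" VALUE="Get all files">
</FORM>"""

finder = FormFinder()
finder.feed(page)
form = finder.forms[0]
fields = default_fields(form, pressed="zipallpage")

print(form["method"], form["action"])
print(fields)
```

The resulting pairs can be passed to urllib's urlencode() and then sent as a GET or a POST, depending on form["method"].
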

If you state more clearly what exactly you're trying to do, I could
perhaps be more specific as to how you can do it. 

Hope this answers your question,
Jan



