Web automation (was: Pressing a Webpage Button)

qwweeeit at yahoo.it
Thu Nov 3 13:00:07 EST 2005


Hi all,
On 1 June, Elliot Temple wrote:
> How do I make Python press a button on a webpage?  I looked at
> urllib, but I only see how to open a URL with that.  I searched
>  google but no luck.

> For example, google has a button   <input type=submit value="Google
> Search" name=btnG>  how would i make a script to press that button?

I have a similar goal: web automation, which requires
not only pressing web buttons but also filling in
input fields (in your case, the search field of
Google).
At the suggestion of gmi... at gmail.com, I tried
twill (http://www.idyll.org/~t/www-tools/twill.html).

I think it may be the solution:
I have already used it to read data from an ASP page
(http://groups.google.it/group/comp.lang.python/browse_frm/thread/23b50b2c5f2ef377/2e0a593e08d28baf?q=qwweeeit&rnum=2#2e0a593e08d28baf)
Here I try to solve your problem using the interactive mode
(twill can also be called as a module).

Start twill in interactive mode: twill-sh
- load the Google web page:
     go www.google.it (I'm Italian!)
- show the page with the command 'show'
- get the page forms:
     showforms

## __Name______ __Type___ __ID________ __Value__________________
   hl           hidden    (None)       it
   ie           hidden    (None)       ISO-8859-1
   q            text      (None)
   meta         radio     all          [''] of ['', 'lr=lang_it', 'cr=count ...
1  btnG         submit    (None)       Cerca con Google
2  btnI         submit    (None)       Mi sento fortunato
current page: http://www.google.it

The input field is q (Type: text); there are also two buttons
(Type: submit) and a radio button, meta (Type: radio).
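
To make the idea of "listing the form fields" concrete outside twill, here is a minimal sketch using only Python's standard html.parser module. It is not how twill itself works; the SAMPLE_HTML markup, the FormFieldLister class and the list_fields helper are all hypothetical, modelled loosely on the table above rather than on the real Google page.

```python
from html.parser import HTMLParser

# Hypothetical, simplified markup modelled on the showforms table above;
# the real Google page is more complex.
SAMPLE_HTML = """
<form action="/search" name="f">
  <input type="hidden" name="hl" value="it">
  <input type="hidden" name="ie" value="ISO-8859-1">
  <input type="text" name="q" value="">
  <input type="submit" name="btnG" value="Cerca con Google">
  <input type="submit" name="btnI" value="Mi sento fortunato">
</form>
"""


class FormFieldLister(HTMLParser):
    """Collect (name, type, value) for every <input> tag in the page."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            a = dict(attrs)
            # An <input> without a type attribute defaults to a text field.
            self.fields.append(
                (a.get("name"), a.get("type", "text"), a.get("value", ""))
            )


def list_fields(html):
    """Return the input fields found in an HTML string."""
    parser = FormFieldLister()
    parser.feed(html)
    parser.close()
    return parser.fields


fields = list_fields(SAMPLE_HTML)
for name, kind, value in fields:
    print(f"{name:<6} {kind:<8} {value}")
```

The text field and the two submit buttons show up just as in the twill listing; the radio button is left out of the sample to keep it short.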

- fill values:
     fv 0 q twill
  ("twill" being the search string)
- press the search button:
     fv 1  btnG "Cerca con Google"
     submit
  twill answers with the query sent to Google:
http://www.google.it/search?hl=it&ie=ISO-8859-1&q=twill&btnG=Cerca+con+Google&meta=
- save the search result to a file:
     save_html /home/qwweeeit/searching_twill.html
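
The query URL that twill reports is simply the form fields encoded as a GET query string. As a rough sketch (plain standard-library Python, not twill), the same URL can be built with urllib.parse.urlencode. The build_search_url helper is a hypothetical name; the field names and values are copied from the showforms listing, and the /search path is taken from the URL twill printed.

```python
from urllib.parse import urlencode


def build_search_url(query, base="http://www.google.it/search"):
    """Encode the form fields from the showforms listing as a GET query."""
    params = [
        ("hl", "it"),                 # hidden field
        ("ie", "ISO-8859-1"),         # hidden field
        ("q", query),                 # the text field filled with fv
        ("btnG", "Cerca con Google"), # the submit button that was "pressed"
        ("meta", ""),                 # radio button left at its empty default
    ]
    # urlencode turns spaces into '+' and joins the pairs with '&'.
    return base + "?" + urlencode(params)


url = build_search_url("twill")
print(url)
```

Building the URL by hand only reproduces the address; it does not fetch the page or keep cookies and state the way a twill session does.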

Here are the first 10 hits of the search!
Don't ask me to go any further! Perhaps ask the author of twill
(C. Titus Brown)...

With this method you can bypass Google's restrictions, because
you are using a browser (only the query is built automatically).

This addresses the valid observation by Grant Edwards:
> Ah, never mind.  That doesn't work.  Google somehow detects
> you're not sending the query from a browser and bonks you. 
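
Neither message says how Google tells a script from a browser. One common guess, and it is only an assumption here, is the User-Agent header, which urllib sets to a Python-specific value by default. A minimal sketch of sending a browser-like header with the standard library follows; BROWSER_UA is an illustrative value and make_request a hypothetical helper.

```python
import urllib.request

# Illustrative browser-like value (an assumption, not taken from the post).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def make_request(url, user_agent=BROWSER_UA):
    """Build a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = make_request("http://www.google.it/search?q=twill")

# Request stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))

# urllib.request.urlopen(req) would actually send the request; it is left
# out so that this sketch does not touch the network.
```

Whether this alone is enough depends on what the site really checks, which the thread does not establish.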
 
Bye.



