[Tutor] retrieve data from an online database without knowing the url

Yoram Hekma yoram.hekma at aoes.com
Thu Nov 15 11:13:35 CET 2007


On Thu, Nov 15, 2007 at 04:06:31AM +0100, pileux systeme wrote:
> Hello,
> 
> I was wondering whether it was possible to write a program which could directly write some word in a box and click 'search' on a typical online database without using the url. (e.g. is there a way to write a program which would write some keyword, say 'tomato' on google and click 'google search' automatically and copy the page without having to know the url 'http://www.google.com/search?hl=en&sa=X&oi=spell&resnum=0&ct=result&cd=1&q=tomato&spell=1')
> Thank you very much for any help you could provide,
> 
> N.
> 

Yes, this is possible. Have a look at urllib
(http://docs.python.org/lib/module-urllib.html), and on that page see
urlencode. The idea is that you encode the search values the way the
form expects them and send them to the URL the form submits to. If you
look at the form on google.com (view the page source) you see the
following:

<form action="/search" name="f">
    --snip--
    <input name="hl" value="nl" type="hidden">
    <input maxlength="2048" name="q" size="55" title="Google zoeken" value="">
    <input name="btnG" value="Google zoeken" type="submit"><input name="btnI" value="Ik doe een gok" type="submit">
    --snip--
</form>
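
If you would rather not read the page source by hand, the form's action
and field names can be pulled out with a small parser. This is only a
sketch using the standard library's html.parser (Python 3);
FormFieldParser is a made-up name, and the sample string is a shortened
copy of the form above, not a live download.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the action of the first form and its named input fields."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            # Remember where the first form submits to
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and "name" in attrs:
            # Keep each field name with its default value (may be empty)
            self.fields[attrs["name"]] = attrs.get("value", "")

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


# Shortened copy of the form shown above (an illustration, not fetched)
sample = """<form action="/search" name="f">
<input name="hl" value="nl" type="hidden">
<input maxlength="2048" name="q" size="55" title="Google zoeken" value="">
<input name="btnG" value="Google zoeken" type="submit">
</form>"""

parser = FormFieldParser()
parser.feed(sample)
print(parser.action)  # /search
print(parser.fields)  # {'hl': 'nl', 'q': '', 'btnG': 'Google zoeken'}
```

The field you care about here is 'q'; the others are defaults the
browser would send along with it.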

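The form has no method attribute, so a browser submits it with GET: the
fields are appended as a query string to the action URL, /search. For
example, to search for "cars":

import urllib
search_string = 'cars'
# urlencode builds the query string 'q=cars', escaping special characters
encoded_search_string = urllib.urlencode({'q': search_string})
# Append the query string to the form's action URL (a GET request)
response = urllib.urlopen('http://www.google.com/search?' + encoded_search_string)
page = response.read()  # the HTML of the results page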
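
In Python 3, urlencode lives in urllib.parse and urlopen in
urllib.request. Below is a rough sketch of the same idea under that
assumption. build_search_url and fetch are names I made up, and the
User-Agent header is a guess at what a site might want, since some
sites may refuse the library's default one.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_search_url(search_string, base="http://www.google.com/search"):
    """Return the URL a browser would request when the form is submitted."""
    # The fields of a GET form end up in the query string of the action URL
    return base + "?" + urlencode({"q": search_string})


def fetch(url):
    """Download the page at url and return its raw bytes."""
    # Assumption: a browser-like User-Agent; not required by urllib itself
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request) as response:
        return response.read()


print(build_search_url("red tomato"))
# http://www.google.com/search?q=red+tomato
```

Calling fetch(build_search_url('tomato')) would then download the
result page. Whether the site answers a scripted request is up to the
site.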

Good luck!
-- 
Yoram Hekma
Unix Systems Administrator
CICT Department
AOES Netherlands B.V.
Haagse Schouwweg 6G
2332 KG Leiden, The Netherlands
Phone:  +31 (0)71 5795588
Fax:    +31 (0)71 5721277
e-mail: yoram.hekma at aoes.com
http://www.aoes.com
