urllib, can't seem to get form post right

Jon Clements joncle at googlemail.com
Thu Sep 24 17:46:30 EDT 2009


On 24 Sep, 22:18, "Adam W." <awasile... at gmail.com> wrote:
> I'm trying to scrape some historical data from NOAA's website, but I
> can't seem to feed it the right form values to get the data out of
> it.  Here's the code:
>
> import urllib
> import urllib2
>
> ## The source page: http://www.erh.noaa.gov/bgm/climate/bgm.shtml
> url = 'http://www.erh.noaa.gov/bgm/climate/pick.php'
> values = {'month' : 'July',
>           'year' : '1988'}
>
> user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
> headers = { 'User-Agent' : user_agent }
>
> data = urllib.urlencode(values)
> req = urllib2.Request(url, data, headers)
> response = urllib2.urlopen(req)
> the_page = response.read()
> print the_page

Hint:

   <select name="month">
     <option value="/jan">January</option>
     <option value="/feb">February</option>
     <option value="/mar">March</option>
     <option value="/apr">April</option>
     <option value="/may">May</option>
     <option value="/jun">June</option>
     <option value="/jul">July</option>
     <option value="/aug">August</option>
     <option value="/sep">September</option>
     <option value="/oct">October</option>
     <option value="/nov">November</option>
     <option value="/dec">December</option>
   </select>
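
That is, a browser submits each option's value attribute rather than its visible label, so the form would presumably want '/jul' where the posted code sends 'July'. A rough sketch of building the POST body that way follows. The helper name build_form_data is made up for illustration, and sending the year as plain digits is an assumption, since only the month select is shown here; check the year select's option values on the source page as well.

```python
try:
    from urllib import urlencode          # Python 2, as in the original post
except ImportError:
    from urllib.parse import urlencode    # Python 3

# Visible label -> submitted value, copied from the <select name="month"> above.
MONTH_VALUES = {
    'January': '/jan', 'February': '/feb', 'March': '/mar',
    'April': '/apr', 'May': '/may', 'June': '/jun',
    'July': '/jul', 'August': '/aug', 'September': '/sep',
    'October': '/oct', 'November': '/nov', 'December': '/dec',
}

def build_form_data(month_label, year):
    """Hypothetical helper: encode the form body using option values, not labels."""
    values = [
        ('month', MONTH_VALUES[month_label]),
        ('year', str(year)),  # assumption: the year field takes plain digits
    ]
    return urlencode(values)

data = build_form_data('July', 1988)
print(data)  # month=%2Fjul&year=1988
```

The resulting string can be passed as the data argument to urllib2.Request exactly as in the original code. On Python 3 it would need to be encoded to bytes first, e.g. data.encode('ascii').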

Jon.


