[Tutor] Retrieving data from a web site

Peter Otten __peter__ at web.de
Sun May 19 10:05:57 CEST 2013


Phil wrote:

> My appetite having been whetted I'm now stymied because of an Ubuntu
> dependency problem during the installation of urllib3. This is listed as
> a bug. Has anyone overcome this problem?
> 
> Perhaps there's another library that I can use to download data from a
> web page?

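You mean you are using Python 3? Note that urllib3 is a separate
third-party package, not the Python 3 version of urllib2. In Python 3 the
replacement for urllib2 lives in the standard library: urllib.request,
plus urllib.error and urllib.parse. There is a tool called 2to3 that can
help you with the transition.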
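
If the script has to keep running under both Python 2 and Python 3 instead
of being converted once, a guarded import is a common alternative to 2to3.
This is only a sketch of that idea, not something the conversion below
relies on:

```python
# Pick whichever urlopen the running interpreter provides.
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

# The rest of the script can then call urlopen(url) without caring
# which standard library module it came from.
print(urlopen.__module__)
```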

The original Python 2 code:

$ cat parse.py
import urllib2
import json

url = "http://*********/goldencasket"
s = urllib2.urlopen(url).read()

s = s.partition("latestResults_productResults")[2].lstrip(" =")
s = s.partition(";")[0]
data = json.loads(s)
lotto = data["GoldLottoSaturday"]
print lotto["drawDayDateNumber"]
print map(int, lotto["primaryNumbers"])
print map(int, lotto["secondaryNumbers"])
$ python parse.py 
Sat 18/May/13, Draw 3321
[14, 31, 16, 25, 6, 3]
[9, 35]

Now let's apply 2to3 (I'm using the version that comes with python3.3).
The -w option tells the script to overwrite the original source:

$ 2to3-3.3 parse.py -w
[noisy output omitted]

The script now looks like this:

$ cat parse.py
import urllib.request, urllib.error, urllib.parse
import json

url = "http://*********/goldencasket"
s = urllib.request.urlopen(url).read()

s = s.partition("latestResults_productResults")[2].lstrip(" =")
s = s.partition(";")[0]
data = json.loads(s)
lotto = data["GoldLottoSaturday"]
print(lotto["drawDayDateNumber"])
print(list(map(int, lotto["primaryNumbers"])))
print(list(map(int, lotto["secondaryNumbers"])))
$ python3.3 parse.py
Traceback (most recent call last):
  File "parse.py", line 7, in <module>
    s = s.partition("latestResults_productResults")[2].lstrip(" =")
TypeError: expected bytes, bytearray or buffer compatible object
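
The same error can be reproduced without network access. The following is
a minimal sketch using a made-up payload (the real page content is not
shown here); it also demonstrates the two usual ways out, decoding to str
or staying in bytes with a bytes separator:

```python
# Made-up stand-in for what urlopen(url).read() returns in Python 3: bytes.
raw = b'var latestResults_productResults = {"GoldLottoSaturday": {}};'

try:
    raw.partition("latestResults_productResults")  # str separator on bytes
except TypeError as exc:
    print("bytes.partition() with a str separator fails:", exc)

# Option 1: decode first, then work with str everywhere.
text = raw.decode()  # decode() defaults to UTF-8
after_marker = text.partition("latestResults_productResults")[2]

# Option 2: stay in bytes and pass a bytes separator.
after_marker_bytes = raw.partition(b"latestResults_productResults")[2]

print(repr(after_marker))
print(repr(after_marker_bytes))
```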

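In Python 3 read() returns bytes, so the str argument to partition() is
rejected. After manually changing the line

s = urllib.request.urlopen(url).read()

to

s = urllib.request.urlopen(url).read().decode()

(decode() defaults to UTF-8) the script works again:
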
$ python3.3 parse.py
Sat 18/May/13, Draw 3321
[14, 31, 16, 25, 6, 3]
[9, 35]
$ 
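
A bare decode() assumes the page is UTF-8. If that is in doubt, the
charset the server announces can be used instead. Here is a hedged sketch
of that variation: fetch_text() and extract_results() are names I made up
for illustration, and the sample string is hand-made to resemble the
output above, not copied from the real page.

```python
import json
import urllib.request

MARKER = "latestResults_productResults"


def fetch_text(url, fallback="utf-8"):
    """Download url and decode it with the charset from the Content-Type
    header, using `fallback` when the server names none."""
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or fallback
        return response.read().decode(charset)


def extract_results(text, marker=MARKER):
    """Same partition-based extraction as parse.py, on decoded text."""
    s = text.partition(marker)[2].lstrip(" =")
    s = s.partition(";")[0]  # breaks if the JSON itself contains ";"
    return json.loads(s)


# Hand-made sample standing in for the page source (no network needed).
sample = (
    'var latestResults_productResults = {"GoldLottoSaturday": '
    '{"drawDayDateNumber": "Sat 18/May/13, Draw 3321", '
    '"primaryNumbers": ["14", "31", "16", "25", "6", "3"], '
    '"secondaryNumbers": ["9", "35"]}};'
)
lotto = extract_results(sample)["GoldLottoSaturday"]
primary = [int(n) for n in lotto["primaryNumbers"]]
print(lotto["drawDayDateNumber"], primary)
```

With the real page you would call extract_results(fetch_text(url)) in
place of the sample string.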


