UnicodeDecodeError when fetching a web page

Philip Semanchuk philip at semanchuk.com
Tue May 25 15:39:03 EDT 2010


On May 25, 2010, at 3:13 PM, Barry wrote:

> Hi,
>
> The code below is giving me the error:
>
> Traceback (most recent call last):
>  File "C:\Users\Administratör\Desktop\test.py", line 4, in <module>
> UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position 1:
> unexpected code byte
>
>
> What am I doing wrong?
>
> Thanks,
>
> Barry
>
> request = urllib.request.Request(
>     url='http://en.wiktionary.org/wiki/baby',
>     headers={'User-Agent': 'Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11'})
>
> response = urllib.request.urlopen(request)
> html = response.read().decode('utf-8')


Well, for starters you're assuming that the response content is in
UTF-8. You need to examine the Content-Type header (its charset
parameter) to see what the encoding actually is. If it's not UTF-8,
there's your problem.
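
Here's a rough sketch of that check, using only the Python 3 standard
library. The helper name decode_body and the UTF-8 fallback are my own
assumptions, not part of urllib. One more thing worth checking: a byte
0x8b at position 1 matches the second byte of the gzip magic number
(0x1f 0x8b), so the body may have been sent compressed. The sketch
therefore also looks at Content-Encoding, and as a guess falls back to
sniffing those two bytes.

```python
import gzip
from email.message import Message


def decode_body(raw, content_type=None, content_encoding=None, default="utf-8"):
    """Turn raw response bytes into text (hypothetical helper).

    content_type and content_encoding are the values of the matching
    HTTP response headers, or None if the header was absent.
    """
    # Undo gzip compression if the server says so, or (an assumption)
    # if the body starts with the gzip magic number 0x1f 0x8b.
    if (content_encoding or "").lower() == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)

    # Let the email package parse "text/html; charset=..." for us.
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or default

    return raw.decode(charset)
```

With the code from the original post it might be used like this:

    response = urllib.request.urlopen(request)
    html = decode_body(response.read(),
                       response.headers.get('Content-Type'),
                       response.headers.get('Content-Encoding'))

If the charset is missing from the header, the fallback is only a guess,
and decode() can still raise if that guess is wrong.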


HTH
P



