UnicodeDecodeError having fetch web page

Rob Williscroft rtw at rtw.me.uk
Wed May 26 14:10:15 EDT 2010


Kushal Kumaran wrote in news:1274889564.2339.16.camel at nitrogen in
gmane.comp.python.general: 

> On Tue, 2010-05-25 at 20:12 +0000, Rob Williscroft wrote:
>> Barry wrote in news:83dc485a-5a20-403b-99ee-c8c627bdbab3
>> @m21g2000vbr.googlegroups.com in gmane.comp.python.general:
>> 
>> > Hi,
>> > 
>> > The code below is giving me the error:
>> > 
>> > Traceback (most recent call last):
>> >   File "C:\Users\Administratör\Desktop\test.py", line 4, in
>> >   <module> 
>> > UnicodeDecodeError: 'utf8' codec can't decode byte 0x8b in position
>> > 1: unexpected code byte
>> > 
>> > 
>> > What am I doing wrong?
>> 
>> It may not be you: en.wiktionary.org is sending gzip-encoded
>> content back, and it seems to do this even if you set
>> the Accept header as in:
>> 
>> request.add_header( "Accept", "text/html" )
>> 
>> But maybe I'm not doing it correctly.
>> 
> You need the Accept-Encoding: identity header.
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html

Thanks. Following this, I changed the line to:

request.add_header( "Accept-Encoding", "identity" )

but it made no difference: en.wiktionary.org still sent back
a gzip-encoded response.
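
Since the server sends gzip regardless of the request headers, one
possible workaround is to decompress the body before decoding it. The
byte 0x8b at position 1 in the traceback is the second byte of the gzip
magic number (0x1f 0x8b), which is consistent with that diagnosis. What
follows is only a minimal sketch, assuming Python 3.2 or later (for
gzip.decompress) and a UTF-8 page; decode_body and fetch_text are
hypothetical helper names, not part of the original script:

```python
import gzip
import urllib.request


def decode_body(raw, content_encoding, charset="utf-8"):
    """Return text for a response body, un-gzipping it first if needed."""
    # Trust the Content-Encoding header when present, but also check for
    # the gzip magic bytes (0x1f 0x8b) in case the header is missing.
    if (content_encoding or "").lower() == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    return raw.decode(charset)


def fetch_text(url):
    """Fetch url and return its body as text (charset assumed to be UTF-8)."""
    request = urllib.request.Request(url)
    # Still ask for an uncompressed body; some servers ignore this.
    request.add_header("Accept-Encoding", "identity")
    with urllib.request.urlopen(request) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

Calling fetch_text() with the page's URL should then give decoded text
whether or not the server honours the Accept-Encoding header.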

Rob.



