urllib accept-language doesn't have any effect

Philip Semanchuk philip at semanchuk.com
Thu Oct 16 15:55:45 CEST 2008


On Oct 16, 2008, at 6:50 AM, Martin Bachwerk wrote:

> Hmm, thanks for the ideas,
>
> I've checked the requests in Firefox one more time after deleting  
> all the cookies and both google.com and gizmodo.com do indeed  
> forward me to the German site without caring about the browser  
> settings.
>
> wget shows me that the server does a 302 redirect straight away..  
> soo..

I'm not sure what you mean by this. In my experiment with wget, Google  
respects the Accept-Language header. In other words, this returns a  
Swedish page even though I'm executing it from a U.S. IP address:

wget  "--header=Accept-Language: sv" http://www.google.com/


I see the same behavior from urllib2, although my code is slightly  
different from yours. Here's my code. If I use "sv" in the header I  
get Swedish, "pl" gives me Polish, etc.  I get the same result when I  
add your Mozilla user-agent string.

----------------------------------------
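import urllib2  # Python 2 standard library

# Ask for Swedish; change "sv" to "pl", "en", etc. to test other languages.
headers = { "Accept-Language" : "sv" }

# Passing None as the data argument makes this a GET request.
req = urllib2.Request("http://www.google.com/", None, headers)
f = urllib2.urlopen(req)
content = f.read()
f.close()

print content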
----------------------------------------


Do you get different results with this same code in Germany?
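
If the results do differ, it helps to look at the very first response rather than whatever urllib ends up with after following redirects. Below is a rough sketch of that check for Python 3, where urllib2 was folded into urllib.request. The names NoRedirectHandler, build_request and first_response are my own, not part of the library, and the 10 second timeout is an arbitrary choice.

```python
import urllib.error
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so that a 301/302 surfaces as an HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells urllib that no follow-up request should be made.
        return None


def build_request(url, language):
    """Build a GET request that carries the given Accept-Language value."""
    return urllib.request.Request(url, headers={"Accept-Language": language})


def first_response(url, language):
    """Return (status code, Location header or None) of the first response."""
    opener = urllib.request.build_opener(NoRedirectHandler)
    try:
        response = opener.open(build_request(url, language), timeout=10)
    except urllib.error.HTTPError as err:
        # Unfollowed redirects (and other error statuses) arrive here.
        return err.code, err.headers.get("Location")
    with response:
        return response.status, response.headers.get("Location")
```

Calling first_response("http://www.google.com/", "sv") from different countries should show whether the server answers directly or sends a redirect, and where that redirect points.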

Cheers
Philip



>
>>
>> On Oct 15, 2008, at 9:50 AM, Martin Bachwerk wrote:
>>
>>> Hello,
>>>
>>> I'm trying to load a couple of pages using the urllib2 module. The  
>>> problem is that I live in Germany and some sites seem to look at  
>>> the IP of the client and forward him to a localized page.. Here's  
>>> an example of the code, how I want to access google.com main  
>>> english page, but get German instead. (For those of you who live  
>>> in US, you will probably get correct results.. try emulating with  
>>> 'fr' in accepted languages or something)
>>>
>>> opener = urllib2.build_opener()
>>> opener.addheaders = [('Host', 'www.google.com'),
>>>     ('Accept-Language', 'en-gb,en;q=0.5'),
>>>     ('User-agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; '
>>>                    'rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1')]
>>> webfile = opener.open(url)
>>
>> Martin,
>> It looks to me like what you're sending is correct. Debugging  
>> suggestions --
>>
>> - Set up a Web server on 127.0.0.1 and see what that server  
>> receives when your Python code connects to it. Maybe you're not  
>> sending quite what you think.
>> - Try emulating your Python code with wget or a similar command  
>> line tool that lets you set headers.
>> - Sniff the conversation you're having with google using Wireshark.  
>> Maybe you're getting redirected by the remote server.
>>
>> Good luck
>> Philip
>>
>
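
The first debugging suggestion quoted above (a Web server on 127.0.0.1 that shows what the client really sends) might look roughly like this on Python 3. It is only a sketch: HeaderEchoHandler and headers_seen_by_server are names I made up, the server handles GET only, and proxies are switched off on the assumption that the request should go straight to the local machine.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeaderEchoHandler(BaseHTTPRequestHandler):
    """Reply to any GET with the request headers as the server received them."""

    def do_GET(self):
        lines = ["%s: %s\n" % (name, value) for name, value in self.headers.items()]
        body = "".join(lines).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default per-request logging on stderr


def headers_seen_by_server(extra_headers):
    """Send one request with opener.addheaders set and return the echoed headers."""
    server = HTTPServer(("127.0.0.1", 0), HeaderEchoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # An empty ProxyHandler keeps the request from being routed via a proxy.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        opener.addheaders = list(extra_headers)
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        with opener.open(url, timeout=5) as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
```

Printing headers_seen_by_server([('Accept-Language', 'en-gb,en;q=0.5')]) lists every header that reached the server, which makes it easy to spot one that was dropped or mangled on the way out.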



