Unicode and UrlEncode!
John
johng2001 at rediffmail.com
Wed Mar 17 18:12:55 EST 2004
I am trying to translate some French via Google.
Here is my code.
# -*- coding: latin-1 -*-
import httplib, urllib, re

def translate_french_to_english(french):
    params = urllib.urlencode( {'text': french, 'langpair': 'fr|en',
                                'hl':'en', 'ie':'UTF8', 'oe':'UTF8'} )
    print params
    headers = {"Content-type": "application/x-www-form-urlencoded",
               "Accept": "text/plain"}
    Cn = httplib.HTTPConnection("translate.google.com")
    Cn.request("POST", "/translate_t", params, headers)
    response = Cn.getresponse()
    data = response.read()
    Cn.close()
    match = re.compile('<textarea name=q .*?>(.*?)</textarea>').search(data)
    translated = match.groups()[0]
    return translated

print translate_french_to_english('sÈdimentation')
The problem is that my params are being encoded to
langpair=fr%7Cen&text=s%C8dimentation&oe=UTF8&ie=UTF8&hl=en
-- s%C8dimentation --
when it should encode to
text=s%C3%88dimentation&langpair=fr%7Cen&hl=en&ie=UTF8&oe=UTF8
-- s%C3%88dimentation --
What should I do?
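For reference, the two outputs above differ only in which byte encoding the text has when it reaches urlencode: %C8 is the Latin-1 byte for È, and %C3%88 is its UTF-8 byte pair. The following is a minimal sketch of that difference, not a tested fix for the script above. It assumes Python 3 and urllib.parse rather than the Python 2 urllib used in the post, and the variable names are made up for illustration.

```python
from urllib.parse import urlencode

# The test word from the post, written with an escape so the source
# file's own encoding does not matter: 's\u00c8dimentation' is 'sÈdimentation'.
french = 's\u00c8dimentation'

# Latin-1 bytes: È is the single byte 0xC8, which reproduces the
# unwanted form shown in the post.
latin1_params = urlencode({'text': french.encode('latin-1')})

# UTF-8 bytes: È becomes the two bytes 0xC3 0x88, which is the form
# the post says is expected when ie=UTF8 is sent.
utf8_params = urlencode({'text': french.encode('utf-8')})

print(latin1_params)
print(utf8_params)
```

In the Python 2 of the original script, the analogous step would presumably be to convert the Latin-1 byte string before building params, for example french.decode('latin-1').encode('utf-8'), assuming the source file really is saved as Latin-1 as its coding line declares.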