UnicodeEncodeError - opening encoded URLs

Matt Nordhoff mnordhoff at mattnordhoff.com
Fri Mar 27 18:25:56 EDT 2009


D4rko wrote:
> Hi!
> 
> I have a problem with the urllib2 open() function. My application
> receives the following request; as I can see in the development
> server console, it is properly encoded:
> 
> [27/Mar/2009 22:22:29] "GET /[blahblah]/Europa_%C5%9Arodkowa/5 HTTP/1.1" 500 54572
> 
> Then it uses this request parameter as the name variable to build a
> Wikipedia link, and tries to access it with the following code:
> 
> 	url = (u'http://pl.wikipedia.org/w/index.php?title=' + name +
> 	       '&printable=yes')
> 	opener = urllib2.build_opener()
> 	opener.addheaders = [('User-agent', 'Mozilla/5.0')]
> 	wikipage = opener.open(url)
> 
> Unfortunately, the last line fails with the exception:
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u015a' in
> position 30: ordinal not in range(128). Using urlencode(url) results
> in TypeError "not a valid non-string sequence or mapping object", and
> quote(url) fails because of KeyError u'\u015a'. How can I properly
> parse this request to make it work (i.e. access
> http://pl.wikipedia.org/wiki/Europa_%C5%9Arodkowa)?

What if you just used a regular byte string for the URL?

>>> url = ('http://pl.wikipedia.org/w/index.php?title=' + name +
...        '&printable=yes')

(Unless "name" is a unicode object as well. In that case the
concatenation still gives you a unicode URL, and the same error.)
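
If "name" does turn out to be a unicode object, one option is to encode
it to UTF-8 and percent-quote it before building the URL. Here is a
minimal sketch, assuming the title arrives already decoded (as the
u'\u015a' in the error message suggests). build_wiki_url is a made-up
helper name, and the try/except import is only there so the same
snippet also runs on Python 3:

```python
try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3

BASE = 'http://pl.wikipedia.org/w/index.php?title='


def build_wiki_url(name):
    """Return an ASCII-only URL for the printable version of a page.

    Hypothetical helper: 'name' is the page title, either as text
    (unicode) or as UTF-8 encoded bytes.
    """
    # type(u'') is 'unicode' on Python 2 and 'str' on Python 3.
    if isinstance(name, type(u'')):
        name = name.encode('utf-8')
    # quote() percent-encodes every byte outside the safe ASCII set.
    return BASE + quote(name) + '&printable=yes'
```

The result contains only ASCII characters, so passing it to
opener.open() should not trigger the 'ascii' codec error. If the title
were still percent-encoded when it reached this function, quote() would
encode the '%' signs a second time, so this only fits the decoded case.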

(Nice user-agent, BTW. :-P )