[Moin-user] Problem with encoding URLS

Paul Boddie paul at boddie.org.uk
Mon Nov 30 10:49:39 EST 2015


On Monday 30. November 2015 15.33.49 ml at mherrn.de wrote:
> Hi,
> 
> I have just moved my wiki to another server. Unfortunately, URLs with
> special characters (for example german Umlauts) don't work anymore.
> 
> I have a page with the Name "Töst". It contains the german Umlaut "ö".
> The encoded url looks like "https://mywiki.com/T%C3%B6st"

So, that's the page name URL-encoded with the original character values being 
represented using UTF-8. In short...

Töst
-> 84 (T), 195, 182, 115 (s), 116 (t) (decimal values)
   54 (T), C3,  B6,  73 (s), 74 (t) (hex values)
-> T%C3%B6st
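
The same steps can be reproduced at a Python prompt as a quick check. This is
only a minimal sketch using Python 3's urllib.parse; on Python 2.7, as in the
traceback below, urllib.quote applied to the UTF-8 encoded byte string would be
the rough equivalent. The variable names are purely illustrative.

```python
from urllib.parse import quote, unquote

# The page name from the report, with the umlaut written as an escape
# so that this snippet does not depend on the source file's encoding.
page_name = "T\u00f6st"

# Step 1: represent the characters as UTF-8 bytes.
utf8_bytes = page_name.encode("utf-8")
decimal_values = list(utf8_bytes)
hex_values = ["%02X" % b for b in utf8_bytes]

# Step 2: percent-encode every byte that is not safe in a URL.
# quote() uses UTF-8 by default for str input.
encoded = quote(page_name)

# Decoding reverses both steps.
decoded = unquote(encoded)

print(decimal_values)
print(hex_values)
print(encoded)
```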

Such encoding is perfectly reasonable, since the standards never got round to 
specifying non-ASCII characters in URLs directly, or at least not properly, 
and percent-encoded UTF-8 is the usual convention.

> When trying to access this page the apache log tells me:
> 
> -----/-----
> mod_wsgi (pid=31988): Exception occurred processing WSGI script
> '/home/user/www/moin/moin.wsgi'.
> Traceback (most recent call last):
>    File "/usr/lib/python2.7/dist-packages/werkzeug/wsgi.py", line 567, in
> __call__
>      cleaned_path = cleaned_path.encode(sys.getfilesystemencoding())
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in
> position 2: ordinal not in range(128)
> -----/-----
> 
> What is going on here? Do I have a misconfiguration in moinmoin? Why is it
> trying to encode it in ASCII?

Here, I imagine that your locale setting isn't helping. What do you get at the 
Python prompt if you call sys.getfilesystemencoding()?
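
Something along these lines would show the relevant values and reproduce the
failure. It is a sketch only, written for Python 3, and it assumes that the
path being encoded was "/Töst" with a leading slash, which would explain why
the traceback reports position 2. Note that the Apache/mod_wsgi process may run
with a different locale from your login shell, so the value seen at an
interactive prompt is not necessarily the one the wiki sees.

```python
import locale
import sys

# The encoding Python uses for file names; werkzeug used this value
# in the traceback above.
print(sys.getfilesystemencoding())

# The encoding derived from the current locale settings.
print(locale.getpreferredencoding(False))

# Assumed request path: the page name with a leading slash.
path = "/T\u00f6st"

# Encoding with 'ascii' is what happens when the filesystem encoding
# is reported as ASCII, and it fails on the umlaut.
error = None
try:
    path.encode("ascii")
except UnicodeEncodeError as exc:
    error = exc
    print(exc)
```

If the first value printed is "ascii" or "ANSI_X3.4-1968" rather than "utf-8",
the locale of the process is the likely culprit.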

You may need to look at your system's default locale and/or the user's locale, 
I guess. I see someone else has experienced this recently, too:

https://moinmo.in/MoinMoinBugs/1.9.8NonAsciiURL-UnicodeEncodeError

Paul



