URL Character Decoding
Kirk McDonald
mooquack at suad.org
Sun Jan 29 22:38:34 EST 2006
Kirk McDonald wrote:
> If you have a link such as, e.g.:
>
> <a href="index.py?title=Main Menu">Main menu!</a>
>
> The space will be translated to the character code '%20' when you later
> retrieve the GET data. Not knowing if there was a library function that
> would convert these back to their actual characters, I've written the
> following:
>
> import re
>
> def sub_func(m):
>     return chr(int(m.group()[1:], 16))
>
> def parse_title(title):
>     p = re.compile(r'%[0-9][0-9]')
>     return re.sub(p, sub_func, title)
>
> (I know I could probably use a lambda function instead of sub_func, but
> I come to Python via C++ and am still not entirely used to them. This is
> clearer to me, at least.)
>
> I guess what I'm asking is: Is there a library function (in Python or
> mod_python) that knows how to do this? Or, failing that, is there a
> different regex I could use to get rid of the substitution function?
>
> -Kirk McDonald
Actually, I just noticed this doesn't really work at all. The URL
character codes are in hex, so not only does the regex not match what it
should, but sub_func fails miserably. See why I wanted a library function?
-Kirk McDonald
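
For reference, a minimal sketch of both routes asked about above, assuming a current Python 3 interpreter. The standard library does provide a decoder: urllib.parse.unquote (in the Python 2 releases of the time, the same function was exposed as urllib.unquote). The hand-rolled variant below keeps the parse_title name from the post but matches two hex digits instead of two decimal digits; the _PERCENT_ESCAPE name and the sample strings are illustrative, not from the original message.

```python
import re
from urllib.parse import unquote

# Library route: unquote() decodes %XX escapes (as UTF-8 by default).
print(unquote("Main%20Menu"))

# Hand-rolled route: the escape is two *hex* digits, so the character
# class has to include A-F / a-f as well as 0-9.
_PERCENT_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")

def parse_title(title):
    """Replace each %XX escape with the character whose code is 0xXX.

    Assumption: every escape stands for a single character; multi-byte
    UTF-8 sequences are not reassembled here, unlike unquote().
    """
    return _PERCENT_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), title)

print(parse_title("Main%20Menu"))
print(parse_title("a%2Fb"))
```

The library call is probably the better choice in practice, since it also handles multi-byte escapes. Note that unquote leaves '+' untouched; form-encoded data that uses '+' for a space would need urllib.parse.unquote_plus instead.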