URL Character Decoding

Kirk McDonald mooquack at suad.org
Sun Jan 29 22:27:07 EST 2006


If you have a link such as:

<a href="index.py?title=Main Menu">Main menu!</a>

the space will be translated to the escape sequence '%20' when you later 
retrieve the GET data. Not knowing whether there was a library function 
that would convert such sequences back to their actual characters, I've 
written the following:

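import re

def sub_func(m):
    # Drop the leading '%' and turn the two hex digits into a character.
    return chr(int(m.group()[1:], 16))

def parse_title(title):
    # Escapes are '%' plus two hex digits (e.g. %20, %2F), not only 0-9.
    p = re.compile(r'%[0-9A-Fa-f]{2}')
    return p.sub(sub_func, title)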

(I know I could probably use a lambda function instead of sub_func, but 
I come to Python via C++ and am still not entirely used to them. This is 
clearer to me, at least.)
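
For reference, the lambda version would look roughly like the sketch 
below. It assumes every escape is a '%' followed by exactly two hex 
digits; parse_title_lambda is a name made up for this example.

```python
import re

# Hypothetical name, used only for this sketch.
def parse_title_lambda(title):
    # The lambda does the same job as sub_func: strip the '%' and
    # convert the two hex digits that follow into a character.
    return re.sub(r'%[0-9A-Fa-f]{2}',
                  lambda m: chr(int(m.group()[1:], 16)),
                  title)

print(parse_title_lambda('Main%20Menu'))  # Main Menu
```

A lone '%' that is not followed by two hex digits is left untouched.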

I guess what I'm asking is: Is there a library function (in Python or 
mod_python) that knows how to do this? Or, failing that, is there a 
different regex I could use to get rid of the substitution function?
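
For what it is worth, the standard library does appear to cover this 
case. A minimal sketch, assuming Python 3, where the function lives at 
urllib.parse.unquote (in Python 2 it was urllib.unquote); decode_title 
is a made-up wrapper name:

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical wrapper, named only for this sketch.
def decode_title(title):
    # unquote() replaces %xx escapes with the characters they stand for.
    return unquote(title)

print(decode_title('Main%20Menu'))  # Main Menu

# unquote_plus() additionally turns '+' into a space, which is how
# HTML form submissions usually encode spaces.
print(unquote_plus('Main+Menu'))    # Main Menu
```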

-Kirk McDonald


