URLs and ampersands
Wojtek Walczak
gminick at nie.ma.takiego.adresu.w.sieci.pl
Tue Aug 5 07:01:12 EDT 2008
On 05 Aug 2008 09:59:20 GMT, Steven D'Aprano wrote:
> I didn't say urlretrieve was escaping the URL. I actually think the
> URLs are pre-escaped when I scrape them from an HTML file. I have searched
> for, but been unable to find, standard library functions that escape or
> unescape URLs. Are there any such functions?
$ cd /usr/lib/python2.5/
$ grep "\&amp\;" *.py
BaseHTTPServer.py: return html.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
cgi.py: s = s.replace("&", "&amp;") # Must be done first!
cgitb.py: doc = doc.replace('&', '&amp;').replace('<', '&lt;')
difflib.py: text=text.replace("&","&amp;").replace(">","&gt;").replace("<","&lt;")
HTMLParser.py: s = s.replace("&amp;", "&") # Must be last
pydoc.py: return replace(text, '&', '&amp;', '<', '&lt;', '>', '&gt;')
xmlrpclib.py: s = replace(s, "&", "&amp;")
So it could be BaseHTTPServer, cgi, cgitb, difflib, HTMLParser,
pydoc or xmlrpclib. Do you use any of these? Or maybe some other
external module?
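
For what it's worth, here is a minimal sketch of what the escaping and unescaping found by that grep amount to. It assumes the scraped URLs only carry HTML entities such as &amp;. The example URL and the variable names are made up for illustration, and the import fallback is just my guess at a convenient way to cover both old and newer Python versions:

```python
# Hypothetical sketch: reverse and re-apply HTML entity escaping on a URL.
try:
    # Python 3.4+: both helpers live in the html module
    from html import escape, unescape
except ImportError:
    # Python 2: cgi.escape escapes, HTMLParser().unescape reverses it
    from cgi import escape
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape

# Made-up URL as it might appear inside an href attribute in an HTML file
scraped = "http://example.com/search?q=python&amp;page=2"

plain = unescape(scraped)  # turn &amp; back into a bare &
again = escape(plain)      # turn & into &amp; again for embedding in HTML

print(plain)
print(again)
```

If the scraped value is going to urlretrieve, the unescaped form (plain) is presumably the one to pass; the escaped form only makes sense inside HTML.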
--
Regards,
Wojtek Walczak,
http://www.stud.umk.pl/~wojtekwa/