URLs and ampersands

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Tue Aug 5 04:16:46 CEST 2008


On Mon, 04 Aug 2008 20:43:45 -0300, Steven D'Aprano
<steve at REMOVE-THIS-cybersource.com.au> wrote:

> I'm using urllib.urlretrieve() to download HTML pages, and I've hit a
> snag with URLs containing ampersands:
>
> http://www.example.com/parrot.php?x=1&y=2
>
> Somewhere in the process, urls like the above are escaped to:
>
> http://www.example.com/parrot.php?x=1&amp;y=2
>
> which naturally fails to exist.
>
> I could just do a string replace, but is there a "right" way to escape
> and unescape URLs? I've looked through the standard lib, but I can't find
> anything helpful.

This works fine for me:

py> import urllib
py> fn = urllib.urlretrieve("http://c7.amazingcounters.com/counter.php?i=1516903&c=4551022")[0]
py> open(fn,"rb").read()
'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00...

So it's not urlretrieve that is escaping the URL, but something else in
your code...
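
One guess, since the rest of the code isn't shown: if the URLs are being pulled out of HTML source, the "&amp;" is just how "&" is written inside an HTML attribute, and it has to be decoded before the URL is handed to urlretrieve. A minimal sketch under that assumption, written for a current Python 3 (where html.unescape exists) rather than the Python 2 session above. The sample page and the regex are made up for illustration:

```python
import html
import re

# Hypothetical page source: inside HTML attributes, "&" is written as "&amp;".
page = '<a href="http://www.example.com/parrot.php?x=1&amp;y=2">parrot</a>'

# Naive href extraction, for illustration only; a real HTML parser is more robust.
raw_urls = re.findall(r'href="([^"]*)"', page)

# Decode HTML character references to recover the real URL before downloading.
urls = [html.unescape(u) for u in raw_urls]
print(urls[0])
```

This is more general than a plain string replace because it decodes any character reference in the attribute, not only "&amp;".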

-- 
Gabriel Genellina




More information about the Python-list mailing list