urllib interpretation of URL with ".."
Gabriel Genellina
gagsl-py2 at yahoo.com.ar
Tue Jun 26 21:19:25 EDT 2007
On Tue, 26 Jun 2007 17:26:06 -0300, sergio <sergio at sergiomb.no-ip.org>
wrote:
> John Nagle wrote:
>
>> In Python, of course, "urlparse.urlparse", which is
>> the main function used to disassemble a URL, has no idea whether it's
>> being used by a client or a server, so it, reasonably enough, takes
>> option 1.
>
> >>> import urlparse
> >>> base="http://somesite.com/level1/"
> >>> path="../page.html"
> >>> urlparse.urljoin(base,path)
> 'http://somesite.com/page.html'
> >>> base="http://somesite.com/"
> >>> urlparse.urljoin(base,path)
> 'http://somesite.com/../page.html'
>
> For me this is a bug and is very annoying because I can't simply strip ../
> from path because base could have a level.
I'd say it's an annoyance, not a bug. Write your own urljoin function with
your exact desired behavior. Since all "meaningful" .. and . segments should
already have been processed by urljoin, a simple
url = url.replace("/../","/").replace("/./","/") may be enough.
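A minimal sketch of such a wrapper, for illustration only. The name
strict_urljoin is hypothetical, and the imports use the Python 3 spelling
(urllib.parse); on Python 2 the same functions live in the urlparse module.
It filters path segments instead of calling replace() on the whole string,
on the assumption that any . or .. left over after urljoin should simply be
dropped:

```python
# Python 3 spelling; on Python 2 these names come from the urlparse module.
from urllib.parse import urljoin, urlsplit, urlunsplit


def strict_urljoin(base, path):  # hypothetical helper, not part of the stdlib
    """Join like urljoin, then drop any '.' or '..' segments left in the path."""
    scheme, netloc, joined_path, query, fragment = urlsplit(urljoin(base, path))
    # urljoin has already resolved every '..' that had a parent directory to
    # consume, so anything still present would climb above the root: discard it.
    segments = [s for s in joined_path.split("/") if s not in (".", "..")]
    return urlunsplit((scheme, netloc, "/".join(segments), query, fragment))


print(strict_urljoin("http://somesite.com/level1/", "../page.html"))
print(strict_urljoin("http://somesite.com/", "../page.html"))
```

Working on segments leaves the query string and fragment untouched, and it
also copes with consecutive segments such as /../../, where a single
replace("/../","/") pass would leave one /../ behind. One side effect of
this sketch: a path that ends in /. or /.. loses its trailing slash.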
--
Gabriel Genellina