[Python-Dev] Path object design
Steve Holden
steve at holdenweb.com
Fri Nov 3 19:38:21 CET 2006
Andrew Dalke wrote:
> glyph:
>
>>Path manipulation:
>>
>> * This is confusing as heck:
>> >>> os.path.join("hello", "/world")
>> '/world'
>> >>> os.path.join("hello", "slash/world")
>> 'hello/slash/world'
>> >>> os.path.join("hello", "slash//world")
>> 'hello/slash//world'
>> Trying to formulate a general rule for what the arguments to os.path.join
>>are supposed to be is really hard. I can't really figure out what it would
>>be like on a non-POSIX/non-win32 platform.
>
>
> Made trickier by the similar yet different behaviour of urlparse.urljoin.
>
> >>> import urlparse
> >>> urlparse.urljoin("hello", "/world")
> '/world'
> >>> urlparse.urljoin("hello", "slash/world")
> 'slash/world'
> >>> urlparse.urljoin("hello", "slash//world")
> 'slash//world'
> >>>
>
> It does not make sense to me that these should be different.
>
Although the last two smell like bugs, the point of urljoin is to make
an absolute URL from an absolute ("current page") URL and a relative
(link) one. As we see:
>>> from urlparse import urljoin
>>> urljoin("/hello", "slash/world")
'/slash/world'
and
>>> urljoin("http://localhost/hello", "slash/world")
'http://localhost/slash/world'
but
>>> urljoin("http://localhost/hello/", "slash/world")
'http://localhost/hello/slash/world'
>>> urljoin("http://localhost/hello/index.html", "slash/world")
'http://localhost/hello/slash/world'
>>>
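I think we can probably conclude that this is what's supposed to happen.
In the case of urljoin the first argument is interpreted as referencing
an existing resource and the second as a link such as might appear in
that resource. That is why a base without a trailing slash loses its
last segment: "hello" is taken to be the name of the document, and the
link is resolved against the directory that contains it.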
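
The contrast with os.path.join can be put side by side in a short sketch.
It assumes Python 3, where urljoin lives in urllib.parse rather than
urlparse, and it uses posixpath so the path results do not depend on the
platform. The helper name resolve_link is hypothetical and exists only for
illustration.

```python
import posixpath
from urllib.parse import urljoin  # Python 3 location of urlparse.urljoin


def resolve_link(page_url, link):
    """Hypothetical helper: resolve `link` as if it appeared in `page_url`."""
    return urljoin(page_url, link)


# No trailing slash: "hello" is treated as a document, so it is replaced.
print(resolve_link("http://localhost/hello", "slash/world"))
# http://localhost/slash/world

# Trailing slash: "hello/" is treated as a directory, so it is kept.
print(resolve_link("http://localhost/hello/", "slash/world"))
# http://localhost/hello/slash/world

# A document inside the directory gives the same result as the directory.
print(resolve_link("http://localhost/hello/index.html", "slash/world"))
# http://localhost/hello/slash/world

# posixpath.join always treats its first argument as a directory.
print(posixpath.join("hello", "slash/world"))
# hello/slash/world

# Both functions discard the base path when the second argument is absolute.
print(resolve_link("http://localhost/hello/", "/world"))
# http://localhost/world
print(posixpath.join("hello", "/world"))
# /world
```

Read this way, the two functions agree on absolute second arguments and
differ only in whether the last segment of the first argument counts as a
directory.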
regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden