[ python-Bugs-649974 ] urllib.url2pathname('file://host/')...

SourceForge.net noreply at sourceforge.net
Tue Dec 28 01:32:24 CET 2004


Bugs item #649974, was opened at 2002-12-07 06:22
Message generated for change (Comment added) made by facundobatista
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=649974&group_id=5470

Category: Python Library
Group: Python 2.2.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Mike Brown (mike_j_brown)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib.url2pathname('file://host/')...

Initial Comment:
The Unix version of urllib.url2pathname(), when given a 
file URL that contains a host part, returns a path with 
the host embedded in the URL, despite the fact that 
there is no convention for mapping the host into the 
URL. The resulting path is not usable.

For example, on Windows, there is a convention for 
mapping the host part of a URL to and from a NetBIOS 
name. url2pathname('//somehost/path/to/file') returns 
r'\\somehost\path\to\file', which is safe to pass into 
open() or os.access().

But on Unix, there is no such convention. 
url2pathname('//somehost/path/to/file') returns 
'//somehost/path/to/file', which means the same thing as 
'/somehost/path/to/file' -- somehost is just another path 
segment and does not actually designate a host.

In my opinion, an exception should be raised in this 
situation; url2pathname() should not try to produce an 
OS path for a remote machine when there is no 
convention for referencing a remote machine in that OS's 
traditional path format. This way, if no exception is 
raised, you know that it's safe to pass the result into 
open() or os.access().

And as noted in other bug reports, 'file://localhost/' is a 
special case that should be treated the same as 'file:///'.
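
As a rough illustration of the behavior proposed above, here is a minimal 
sketch. It is not urllib's implementation: the function name 
file_url_to_posix_path is hypothetical, the choice of ValueError is an 
assumption, and it uses urllib.parse from current Python rather than the 
2.2-era urllib. It raises when the URL names a remote host and treats 
'localhost' the same as an empty host.

```python
from urllib.parse import urlsplit, unquote


def file_url_to_posix_path(url):
    """Hypothetical helper: map a file URL to a local POSIX path.

    Raises ValueError (an assumed choice of exception) when the URL
    names a host other than '' or 'localhost', because POSIX paths
    have no convention for referencing a remote machine.
    """
    parts = urlsplit(url)
    if parts.scheme != 'file':
        raise ValueError('not a file URL: %r' % (url,))
    # .hostname is lowercased, and None when no host is given.
    host = parts.hostname or ''
    if host not in ('', 'localhost'):
        raise ValueError('cannot map remote host %r to a local path' % (host,))
    # Percent-decode the path component only.
    return unquote(parts.path)


if __name__ == '__main__':
    print(file_url_to_posix_path('file:///path/to/file'))
    print(file_url_to_posix_path('file://localhost/path/to/file'))
    try:
        file_url_to_posix_path('file://somehost/path/to/file')
    except ValueError as exc:
        print('rejected:', exc)
```

With a helper along these lines, a return value without an exception 
would always be a path that refers to the local machine.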

----------------------------------------------------------------------

>Comment By: Facundo Batista (facundobatista)
Date: 2004-12-27 21:32

Message:
Logged In: YES 
user_id=752496

The documentation for urllib states that:

Although the urllib module contains (undocumented) routines
to parse and unparse URL strings, the recommended interface
for URL manipulation is in module urlparse.

So, if you think that the files should also be modified,
change the group of this bug to 2.4. Otherwise it will be
closed as won't fix.

----------------------------------------------------------------------

Comment By: Mike Brown (mike_j_brown)
Date: 2004-12-27 04:15

Message:
Logged In: YES 
user_id=371366

See also #649961, where I propose the same solution.

----------------------------------------------------------------------

Comment By: Mike Brown (mike_j_brown)
Date: 2004-12-27 04:04

Message:
Logged In: YES 
user_id=371366

pathname2url and url2pathname are undocumented and are
urllib- and platform-specific. My complaints in this old bug
report are based on the assumption that these functions are
general-purpose public interfaces. Upon further
investigation, I see that they are not.

I suggest leaving the implementations unchanged for now;
there are too many issues with doing it 'right' to go into
here. But perhaps add documentation that is consistent and
indicates that the functions are limited in scope. Patches
attached.


----------------------------------------------------------------------

Comment By: Facundo Batista (facundobatista)
Date: 2004-12-26 11:52

Message:
Logged In: YES 
user_id=752496

Could you please provide a test case?

----------------------------------------------------------------------

Comment By: Facundo Batista (facundobatista)
Date: 2004-12-26 11:52

Message:
Logged In: YES 
user_id=752496

Please, could you verify if this problem persists in Python 2.3.4
or 2.4?

If yes, in which version? Can you provide a test case?

If the problem is solved, from which version?

Note that if you fail to answer in one month, I'll close this bug
as "Won't fix".

Thank you! 

.    Facundo

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2002-12-11 03:56

Message:
Logged In: YES 
user_id=250749

There is a sort of convention in Unix - 
  somehost:/path/to/file
which comes from NFS, but has been used by tar (for remote 
tapes via rsh) and ssh's scp, and I believe has been used by 
some ftp clients (ncftp?)

However, as far as I know, you can't pass such a path to 
open() or os.access(), so your basic point still has validity.

----------------------------------------------------------------------

Comment By: Mike Brown (mike_j_brown)
Date: 2002-12-07 06:24

Message:
Logged In: YES 
user_id=371366

By 'host embedded in the URL' in the first sentence, I 
meant 'host embedded in it' [the path].

----------------------------------------------------------------------
