[ python-Bugs-649974 ] urllib.url2pathname, pathname2url doc strings inconsistent

SourceForge.net noreply at sourceforge.net
Mon Dec 26 23:54:17 CET 2005


Bugs item #649974, was opened at 2002-12-07 10:22
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=649974&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Documentation
Group: Python 2.4
>Status: Closed
>Resolution: Accepted
Priority: 3
Submitted By: Mike Brown (mike_j_brown)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib.url2pathname, pathname2url doc strings inconsistent

Initial Comment:
The Unix version of urllib.url2pathname(), when given a 
file URL that contains a host part, returns a path with 
the host embedded in the URL, despite the fact that 
there is no convention for mapping the host into the 
URL. The resulting path is not usable.

For example, on Windows, there is a convention for 
mapping the host part of a URL to and from a NetBIOS 
name. url2pathname('//somehost/path/to/file') returns 
r'\\somehost\path\to\file', which is safe to pass into 
open() or os.access().

But on Unix, there is no such convention. 
url2pathname('//somehost/path/to/file') returns 
'//somehost/path/to/file', which means the same thing as 
'/somehost/path/to/file' -- somehost is just another path 
segment and does not actually designate a host.
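
The platform difference described above can be sketched with the pure
path classes from pathlib, which run on any OS. This is only an
illustration, assuming a modern Python 3; it does not call url2pathname()
itself, and the names win and posix are made up for the example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The two strings that url2pathname() is reported to return in this bug.
win = PureWindowsPath(r'\\somehost\path\to\file')
posix = PurePosixPath('//somehost/path/to/file')

# Windows path rules treat the leading \\somehost\path as a UNC "drive",
# so the host name has a defined role in the path.
print(repr(win.drive))

# POSIX path rules have no drive concept: the drive is empty and
# 'somehost' shows up as an ordinary path segment.
print(repr(posix.drive))
print(posix.parts)
```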

In my opinion, an exception should be raised in this 
situation; url2pathname() should not try to produce an 
OS path for a remote machine when there is no 
convention for referencing a remote machine in that OS's 
traditional path format. This way, if no exception is 
raised, you know that it's safe to pass the result into 
open() or os.access().

And as noted in other bug reports, 'file://localhost/' is a 
special case that should be treated the same as 'file:///'.
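
A minimal sketch of the behaviour proposed above, assuming a modern
Python 3 with urllib.parse. The function name
strict_file_url_to_posix_path is hypothetical and is not part of urllib;
unlike url2pathname(), it takes a complete file: URL, and it raises
ValueError as a stand-in for whatever exception would actually be chosen.

```python
from urllib.parse import urlsplit, unquote


def strict_file_url_to_posix_path(url):
    """Convert a file: URL to a POSIX path, refusing remote hosts.

    Hypothetical helper illustrating the proposal; not a urllib API.
    """
    parts = urlsplit(url)
    if parts.scheme != 'file':
        raise ValueError('not a file URL: %r' % (url,))
    host = parts.hostname  # None when the host part is empty
    # An empty host and 'localhost' both refer to the local machine.
    if host not in (None, 'localhost'):
        raise ValueError('no POSIX path convention for host %r' % (host,))
    return unquote(parts.path)


for candidate in ('file:///path/to/file',
                  'file://localhost/path/to/file',
                  'file://somehost/path/to/file'):
    try:
        print(candidate, '->', strict_file_url_to_posix_path(candidate))
    except ValueError as exc:
        print(candidate, '-> rejected:', exc)
```

With this shape, any path that comes back without an exception refers to
the local machine, which is the guarantee the report asks for.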

----------------------------------------------------------------------

>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-12-26 23:54

Message:
Logged In: YES 
user_id=1188172

Applied patches in revisions 41816,41817.

----------------------------------------------------------------------

Comment By: Mike Brown (mike_j_brown)
Date: 2004-12-28 06:18

Message:
Logged In: YES 
user_id=371366

OK. I changed the group to Python 2.4, changed the category
to Documentation, changed the summary, and lowered the priority.

Since there are doc strings for the non-posix versions of
url2pathname() and pathname2url(), please consider the
patches I created as simply making all of the docs
consistent with each other and with the module-level docs
you pointed out.

Thanks! -Mike

----------------------------------------------------------------------

Comment By: Facundo Batista (facundobatista)
Date: 2004-12-28 01:32

Message:
Logged In: YES 
user_id=752496

The documentation for urllib states that:

Although the urllib module contains (undocumented) routines
to parse and unparse URL strings, the recommended interface
for URL manipulation is in module urlparse.

So, if you think that the files should also be modified,
change the group of this bug to 2.4. Otherwise it will be
closed as won't fix.

----------------------------------------------------------------------

Comment By: Mike Brown (mike_j_brown)
Date: 2004-12-27 08:15

Message:
Logged In: YES 
user_id=371366

See also #649961, where I propose the same solution.

----------------------------------------------------------------------

Comment By: Mike Brown (mike_j_brown)
Date: 2004-12-27 08:04

Message:
Logged In: YES 
user_id=371366

pathname2url and url2pathname are undocumented and are
urllib- and platform-specific. My complaints in this old bug
report are based on assumptions that these functions are
general-purpose public interfaces. Upon further
investigation, I see that they are not.

I suggest leaving the implementations unchanged for now;
there are too many issues with doing it 'right' to go into
here. But perhaps add documentation that is consistent and
indicates that the functions are limited in scope. Patches
attached.


----------------------------------------------------------------------

Comment By: Facundo Batista (facundobatista)
Date: 2004-12-26 15:52

Message:
Logged In: YES 
user_id=752496

Please, could you verify if this problem persists in Python 2.3.4
or 2.4?

If yes, in which version? Can you provide a test case?

If the problem is solved, from which version?

Note that if you fail to answer in one month, I'll close this bug
as "Won't fix".

Thank you! 

.    Facundo

----------------------------------------------------------------------

Comment By: Facundo Batista (facundobatista)
Date: 2004-12-26 15:52

Message:
Logged In: YES 
user_id=752496

Could you please provide a test case?

----------------------------------------------------------------------

Comment By: Andrew I MacIntyre (aimacintyre)
Date: 2002-12-11 07:56

Message:
Logged In: YES 
user_id=250749

There is a sort of convention in Unix - 
  somehost:/path/to/file
which comes from NFS, but has been used by tar (for remote 
tapes via rsh) and ssh's scp, and I believe has been used by 
some ftp clients (ncftp?)

However, as far as I know, you can't pass such a path to 
open() or os.access(), so your basic point still has validity.

----------------------------------------------------------------------

Comment By: Mike Brown (mike_j_brown)
Date: 2002-12-07 10:24

Message:
Logged In: YES 
user_id=371366

by 'host embedded in the URL' in the first sentence I 
meant 'host embedded in it' [the path]

----------------------------------------------------------------------
