[Python-bugs-list] [ python-Bugs-513572 ] isdir behavior getting odder on UNC path

noreply@sourceforge.net noreply@sourceforge.net
Sun, 10 Mar 2002 01:03:35 -0800


Bugs item #513572, was opened at 2002-02-05 21:07
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=513572&group_id=5470

Category: Python Library
Group: Python 2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Gary Herron (herron)
>Assigned to: Nobody/Anonymous (nobody)
Summary: isdir behavior getting odder on UNC path

Initial Comment:
It's been documented for earlier versions of Python on 
Windows that os.path.isdir returns true on a UNC 
directory only if there is an extra backslash at the 
end of the argument.  In Python 2.2 (at least on 
Windows 2000) it appears that *TWO* extra backslashes 
are needed.

Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit 
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for 
more information.
>>>
>>> import os
>>> os.path.isdir('\\trainer\island')
0
>>> os.path.isdir('\\trainer\island\')
0
>>> os.path.isdir('\\trainer\island\\')
1
>>>

In a perfect world, the first call should return 1, 
but never has.  In older versions of python, the 
second returned 1, but no longer.

In limited tests, appending 2 or more backslashes to 
the end of any pathname returns the correct answer in 
both isfile and isdir.


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2002-03-10 04:03

Message:
Logged In: YES 
user_id=31435

Gordon, none of those are UNC roots -- they follow the 
rules exactly as stated for non-UNC paths:  MS stat() 
recognizes \ME\E\java if and only if there's no trailing 
backslash.  That's why your first example succeeds.  The 
complication is that Python removes one trailing 
backslash "by magic" unless the path "looks like a root", 
and none of these do.  That's why your third example 
works.  Your second and fourth examples fail because you 
specified two trailing backslashes in those, and Python 
only removes one of them by magic.

An example of "a UNC root" would be \ME\E.  The MS stat() 
recognizes a root directory if and only if it *does* have a 
trailing backslash, and Python's magical backslash removal 
doesn't know UNC roots from a Euro symbol.  So the only way 
to get Python's isdir() (etc) to recognize \ME\E is to 
follow it with two backslashes, one because Python strips 
one away (due to not realizing "it looks like a root"), and 
another else MS stat() refuses to recognize it.

Anyway, I'm unassigning this now, cuz MarkH isn't paying 
any attention.  If someone wants to write a pile of tedious 
code to "recognize a UNC root when it sees one", I'd accept 
the patch.  I doubt I'll get to it myself in this 
lifetime.
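
A minimal sketch of what "recognizing a UNC root" might involve, written in Python for illustration rather than in the C code this item is about. The name looks_like_unc_root and the rule it applies (two leading slashes, then exactly a server name and a share name, with at most one trailing slash) are assumptions, not an actual patch.

```python
import re

SLASHES = "/\\"


def looks_like_unc_root(path):
    """Hypothetical check: True for \\\\server\\share, optionally with one
    trailing (back)slash, and False for anything else."""
    # Shortest candidate is five characters: two slashes, a one-character
    # server name, a slash, and a one-character share name.
    if len(path) < 5 or path[0] not in SLASHES or path[1] not in SLASHES:
        return False
    rest = path[2:]
    # Tolerate a single trailing (back)slash.
    if rest[-1] in SLASHES:
        rest = rest[:-1]
    parts = re.split(r"[/\\]", rest)
    # Exactly two non-empty components: the server and the share.
    return len(parts) == 2 and all(parts)
```

Under these assumptions, r"\\ME\E" counts as a UNC root while r"\\ME\E\java" does not, so a backslash-stripping rule could consult a check like this before chopping anything off.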

----------------------------------------------------------------------

Comment By: Gordon B. McMillan (gmcm)
Date: 2002-03-07 10:31

Message:
Logged In: YES 
user_id=4923

Data point:
 run on a win2k box, where \ME is an NT box
Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit 
(Intel)] on win32
>>> os.path.isdir(r"\ME\E\java")
1
>>> os.path.isdir(r"\ME\E\java\")
0
>>> os.path.isdir("\\ME\E\java\")
1
>>> os.path.isdir("\\ME\E\java\\")
0


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-02-11 03:28

Message:
Logged In: YES 
user_id=31435

Mark, what do you think about a different approach here?

1. Leave the string alone and *try* stat.  If it 
succeeds, great, we're done.

2. Else if the string doesn't have a trailing (back)slash, 
append one and try again.  Win or lose, that's the end.

3. Else the string does have a trailing (back)slash.  If 
the string has more than one character, strip a trailing 
(back)slash and try again.  Win or lose, that's the end.

4. Else the string is a single (back)slash, yet stat() 
failed.  This shouldn't be possible.

It doubles the number of stats in cases where the file path 
doesn't correspond to anything that exists.  OTOH, MS's 
(back)slash rules are undocumented and incomprehensible 
(read their implementation of stat() for the whole truth -- 
we're not out-thinking lots of it now, and the gimmick 
added after 1.5.2 to out-think part of it is at least 
breaking Gary's thoroughly sensible use).
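
The four steps above can be sketched in Python as follows. This is only an illustration of the proposal, not code from the Python source: the helper name stat_with_slash_retry, the stat_func parameter (there so the logic can be tried with a stand-in for the MS stat()), and the choice to append a backslash rather than a forward slash are all assumptions. It also assumes a modern Python, where os.error is OSError.

```python
import os


def stat_with_slash_retry(path, stat_func=os.stat):
    """Hypothetical helper: try stat as given, then retry once with a
    trailing (back)slash added or removed."""
    # Step 1: leave the string alone and try stat.
    try:
        return stat_func(path)
    except OSError as first_error:
        error = first_error
    # Not covered by the steps above: an empty path is not retried.
    if not path:
        raise error
    # Step 2: no trailing (back)slash, so append one and try again.
    if path[-1] not in "/\\":
        return stat_func(path + "\\")
    # Step 3: strip one trailing (back)slash and try again.
    if len(path) > 1:
        return stat_func(path[:-1])
    # Step 4: a lone (back)slash failed, which shouldn't be possible.
    raise error
```

A path that matches nothing costs two stat calls here instead of one, which is the doubling mentioned above.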

----------------------------------------------------------------------

Comment By: Gary Herron (herron)
Date: 2002-02-11 03:03

Message:
Logged In: YES 
user_id=395736

Sorry, but I don't have much of an idea which versions I 
was referring to.  I picked up the idea of an extra 
backslash from a FAQ on a web site, the search for which 
I can't seem to reproduce.  It claimed one backslash was 
enough, but did not specify a Python version.  It *might* 
have been old enough to be pre-1.5.2.
 
The two versions I can test are 1.5.1 (where one backslash 
is enough) and 2.2 (where two are required).  This seems 
to me to support (or at least not contradict) Tim's 
hypothesis.

Gary


----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-02-10 13:57

Message:
Logged In: YES 
user_id=31435

Gary, exactly what do you mean by "older versions of 
Python"?  That is, specifically which versions?

The Microsoft stat() function is extremely picky about 
trailing (back)slashes.  For example, if you have a 
directory c:/python, and pass "c:/python/" to the MS 
stat(), it claims no such thing exists.  This isn't documented 
by MS, but that's how it works:  a trailing (back)slash is 
required if and only if the path passed in "is a root".  So 
MS stat() doesn't understand "/python/", and doesn't 
understand "d:" either.  The former doesn't tolerate a 
(back)slash, while the latter requires one.

This is impossible for people to keep straight, so after 
1.5.2 Python started removing (back)slashes on its own to 
make MS stat() happy.  The code currently leaves a trailing 
(back)slash alone if and only if one exists, and in 
addition one of these obtains:

1) The (back)slash is the only character in the path.
or
2) The path has 3 characters, and the middle one is a colon.

UNC roots don't fit either of those, so do get one 
(back)slash chopped off.  However, just as for any other roots, 
the MS stat() refuses to recognize them as valid unless 
they do have a trailing (back)slash.  Indeed, the last time 
I applied a contributed patch to this code, I added a

/* XXX UNC root drives should also be exempted? */

comment there.
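
The stripping rule described above can be modelled in a few lines of Python. The real logic lives in C; this is only a sketch of the rule as stated in this comment, and the function name strip_one_trailing_slash is made up for illustration.

```python
def strip_one_trailing_slash(path):
    """Model of the post-1.5.2 rule: drop one trailing (back)slash
    unless the path "looks like a root"."""
    # Nothing to strip if there is no trailing (back)slash.
    if not path or path[-1] not in "/\\":
        return path
    # Exemption 1: the (back)slash is the only character in the path.
    if len(path) == 1:
        return path
    # Exemption 2: three characters with a colon in the middle, e.g. d:\
    if len(path) == 3 and path[1] == ":":
        return path
    # Everything else, UNC roots included, loses one (back)slash.
    return path[:-1]
```

With this model, strip_one_trailing_slash("\\\\ME\\E\\") returns "\\\\ME\\E", that is, the UNC root with its one trailing backslash removed, which is the form the MS stat() is described as rejecting.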

However, this explanation doesn't make sense unless by 
"older versions of Python" you mean nothing more recent 
than 1.5.2.  If I'm understanding the source of the 
problem, it should exist in all Pythons after 1.5.2.  So if 
you don't see the same problem in 1.6, 2.0 or 2.1, I'm on 
the wrong track.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-02-08 18:33

Message:
Logged In: YES 
user_id=31435

BTW, it occurs to me that this *may* be a consequence of 
whatever was done in 2.2 to encode/decode filename strings 
for system calls on Windows.  I didn't follow that, and 
Mark may be the only one who fully understands the details.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-02-08 18:17

Message:
Logged In: YES 
user_id=31435

Here's the implementation of Windows isdir():

def isdir(path):
    """Test whether a path is a directory"""
    try:
        st = os.stat(path)
    except os.error:
        return 0
    return stat.S_ISDIR(st[stat.ST_MODE])

That is, we return whatever Microsoft's stat() tells us, 
and our code is the same in 2.2 as in 2.1.  I don't have 
Win2K here, and my Win98 box isn't on a Windows network so 
I can't even try real UNC paths here.  Reassigning to MarkH 
in case he can do better on either count.

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-02-08 17:05

Message:
Logged In: YES 
user_id=6380

Tim, I hate to do this to you, but you're the only person I
trust with researching this. (My laptop is currently off the
net again. :-( )


----------------------------------------------------------------------
