Why exception from os.path.exists()?

Thu Jun 7 08:47:15 EDT 2018

On Thu, 07 Jun 2018 10:04:53 +0200, Antoon Pardon wrote:

> On 07-06-18 05:55, Steven D'Aprano wrote:
>> Python strings are rich objects which support the Unicode code point \0
>> in them. The limitation of the Linux kernel that it relies on NULL-
>> terminated byte strings is irrelevant to the question of what
>> os.path.exists ought to do when given a path containing NUL. Other
>> invalid path names return False.
> 
> It is not irrelevant. It makes the distinction clear between possible
> values and impossible values.

That is simply wrong. It is wrong in principle, and it is wrong in 
practice, for reasons already covered to death in this thread.

It is *wrong in practice* because other impossible values don't raise 
ValueError, they simply return False:

- illegal pathnames under Windows, those containing special 
  characters like ? > < * etc, simply return False;

- even on Linux, illegal pathnames like "" (the empty string)
  return False;

- invalid pathnames with too many path components, or too many
  characters in a single component, simply return False;

- the os.path.exists() function is not documented as making 
  a three-way split between "exists, doesn't exist and invalid";

- and it isn't even true to say that NUL is illegal in pathnames:
  at least five file systems allow either NUL bytes (FAT-8, MFS,
  HFS) or Unicode \0 code points (HFS Plus and Apple File System).
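
The inconsistency is easy to check for yourself. Here is a minimal sketch,
not taken from any real code base: probe() is a made-up helper name, and the
three sample paths are arbitrary. It reports what os.path.exists() does with
each path instead of letting an exception escape. I deliberately don't show
the output, because the NUL case depends on the Python version you run it
under.

```python
import os

def probe(pathname):
    """Report what os.path.exists() does with `pathname`, as a string.

    Hypothetical helper for illustration only.
    """
    try:
        return str(os.path.exists(pathname))
    except ValueError as err:
        # The behaviour under discussion: an embedded NUL raises
        # instead of returning False.
        return "ValueError: {}".format(err)

# Arbitrary sample paths, none of which can name an existing file.
candidates = {
    "empty string": "",
    "overlong component": "x" * 5000,
    "embedded NUL": "spam\0eggs",
}

for label, pathname in candidates.items():
    print("{:20} {}".format(label, probe(pathname)))
```

The first two come back as False. Only the NUL case is a candidate for the
ValueError branch.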

And it is *wrong in principle* because in the most general case, there is 
no way to tell which pathnames are valid or invalid without querying an 
actual file system. In the case of Linux, any directory could be used as 
a mount point.

Is "/mnt/some?file" valid or invalid? If an NTFS file system is mounted 
on /mnt, it is invalid; if an ext4 file system is mounted there, it is 
valid; if there's nothing mounted there, the question is impossible to 
answer.
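
If you really want to know whether a name is valid, about the only reliable
approach is to ask the file system that would have to store it. A rough
sketch, assuming you have write permission in the directory concerned:
name_accepted_by() is a hypothetical helper, and it uses a scratch temporary
directory as a stand-in for whatever happens to be mounted at /mnt.

```python
import os
import tempfile

def name_accepted_by(directory, name):
    """Return True if the file system behind `directory` accepts `name`.

    Hypothetical helper: it answers the question by trying to create
    the file and removing it again. A False result can also mean
    "permission denied" or similar, not only "invalid name".
    """
    path = os.path.join(directory, name)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return True   # name already in use, so it is clearly acceptable
    except (OSError, ValueError):
        return False  # rejected by the file system, or by Python itself
    os.close(fd)
    os.remove(path)
    return True

with tempfile.TemporaryDirectory() as scratch:
    for name in ("plain.txt", "some?file", "x" * 5000):
        # Truncate long names so the report stays readable.
        print(repr(name[:12]), name_accepted_by(scratch, name))
```

The answer for "some?file" depends on which file system backs the scratch
directory, which is exactly the point.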

>> As a Python programmer, how does treating NUL specially make our life
>> better?
> 
> By treating possible path values differently from impossible path
> values.

But it doesn't do that. "Pathnames cannot contain NUL" is a falsehood 
that programmers wrongly believe about paths. HFS Plus and Apple File 
System support NULs in paths.

So what it does is wrongly single out one *POSSIBLE* path value to raise 
an exception, while other so-called "impossible" path values simply 
return False.
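
In practice, I'd expect anyone who trips over this to end up writing
something like the following. It is only a sketch, and exists_or_false() is
a name I made up, but it shows the uniform behaviour I'm arguing for: every
path that cannot name an existing file gives False.

```python
import os

def exists_or_false(pathname):
    """Like os.path.exists(), but never raises ValueError.

    Hypothetical wrapper for illustration only.
    """
    try:
        return os.path.exists(pathname)
    except ValueError:
        # Raised for an embedded NUL by the behaviour discussed in
        # this thread; treat it like any other non-existent path.
        return False

print(exists_or_false("spam\0eggs"))
print(exists_or_false(""))
```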

But in the spirit of compromise, okay, let's ignore the existence of file 
systems like HFS which allow NUL. Apart from Mac users, who uses them 
anyway? Let's pretend that every file system in existence, now and into 
the future, will prohibit NULs in paths.

Have you ever actually used this feature? When was the last time you 
wrote code like this?

    try:
        flag = os.path.exists(pathname)
    except ValueError:
        handle_null_in_path()
    else:
        if flag:
            handle_file()
        else:
            handle_invalid_path_or_no_such_file()

I want to see actual, real code used in production, not made-up code 
snippets, that demonstrates that this is a useful distinction to make.

Until somebody shows me an actual real-world use-case for wanting to 
make this distinction for NULs and NULs alone, I call bullshit.

-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson