Why exception from os.path.exists()?
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Thu Jun 7 08:47:15 EDT 2018
On Thu, 07 Jun 2018 10:04:53 +0200, Antoon Pardon wrote:
> On 07-06-18 05:55, Steven D'Aprano wrote:
>> Python strings are rich objects which support the Unicode code point \0
>> in them. The limitation of the Linux kernel that it relies on NULL-
>> terminated byte strings is irrelevant to the question of what
>> os.path.exists ought to do when given a path containing NUL. Other
>> invalid path names return False.
>
> It is not irrelevant. It makes the distinction clear between possible
> values and impossible values.
That is simply wrong. It is wrong in principle, and it is wrong in
practice, for reasons already covered to death in this thread.
It is *wrong in practice* because other impossible values don't raise
ValueError, they simply return False:
- illegal pathnames under Windows, those containing special
  characters like ? > < * etc, simply return False;
- even on Linux, illegal pathnames like "" (the empty string)
  return False;
- invalid pathnames with too many path components, or too many
  characters in a single component, simply return False;
- the os.path.exists() function is not documented as making
  a three-way split between "exists, doesn't exist and invalid";
- and it isn't even true to say that NUL is illegal in pathnames:
  there are at least five file systems that allow either NUL bytes
  (FAT-8, MFS, HFS) or Unicode \0 code points (HFS Plus and Apple
  File System).
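
A minimal sketch for checking the first few points on your own machine. The helper name probe() is made up for illustration, and what happens with an embedded NUL depends on the Python version, so the sketch reports the outcome rather than assuming one:

```python
import os.path

def probe(path):
    """Describe what os.path.exists() does with `path` on this interpreter."""
    try:
        return os.path.exists(path)
    except ValueError as exc:
        # Some Python versions raise here for an embedded NUL.
        return "ValueError: {}".format(exc)

candidates = {
    "empty string": "",
    "overlong component": "a" * 5000,
    "embedded NUL": "spam\0eggs",
}
for label, path in candidates.items():
    print("{:20} -> {!r}".format(label, probe(path)))
```

The empty string and the overlong name both come back as a plain False; only the NUL case can differ.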
And it is *wrong in principle* because in the most general case, there is
no way to tell which pathnames are valid or invalid without querying an
actual file system. In the case of Linux, any directory could be used as
a mount point.
Is "/mnt/some?file" valid or invalid? If an NTFS file system is mounted
on /mnt, it is invalid; if an ext4 file system is mounted there, it is
valid; if there's nothing mounted there, the question is impossible to
answer.
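
To make that concrete, here is a rough sketch of the only reliable test I know of: ask the file system itself. The function name filesystem_accepts() is hypothetical, and the sketch assumes a writable directory in which the name does not already exist:

```python
import os
import tempfile

def filesystem_accepts(directory, name):
    """Return True if the file system behind `directory` lets us create `name`."""
    path = os.path.join(directory, name)
    try:
        # O_EXCL makes the call fail rather than reuse an existing file.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except (OSError, ValueError):
        # OSError: the OS or file system refused the name.
        # ValueError: Python itself refused it (e.g. an embedded NUL).
        return False
    os.close(fd)
    os.remove(path)  # clean up the probe file
    return True

with tempfile.TemporaryDirectory() as tmp:
    print(filesystem_accepts(tmp, "plain.txt"))
    print(filesystem_accepts(tmp, "some?file"))  # depends on the file system
    print(filesystem_accepts(tmp, "a" * 5000))
```

The answer for "some?file" is whatever the file system mounted under the temporary directory says it is, which is the whole point.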
>> As a Python programmer, how does treating NUL specially make our life
>> better?
>
> By treating possible path values differently from impossible path
> values.
But it doesn't do that. "Pathnames cannot contain NUL" is a falsehood
that programmers wrongly believe about paths. HFS Plus and Apple File
System support NULs in paths.
So what it does is wrongly single out one *POSSIBLE* path value to raise
an exception, while other so-called "impossible" path values simply
return False.
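
In practice, code that only wants a yes/no answer ends up wrapping the call to get uniform behaviour. A minimal sketch, where path_exists() is a hypothetical helper and not part of the standard library:

```python
import os.path

def path_exists(pathname):
    """Two-way answer: True if the path exists, False for anything else."""
    try:
        return os.path.exists(pathname)
    except ValueError:
        # Raised for an embedded NUL on some Python versions;
        # treat it like every other path that cannot exist.
        return False
```

With this wrapper, path_exists("spam\0eggs") and path_exists("") give the same answer.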
But in the spirit of compromise, okay, let's ignore the existence of file
systems like HFS which allow NUL. Apart from Mac users, who uses them
anyway? Let's pretend that every file system in existence, now and into
the future, will prohibit NULs in paths.
Have you ever actually used this feature? When was the last time you
wrote code like this?
    try:
        flag = os.path.exists(pathname)
    except ValueError:
        handle_null_in_path()
    else:
        if flag:
            handle_file()
        else:
            handle_invalid_path_or_no_such_file()
I want to see actual, real code used in production, not made-up code
snippets, that demonstrates that this is a useful distinction to make.
Until such time that somebody shows me an actual real-world use-case for
wanting to make this distinction for NULs and NULs alone, I call bullshit.
--
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson