Doubled backslashes in Windows paths
eryk sun
eryksun at gmail.com
Fri Oct 7 10:02:25 EDT 2016
On Fri, Oct 7, 2016 at 10:46 AM, Steve D'Aprano
<steve+python at pearwood.info> wrote:
> That's because
>
> "C:
>
> is an illegal volume label (disk name? I'm not really a Windows user, and
> I'm not quite sure what the correct terminology here would be).
It's not an illegal device name, per se. A DOS device can be defined
with the name '"C:'. For example:
>>> import ctypes, os
>>> kernel32 = ctypes.WinDLL('kernel32')
>>> kernel32.DefineDosDeviceW(0, '"C:', 'C:')
1
>>> os.path.getsize(r'\\.\"C:\Windows\py.exe')
889504
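As a rough illustration of what the device prefix buys you, here's a
minimal sketch that recognizes the two prefixes and swaps them for the
NT "\??\" directory. The helper names (has_device_prefix, to_nt_path)
are made up for this example, and the real conversion in ntdll does
considerably more (for instance, "\\.\" paths still get normalized,
while "\\?\" paths are passed through nearly verbatim), so treat it as
a simplification rather than the actual algorithm:

```python
# Simplified sketch; hypothetical helpers, not the real ntdll logic.
DEVICE_PREFIXES = ('\\\\.\\', '\\\\?\\')  # the 4-character prefixes \\.\ and \\?\


def has_device_prefix(path):
    # str.startswith accepts a tuple of candidate prefixes.
    return path.startswith(DEVICE_PREFIXES)


def to_nt_path(path):
    # Replace the 4-character DOS device prefix with the NT \??\ directory.
    if not has_device_prefix(path):
        raise ValueError('not a DOS device path: %r' % path)
    return '\\??\\' + path[4:]


print(has_device_prefix(r'"C:\Windows\py.exe"'))
print(to_nt_path(r'\\.\"C:\Windows\py.exe'))
```

This only manipulates strings, so it runs on any platform.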
However, without the DOS device prefix (either \\.\ or \\?\), Windows
has to normalize the path as a classic DOS path before passing it to
the kernel. Let's see how Windows 10 normalizes this path by setting a
breakpoint on the NtCreateFile system call:
>>> os.path.getsize(r'"C:\Windows\py.exe"')
Breakpoint 0 hit
ntdll!NtCreateFile:
00007ffb`a6c858e0 4c8bd1 mov r10,rcx
A kernel path is stored in an OBJECT_ATTRIBUTES structure, which has
the path, a handle (for opening relative to another object), and flags
such as whether or not the path is case insensitive. The debugger's
!obja extension command shows the contents of this structure:
0:000> !obja @r8
Obja +000000a6c8bef038 at 000000a6c8bef038:
Name is "C:\Windows\py.exe"
OBJ_CASE_INSENSITIVE
We see that the user-mode path normalization code doesn't know what to
make of a path starting with '"', so it just punts the path to the
kernel object manager. In turn, the object manager rejects this path
because it isn't rooted in the object namespace (i.e. it isn't of the
form "\??\..." or "\Device\...", etc.):
0:000> pt; r rax
rax=00000000c0000033
The kernel status code 0xC0000033 is STATUS_OBJECT_NAME_INVALID.
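If you want to pick a status value like this apart yourself, the
following sketch splits it into its fields. It assumes the documented
NTSTATUS layout (2 severity bits, a customer bit, a reserved bit, a
12-bit facility, and a 16-bit code); decode_ntstatus and the SEVERITY
table are names invented for this example, not part of any library:

```python
# Hypothetical helper; field layout assumed from the NTSTATUS documentation.
SEVERITY = {0: 'success', 1: 'informational', 2: 'warning', 3: 'error'}


def decode_ntstatus(status):
    status &= 0xFFFFFFFF  # treat the value as an unsigned 32-bit integer
    return {
        'severity': SEVERITY[status >> 30],        # bits 31-30
        'customer': bool(status & 0x20000000),     # bit 29
        'facility': (status >> 16) & 0xFFF,        # bits 27-16
        'code': status & 0xFFFF,                   # bits 15-0
    }


print(decode_ntstatus(0xC0000033))
```

For 0xC0000033 the severity field comes out as an error, with facility
0 and code 0x33.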
Note that a path ending in '"' is still illegal even if we explicitly
use the r'\\.\"C:' DOS device. For example:
>>> os.path.getsize(r'\\.\"C:\Windows\py.exe"')
Breakpoint 0 hit
ntdll!NtCreateFile:
00007ffb`a6c858e0 4c8bd1 mov r10,rcx
0:000> !obja @r8
Obja +000000a6c8bef038 at 000000a6c8bef038:
Name is \??\"C:\Windows\py.exe"
OBJ_CASE_INSENSITIVE
0:000> pt; r rax
rax=00000000c0000033
In this case it fails because the final '"' in the name (after .exe)
is reserved by the I/O manager, as are '<' and '>'. These three
characters are defined as DOS_DOT ('"'), DOS_STAR ('<'), and DOS_QM
('>'), and they aren't allowed in filesystem names. They're used to
implement the semantics of DOS wildcards in NT system calls (the
regular '*' and '?' wildcards are also reserved). See
FsRtlIsNameInExpression [1], which a filesystem may use, for example,
in its implementation of the NtQueryDirectoryFile [2] system call to
handle the optional FileName parameter with wildcard matching.
[1]: https://msdn.microsoft.com/en-us/library/ff546850
[2]: https://msdn.microsoft.com/en-us/library/ff567047
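To spot these characters in a name before handing it to the system,
something like the sketch below could be used. The WILDCARDS table and
find_wildcards are hypothetical names for this example; it covers only
the five wildcard characters mentioned above (not every character a
filesystem may reject), and it expects a single path component rather
than a full path, since a prefix such as "\??\" legitimately contains
'?':

```python
# Hypothetical helper; covers only the wildcard characters discussed above.
WILDCARDS = {
    '"': 'DOS_DOT',
    '<': 'DOS_STAR',
    '>': 'DOS_QM',
    '*': 'regular wildcard',
    '?': 'regular wildcard',
}


def find_wildcards(component):
    # Return (index, character, role) for each wildcard character found.
    return [(i, c, WILDCARDS[c])
            for i, c in enumerate(component) if c in WILDCARDS]


print(find_wildcards('py.exe"'))
print(find_wildcards('py.exe'))
```

An empty list means the component contains none of these characters.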