[Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator
Akira Li
4kir4.1i at gmail.com
Sun Jun 29 20:32:53 CEST 2014
Chris Angelico <rosuav at gmail.com> writes:
> On Sat, Jun 28, 2014 at 11:05 PM, Akira Li <4kir4.1i at gmail.com> wrote:
>> Have you considered adding support for paths relative to directory
>> descriptors [1] via keyword only dir_fd=None parameter if it may lead to
>> more efficient implementations on some platforms?
>>
>> [1]: https://docs.python.org/3.4/library/os.html#dir-fd
>
> Potentially more efficient and also potentially safer (see 'man
> openat')... but an enhancement that can wait, if necessary.
>
Introducing the feature later creates unnecessary incompatibilities.
Either dir_fd should be supported from the start, or it should be
explicitly rejected in PEP 471 and something like
`os.scandir(os.open(relative_path, os.O_RDONLY, dir_fd=fd))` recommended
instead (assuming `os.scandir in os.supports_fd`, like `os.listdir()`).
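A minimal sketch of that workaround, assuming an `os.scandir()` that accepts an open directory descriptor the way `os.listdir()` does, and a platform where `os.open()` supports `dir_fd`. The helper names `scandir_names_relative` and `demo_scandir_relative` are made up for illustration and are not part of the proposal:

```python
import os
import tempfile


def scandir_names_relative(relative_path, dir_fd):
    """Return the sorted entry names of relative_path, resolved against dir_fd.

    Hypothetical helper: it only combines os.open(..., dir_fd=...) with
    os.scandir(fd), as suggested in the text above.
    """
    if os.open not in os.supports_dir_fd or os.scandir not in os.supports_fd:
        raise NotImplementedError("needs dir_fd support in os.open() "
                                  "and fd support in os.scandir()")
    # Open the target directory relative to dir_fd (openat() semantics).
    fd = os.open(relative_path, os.O_RDONLY, dir_fd=dir_fd)
    try:
        # Consume the iterator before the descriptor is closed.
        with os.scandir(fd) as entries:
            return sorted(entry.name for entry in entries)
    finally:
        os.close(fd)


def demo_scandir_relative():
    """Build parent/relative_path/b in a temp dir and scan it via dir_fd."""
    with tempfile.TemporaryDirectory() as parent:
        os.mkdir(os.path.join(parent, "relative_path"))
        open(os.path.join(parent, "relative_path", "b"), "w").close()
        parent_fd = os.open(parent, os.O_RDONLY)
        try:
            return scandir_names_relative("relative_path", parent_fd)
        finally:
            os.close(parent_fd)
```

The caller has to manage the extra descriptor by hand here, which is the main inconvenience compared to a built-in `dir_fd` parameter.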
At C level it could be implemented using fdopendir/openat or scandirat.
Here's the function description using Argument Clinic DSL:
/*[clinic input]
os.scandir

    path : path_t(allow_fd=True, nullable=True) = '.'
        *path* can be specified as either str or bytes.  On some
        platforms, *path* may also be specified as an open file
        descriptor; the file descriptor must refer to a directory.  If
        this functionality is unavailable, using it raises
        NotImplementedError.

    *

    dir_fd : dir_fd = None
        If not None, it should be a file descriptor open to a
        directory, and *path* should be a relative string; path will
        then be relative to that directory.  If *dir_fd* is
        unavailable, using it raises NotImplementedError.

Yield a DirEntry object for each file and directory in *path*.

Just like os.listdir, the '.' and '..' pseudo-directories are skipped,
and the entries are yielded in system-dependent order.

{parameters}

It's an error to use *dir_fd* when specifying *path* as an open file
descriptor.
[clinic start generated code]*/
And corresponding tests (from test_posix:PosixTester), to show the
compatibility with os.listdir argument parsing in detail:

    def test_scandir_default(self):
        # When scandir is called without argument,
        # it's the same as scandir(os.curdir).
        self.assertIn(support.TESTFN, [e.name for e in posix.scandir()])

    def _test_scandir(self, curdir):
        filenames = sorted(e.name for e in posix.scandir(curdir))
        self.assertIn(support.TESTFN, filenames)
        #NOTE: assume listdir, scandir accept the same types on the platform
        self.assertEqual(sorted(posix.listdir(curdir)), filenames)

    def test_scandir(self):
        self._test_scandir(os.curdir)

    def test_scandir_none(self):
        # it's the same as scandir(os.curdir).
        self._test_scandir(None)

    def test_scandir_bytes(self):
        # When scandir is called with a bytes object,
        # the returned entries names are still of type str.
        # Call `os.fsencode(entry.name)` to get bytes
        self.assertIn('a', {'a'})
        self.assertNotIn(b'a', {'a'})
        self._test_scandir(b'.')

    @unittest.skipUnless(posix.scandir in os.supports_fd,
                         "test needs fd support for posix.scandir()")
    def test_scandir_fd_minus_one(self):
        # it's the same as scandir(os.curdir).
        self._test_scandir(-1)

    def test_scandir_float(self):
        # invalid args
        self.assertRaises(TypeError, posix.scandir, -1.0)

    @unittest.skipUnless(posix.scandir in os.supports_fd,
                         "test needs fd support for posix.scandir()")
    def test_scandir_fd(self):
        fd = posix.open(posix.getcwd(), posix.O_RDONLY)
        self.addCleanup(posix.close, fd)
        self._test_scandir(fd)
        self.assertEqual(
            sorted(posix.scandir('.')),
            sorted(posix.scandir(fd)))
        # call 2nd time to test rewind
        self.assertEqual(
            sorted(posix.scandir('.')),
            sorted(posix.scandir(fd)))

    @unittest.skipUnless(posix.scandir in os.supports_dir_fd,
                         "test needs dir_fd support for os.scandir()")
    def test_scandir_dir_fd(self):
        relpath = 'relative_path'
        with support.temp_dir() as parent:
            fullpath = os.path.join(parent, relpath)
            with support.temp_dir(path=fullpath):
                support.create_empty_file(os.path.join(parent, 'a'))
                support.create_empty_file(os.path.join(fullpath, 'b'))
                fd = posix.open(parent, posix.O_RDONLY)
                self.addCleanup(posix.close, fd)
                self.assertEqual(
                    sorted(posix.scandir(relpath, dir_fd=fd)),
                    sorted(posix.scandir(fullpath)))
                # check that fd is still useful
                self.assertEqual(
                    sorted(posix.scandir(relpath, dir_fd=fd)),
                    sorted(posix.scandir(fullpath)))

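
As background for the "potentially safer (see 'man openat')" remark quoted above, here is a rough sketch of the property that makes directory descriptors attractive: a descriptor keeps referring to the same directory even after that directory is renamed, so lookups relative to it do not depend on the path staying valid. It assumes a POSIX platform where `os.listdir()` accepts a descriptor and `os.stat()` supports `dir_fd`; the function name `names_after_rename` and the directory names are invented for the example:

```python
import os
import tempfile


def names_after_rename():
    """Show that a directory fd survives a rename of the directory."""
    with tempfile.TemporaryDirectory() as parent:
        old = os.path.join(parent, "old_name")
        new = os.path.join(parent, "new_name")
        os.mkdir(old)
        open(os.path.join(old, "b"), "w").close()

        fd = os.open(old, os.O_RDONLY)
        try:
            # The path used to open the directory stops being valid here...
            os.rename(old, new)
            # ...but the descriptor still refers to the same directory.
            listed = os.listdir(fd)
            size = os.stat("b", dir_fd=fd).st_size
            return listed, size, os.path.exists(old)
        finally:
            os.close(fd)
```

A path-based call made after the rename would have to use the new name, while the descriptor-based calls keep working unchanged.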
--
Akira