[Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

Akira Li 4kir4.1i at gmail.com
Sun Jun 29 20:32:53 CEST 2014


Chris Angelico <rosuav at gmail.com> writes:

> On Sat, Jun 28, 2014 at 11:05 PM, Akira Li <4kir4.1i at gmail.com> wrote:
>> Have you considered adding support for paths relative to directory
>> descriptors [1] via keyword only dir_fd=None parameter if it may lead to
>> more efficient implementations on some platforms?
>>
>> [1]: https://docs.python.org/3.4/library/os.html#dir-fd
>
> Potentially more efficient and also potentially safer (see 'man
> openat')... but an enhancement that can wait, if necessary.
>

Introducing the feature later creates unnecessary incompatibilities.
Either it should be added now, or it should be explicitly rejected in
PEP 471 and something like `os.scandir(os.open(relative_path, dir_fd=fd))`
recommended instead (assuming `os.scandir in os.supports_fd`, as with
`os.listdir()`).

At C level it could be implemented using fdopendir/openat or scandirat.
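
The `os.open()` workaround mentioned above could look roughly like the
sketch below. It assumes a platform where `os.scandir` accepts a file
descriptor (`os.scandir in os.supports_fd`) and where the iterator works
as a context manager; `scandir_relative` is a hypothetical helper name,
not part of the PEP.

```python
import os
import tempfile


def scandir_relative(relative_path, dir_fd):
    """Return sorted entry names of relative_path, resolved against dir_fd.

    Hypothetical helper: it only wraps the os.open(..., dir_fd=...) idiom.
    """
    if os.scandir not in os.supports_fd or os.open not in os.supports_dir_fd:
        raise NotImplementedError("fd-based scandir is unavailable here")
    # Open the target directory relative to the already-open parent.
    fd = os.open(relative_path, os.O_RDONLY, dir_fd=dir_fd)
    try:
        with os.scandir(fd) as entries:
            return sorted(entry.name for entry in entries)
    finally:
        # The caller of os.open() stays responsible for closing the descriptor.
        os.close(fd)


# Usage: list parent/sub through a descriptor open on parent.
with tempfile.TemporaryDirectory() as parent:
    os.mkdir(os.path.join(parent, "sub"))
    open(os.path.join(parent, "sub", "b"), "w").close()
    parent_fd = os.open(parent, os.O_RDONLY)
    try:
        names = scandir_relative("sub", parent_fd)
    finally:
        os.close(parent_fd)
```

The extra `os.open()`/`os.close()` pair is the cost of not having a
`dir_fd` parameter on the function itself.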

Here's the function description using Argument Clinic DSL:

/*[clinic input]

os.scandir

    path : path_t(allow_fd=True, nullable=True) = '.'

        *path* can be specified as either str or bytes. On some
        platforms, *path* may also be specified as an open file
        descriptor; the file descriptor must refer to a directory.  If
        this functionality is unavailable, using it raises
        NotImplementedError.

    *

    dir_fd : dir_fd = None

        If not None, it should be a file descriptor open to a
        directory, and *path* should be a relative string; path will
        then be relative to that directory.  If *dir_fd* is
        unavailable, using it raises NotImplementedError.

Yield a DirEntry object for each file and directory in *path*.

Just like os.listdir, the '.' and '..' pseudo-directories are skipped,
and the entries are yielded in system-dependent order.

{parameters}
It's an error to use *dir_fd* when specifying *path* as an open file
descriptor.

[clinic start generated code]*/
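
To make the argument rules in that description concrete, here is a
pure-Python emulation. It is only a sketch: `scandir_names` is a
hypothetical name, it returns sorted names instead of yielding DirEntry
objects, and the choice of ValueError for the "fd path plus dir_fd" case
is my assumption, not something the description fixes.

```python
import os
import tempfile


def scandir_names(path=None, *, dir_fd=None):
    """Hypothetical emulation of the proposed scandir(path, *, dir_fd) rules."""
    if path is None:
        path = "."  # nullable=True: None means the current directory
    if dir_fd is None:
        # path may be str, bytes or (where supported) an open directory fd.
        with os.scandir(path) as entries:
            return sorted(os.fsdecode(entry.name) for entry in entries)
    if isinstance(path, int):
        # Assumed error type for combining dir_fd with an fd path.
        raise ValueError("can't use dir_fd when path is a file descriptor")
    # Resolve the relative path against dir_fd, then scan the result.
    fd = os.open(path, os.O_RDONLY, dir_fd=dir_fd)
    try:
        with os.scandir(fd) as entries:
            return sorted(entry.name for entry in entries)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as parent:
    os.mkdir(os.path.join(parent, "sub"))
    open(os.path.join(parent, "sub", "b"), "w").close()
    parent_fd = os.open(parent, os.O_RDONLY)
    try:
        via_dir_fd = scandir_names("sub", dir_fd=parent_fd)
        via_path = scandir_names(os.path.join(parent, "sub"))
        try:
            scandir_names(parent_fd, dir_fd=parent_fd)
            rejected = False
        except ValueError:
            rejected = True
    finally:
        os.close(parent_fd)
```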


And corresponding tests (from test_posix:PosixTester), to show the
compatibility with os.listdir argument parsing in detail:

    def test_scandir_default(self):
        # When scandir is called without argument,
        # it's the same as scandir(os.curdir).
        self.assertIn(support.TESTFN, [e.name for e in posix.scandir()])

    def _test_scandir(self, curdir):
        filenames = sorted(e.name for e in posix.scandir(curdir))
        self.assertIn(support.TESTFN, filenames)
        # NOTE: assumes listdir and scandir accept the same argument
        # types on this platform
        self.assertEqual(sorted(posix.listdir(curdir)), filenames)

    def test_scandir(self):
        self._test_scandir(os.curdir)

    def test_scandir_none(self):
        # it's the same as scandir(os.curdir).
        self._test_scandir(None)

    def test_scandir_bytes(self):
        # When scandir is called with a bytes object,
        # the returned entries names are still of type str.
        # Call `os.fsencode(entry.name)` to get bytes
        self.assertIn('a', {'a'})
        self.assertNotIn(b'a', {'a'})
        self._test_scandir(b'.')

    @unittest.skipUnless(posix.scandir in os.supports_fd,
                         "test needs fd support for posix.scandir()")
    def test_scandir_fd_minus_one(self):
        # it's the same as scandir(os.curdir).
        self._test_scandir(-1)

    def test_scandir_float(self):
        # invalid args
        self.assertRaises(TypeError, posix.scandir, -1.0)

    @unittest.skipUnless(posix.scandir in os.supports_fd,
                         "test needs fd support for posix.scandir()")
    def test_scandir_fd(self):
        fd = posix.open(posix.getcwd(), posix.O_RDONLY)
        self.addCleanup(posix.close, fd)
        self._test_scandir(fd)
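        # compare names: DirEntry objects are not assumed to be orderable
        self.assertEqual(
            sorted(e.name for e in posix.scandir('.')),
            sorted(e.name for e in posix.scandir(fd)))
        # call 2nd time to test rewind
        self.assertEqual(
            sorted(e.name for e in posix.scandir('.')),
            sorted(e.name for e in posix.scandir(fd)))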

    @unittest.skipUnless(posix.scandir in os.supports_dir_fd,
                         "test needs dir_fd support for os.scandir()")
    def test_scandir_dir_fd(self):
        relpath = 'relative_path'
        with support.temp_dir() as parent:
            fullpath = os.path.join(parent, relpath)
            with support.temp_dir(path=fullpath):
                support.create_empty_file(os.path.join(parent, 'a'))
                support.create_empty_file(os.path.join(fullpath, 'b'))
                fd = posix.open(parent, posix.O_RDONLY)
                self.addCleanup(posix.close, fd)
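                # compare names: DirEntry objects are not assumed to be
                # orderable
                self.assertEqual(
                    sorted(e.name for e in posix.scandir(relpath, dir_fd=fd)),
                    sorted(e.name for e in posix.scandir(fullpath)))
                # check that fd is still useful
                self.assertEqual(
                    sorted(e.name for e in posix.scandir(relpath, dir_fd=fd)),
                    sorted(e.name for e in posix.scandir(fullpath)))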


--
Akira


