[Tutor] os.walk() with multiple paths

Roel Schroeven roel at roelschroeven.net
Wed May 23 17:17:24 EDT 2018

Pi wrote on 22/05/2018 21:06:
> import os
> files = []
> def find_db(paths):
>      for path in paths.split():
>          for root, dirs, filenames in os.walk(path):
>              for name in filenames:
>                  if name.endswith((".db", ".sqlite", ".sqlite3")):
>                      files.append(name + ', ' + os.path.join(root, name))
>      return sorted(set(files))
> But with more paths gives files only for last path given:
>  >>> find_db("/home/user/Desktop, /dbbs")

Do you really need to pass your paths as a comma-separated string? 
Commas and spaces are legal characters in paths, so a single path named 
"/home/user/Desktop, /dbbs" could, in theory, really exist. Probably not 
in your use case, but still.
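
That ambiguity is probably also why only the last path gave results. A 
small sketch of what split() does with the string from your example 
(the comma-splitting variant is just an illustration, not something I'd 
recommend):

```python
paths = "/home/user/Desktop, /dbbs"

# str.split() with no argument splits on whitespace only,
# so the comma stays attached to the first path.
parts = paths.split()
print(parts)

# Splitting on commas and stripping whitespace recovers both paths,
# but it still breaks for any path that itself contains a comma.
cleaned = [p.strip() for p in paths.split(",")]
print(cleaned)
```

By default os.walk() ignores errors from directories it can't list, so 
a nonexistent path like "/home/user/Desktop," simply yields nothing 
instead of raising an exception.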

If possible, depending on where your paths come from, it's better to 
pass the paths as a list to avoid the ambiguity:

def find_db(paths):
     for path in paths:
         ...  # walk each path exactly as before

find_db(["/home/user/Desktop", "/dbbs"])

Another point: it's far better to make files a local variable in the 
function instead of a global one; with a global list, the results of 
earlier calls pile up in it. You can also make it a set from the 
beginning, which avoids duplicates right away:

import os

def find_db(paths):
     files = set()  # local to the function, and a set drops duplicates
     for path in paths:
         for root, dirs, filenames in os.walk(path):
             for name in filenames:
                 if name.endswith(('.db', '.sqlite', '.sqlite3')):
                     files.add(name + ', ' + os.path.join(root, name))
     return sorted(files)
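
To check that this really covers every path, here is a quick sketch 
that runs the function against two throwaway directories. The temporary 
directories and the file names in them are made up for the test; the 
function is repeated so the snippet runs on its own:

```python
import os
import tempfile

def find_db(paths):
    files = set()
    for path in paths:
        for root, dirs, filenames in os.walk(path):
            for name in filenames:
                if name.endswith(('.db', '.sqlite', '.sqlite3')):
                    files.add(name + ', ' + os.path.join(root, name))
    return sorted(files)

# Two temporary directories stand in for the real paths.
with tempfile.TemporaryDirectory() as first, \
        tempfile.TemporaryDirectory() as second:
    samples = [(first, "a.db"), (second, "b.sqlite"), (second, "notes.txt")]
    for directory, filename in samples:
        # Create an empty file; only the name matters here.
        open(os.path.join(directory, filename), "w").close()

    result = find_db([first, second])
    # Each entry is "name, full path"; keep only the name part.
    names = [entry.split(",")[0] for entry in result]
    print(names)
```

Both database files show up, one from each directory, and notes.txt is 
skipped.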

Last point: I didn't know endswith can take a tuple of suffixes instead 
of one single suffix. Cool, I learned something new!
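
For anyone else who hadn't seen it, a tiny sketch of endswith with a 
tuple (the file names are invented for the example):

```python
names = ["data.db", "cache.sqlite3", "notes.txt"]
suffixes = (".db", ".sqlite", ".sqlite3")

# endswith returns True if the string ends with any suffix in the tuple.
matches = [n for n in names if n.endswith(suffixes)]
print(matches)
```

Note that it has to be a tuple; passing a list raises a TypeError.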

The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
   -- Isaac Asimov

Roel Schroeven
