New file utility for shutil - linktree.py

David MacQuigg macquigg at cadence.com
Mon Dec 30 14:29:42 EST 2002


I need a 'linktree' utility to create a "mock hierarchy" for testing patches to our mongo software distributions (~2GB, 10,000 files, hundreds of dirs and symlinks, some of which are broken).  I found some utilities in 'shutil.py', but none of them does the job: 'copytree' copies the entire 2GB hierarchy!  The mock hierarchy should have just the new patch files, and links for everything else.

I searched the comp.lang.python newsgroup, and even pinged the python-dev group.  Nothing!  So I'm writing my own.  It occurs to me that with a little more effort, this could be a robust, general-purpose utility, maybe added to shutil.py.  In writing this utility I was surprised to see that the file utilities are not as well designed as the rest of Python.  There are some nice high-level functions (exists, isfile, etc.), but if you need primitive tests (for files and directories but not links, etc.), your choices are 1) figure out the subtle and undocumented behavior of these high-level functions in following links, and use non-obvious combinations of them to get what you want, or 2) go straight to the primitive 'stat' functions, and still have a mess.
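
To make the link-following behavior concrete, here is a small sketch. It is written for a current Python rather than the one this post was written against, and it leans on 'os.path.lexists', which was added to os.path in a later release. The helper names 'probe' and 'demo' and the file names are made up for illustration. It builds a plain file, a valid link, and a broken link in a temporary directory, then records what each high-level test reports.

```python
import os
import tempfile

def probe(path):
    """Return what the high-level os.path tests report for 'path'."""
    return {
        'exists': os.path.exists(path),    # follows links
        'isfile': os.path.isfile(path),    # follows links
        'isdir': os.path.isdir(path),      # follows links
        'islink': os.path.islink(path),    # looks at the link itself
        'lexists': os.path.lexists(path),  # looks at the link itself
    }

def demo():
    """Build a file, a valid link, and a broken link; probe all three."""
    root = tempfile.mkdtemp()
    target = os.path.join(root, 'real.txt')
    with open(target, 'w') as f:
        f.write('data')
    good = os.path.join(root, 'good_link')
    os.symlink(target, good)
    bad = os.path.join(root, 'bad_link')
    os.symlink(os.path.join(root, 'missing'), bad)  # target never created
    return {'file': probe(target), 'good': probe(good), 'bad': probe(bad)}
```

The broken link is the awkward case: 'exists' and 'isfile' both say False, while 'islink' says True. The valid link answers True to both 'isfile' and 'islink', so "a file, and only a file" takes two tests.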

I decided to extend the 'os.path.exists' function, adding an optional argument to specify *exactly* what you want tested.  This makes 'linktree' simple and easy to modify.  Everyone has different preferences on how linktree should work.  Our existing utility (written in csh) is hard to read, and even harder to modify.  The Python code below is clearly superior.
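
The extended function itself is not included in this post, so the following is only a guess at its shape, inferred from how 'linktree' calls it. The letter meanings ('f' file, 'd' directory, 'l' any link, 'v' valid link), the parameter name 'kinds', and its default are all assumptions; the real 'futils.exists' may differ. The sketch is written for a current Python.

```python
import os
import stat

def exists(path, kinds='fdv'):
    """Hypothetical stand-in for the extended exists().

    Assumed meaning of the letters in 'kinds':
      'f'  a regular file (not a link to one)
      'd'  a directory (not a link to one)
      'l'  any symbolic link, valid or broken
      'v'  a valid symbolic link (its target exists)
    """
    try:
        mode = os.lstat(path).st_mode  # lstat does not follow links
    except OSError:
        return False                   # nothing at all at this path
    if stat.S_ISLNK(mode):
        if 'l' in kinds:
            return True
        # os.path.exists follows the link, so it is False for a broken one.
        return 'v' in kinds and os.path.exists(path)
    if stat.S_ISREG(mode):
        return 'f' in kinds
    if stat.S_ISDIR(mode):
        return 'd' in kinds
    return False  # sockets, devices, etc. are outside this sketch
```

With this reading, exists(p, 'fdl') is true for anything at 'p' including a broken link, and exists(p, 'f') is true only for a real file, which matches the comments in the code below.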

My questions for this group are:
1) Is 'linktree' or something like it generally useful, or too specialized for inclusion in the Python distribution?
2) Is something like this already available?
3) Can you think of any improvements?  Other applications I should consider?
4) Any volunteers for code-bashing?  Especially Windows and other platforms.  I can handle the Unix side.

Happy New Year :>)

- Dave

*************************************************************     *
* David MacQuigg               * email:  macquigg at cadence.com  *  *
* Principal Product Engineer   * phone:  USA 520-721-4583      *  *  *
* Analog Artist                                                *  *  *
*                                * 9320 East Mikelyn Lane       * * *
* Cadence Design Systems, Inc.   * Tucson, Arizona 85710          *
*************************************************************     * 

"""
linktree.py          v 1.2                                        DMQ 12/30/02
Creates a new tree for non-destructive testing of patches to an old tree.
'oldtree' is an existing hierarchy, which must remain undisturbed.
'newtree' arrives with just the patch files, all in their proper place
   in the hierarchy.
This routine fills in the new hierarchy with links to all the unchanged files
in the old hierarchy.  'newtree' then behaves just like a complete hierarchy,
but without duplicating all the files in 'oldtree'. 
Warning: Diagnostic output can be huge on a large hierarchy.  You may want to
redirect it to a logfile, or # out some of the print statements below.
"""

import os, sys
from os.path import abspath, join
from futils import exists  # Extended version; replaces 'os.path.exists()'.

def linktree(oldtree, newtree):
  """linktree(oldtree, newtree) -> None
  Leave 'oldtree' undisturbed.  Recursively add links to 'newtree', until it
  looks just like 'oldtree', but with the new files patched in. 
  """
  names = os.listdir(oldtree)
  print 'Candidates: ', names
  for name in names:
    oldpath = abspath(join(oldtree, name))
    newpath = abspath(join(newtree, name))
    if not exists(newpath, 'fdl'):  # Any newpath, including an invalid link.
      if not exists(oldpath, 'fdv'):  # Don't link to a bad link.
        print '*** Error *** Not a file, directory, or valid link:\n', oldpath
        continue
      print 'New link: ', newpath
      os.symlink(oldpath, newpath)  # OK if oldpath is a link.          #1
    elif exists(newpath, 'f'):  # A file, and only a file.
      print 'As is: ', newpath 
      pass  # Leave new files as is.
    elif exists(newpath, 'd'):  # A directory, and only a directory.
      print 'Down one level \n', newpath
      linktree(oldpath, newpath)  # Recursive call
      print 'Up one level'
    else:
      print '*** Error *** Not a simple file or directory:\n', newpath  #2

if __name__ == '__main__':
  linktree(sys.argv[1], sys.argv[2])
"""  
Footnotes:
#1  We avoid the problem of symlink copying a relative link from
    oldtree to a place where it won't work in newtree, because
    oldpath is always an absolute path into the old tree.
    Question: Should we convert the oldpath to a real path to
    avoid making a link to a link?
#2  Future versions of this utility may handle newtrees that come
    with links, but for now, just simple files and directories.
Revnotes:
1.1 Initial version - used combinations of functions in os.path
    (isfile, isdir, islink, exists).
1.2 Logic clarified by using extended version of 'exists()' function.
    New option states precisely what is tested for existence.
"""
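
On the question in footnote #1, the two choices can be put side by side. This is a sketch for a current Python; the helper 'link_target' and its 'resolve' flag are hypothetical and not part of linktree.py. It only computes the path a new link would point at and creates nothing.

```python
import os

def link_target(oldpath, resolve=False):
    """Return the path a new symlink in newtree should point at.

    resolve=False: absolute path with any links left in place
                   (what linktree v 1.2 does).
    resolve=True:  links followed through to the real file,
                   so the new link is never a link to a link.
    """
    if resolve:
        return os.path.realpath(oldpath)
    return os.path.abspath(oldpath)
```

One trade-off to weigh: linking to the link means that if oldtree later repoints that link, newtree follows along, whereas linking to the real path freezes the target as it was when linktree ran.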



