[Tutor] Nested for loops, possibly?

DaveA davea at davea.name
Thu Feb 5 15:59:34 CET 2015



On February 5, 2015 8:27:29 AM EST, Bob Williams <linux at barrowhillfarm.org.uk> wrote:
>Hi,
>
>My script is running under Python 3.4.1 on a 64bit openSUSE linux
>system. It is a backup script making calls to rsync and btrfs-tools,
>and backing up several different paths. Here is the script, my question
>follows below:
>
>**Code**
>import datetime
>import glob
>import os, os.path
>import subprocess
>import sys
>
>if not os.getuid() == 0:
>    print("\n*** This script must be run as root. ***\n")
>    sys.exit()
>
>mnt_path = "/home/bob/A3"
>    
>subprocess.call(["mount", "LABEL=backup", mnt_path])
>if not os.path.ismount(mnt_path):
>    print("\nBackup drive is not mounted\nCheck if it is attached.\n")
>    sys.exit()
>else:
>    print("\nBackup drive mounted at", mnt_path, "\n")
>
>src_path = "/home/bob"
>
>today = datetime.datetime.now()
>fname = today.strftime("%y-%m-%d_%H-%M")
>
>doc_retain = datetime.timedelta(days=90)
>pic_retain = datetime.timedelta(days=90)
>misc_retain = datetime.timedelta(days=90)
>etc_retain = datetime.timedelta(days=30)
>mus_retain = datetime.timedelta(days=30)
>web_retain = datetime.timedelta(days=30)
>rep_retain = datetime.timedelta(days=30)
>
>doc_srcpath = os.path.join(src_path, "Documents")
>pic_srcpath = os.path.join(src_path, "Pictures")
>misc_srcpath = src_path
>etc_srcpath = "/etc"
>music_srcpath = os.path.join(src_path, "music")
>www_srcpath = "/srv/www"
>repo_srcpath = os.path.join(src_path, "download")
>
Three things bother me about this portion of code: too much code at top level, too many separate variables, and too many globals. The last point doesn't matter much, since they're constants.

I'd put these values into a class, and make a list of instances. If you're not yet comfortable with that, it would also be possible to make a list per "paragraph", and use zip to combine them, as already suggested.

The class would hold retain, srcpath, syncpath, snappath, etc. And your list would have 7 instances of that class currently, corresponding to your doc, pic, misc, ...

That list would be THE global, replacing these 35 or so. It would be populated something like:

def initialize():
    worklist = []
    worklist.append(Job(90, src_path, "Documents", "documents", "docsnaps"))
    worklist.append(Job(90, src_path, "Pictures", ...
    ....
    return worklist

Now all the joins and globs are done in the Job initializer, just once.
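As a rough illustration of what such a Job class might look like, here is a minimal sketch. The argument order follows the Job(90, src_path, "Documents", "documents", "docsnaps") call above; the parameter names, the mnt_path and fname keyword arguments, and the treatment of an empty src_name are assumptions made so the example stands alone, not something taken from the original script.

```python
import datetime
import glob
import os.path


class Job:
    """One backup job: what to copy, where to sync it, where its snapshots live."""

    def __init__(self, retain_days, src_root, src_name, sync_name, snap_name,
                 mnt_path="/home/bob/A3", fname=None):
        # mnt_path and fname are parameters here only to keep the sketch
        # self-contained; in the original script they are module-level names.
        if fname is None:
            fname = datetime.datetime.now().strftime("%y-%m-%d_%H-%M")
        self.retain = datetime.timedelta(days=retain_days)
        # Assumption: an empty src_name means "back up src_root itself".
        self.srcpath = os.path.join(src_root, src_name) if src_name else src_root
        self.syncpath = os.path.join(mnt_path, sync_name)
        self.snappath = os.path.join(mnt_path, snap_name, fname)
        # Existing snapshots for this job, looked up once at construction time.
        self.snaplist = glob.glob(os.path.join(mnt_path, snap_name, "*"))


job = Job(90, "/home/bob", "Documents", "documents", "docsnaps",
          fname="15-02-05_08-27")
print(job.srcpath, job.syncpath, job.snappath)
```

Each of the five per-job values from the original script becomes one attribute, so adding an eighth backup target would mean adding one line to the list rather than five new variables.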

.....th.join(mnt_path, "documents")
>pic_syncpath = os.path.join(mnt_path, "pictures")
>misc_syncpath = os.path.join(mnt_path, "miscellaneous")
>etc_syncpath = os.path.join(mnt_path, "etc")
>music_syncpath = os.path.join(mnt_path, "music")
>www_syncpath = os.path.join(mnt_path, "www")
>repo_syncpath = os.path.join(mnt_path, "repo")
>
>doc_snappath = os.path.join(mnt_path, "docsnaps", fname)
>pic_snappath = os.path.join(mnt_path, "picsnaps", fname)
>misc_snappath = os.path.join(mnt_path, "miscsnaps", fname)
>etc_snappath = os.path.join(mnt_path, "etcsnaps", fname)
>music_snappath = os.path.join(mnt_path, "musicsnaps", fname)
>www_snappath = os.path.join(mnt_path, "wwwsnaps", fname)
>repo_snappath = os.path.join(mnt_path, "reposnaps", fname)
>
>doc_snaplist = glob.glob(mnt_path + "/docsnaps/*")
>pic_snaplist = glob.glob(mnt_path + "/picsnaps/*")
>misc_snaplist = glob.glob(mnt_path + "/miscsnaps/*")
>etc_snaplist = glob.glob(mnt_path + "/etcsnaps/*")
>music_snaplist = glob.glob(mnt_path + "/musicsnaps/*")
>www_snaplist = glob.glob(mnt_path + "/wwwsnaps/*")
>repo_snaplist = glob.glob(mnt_path + "/reposnaps/*")
>
>def do_sync(source, dest):
>    subprocess.call(['rsync', '-av', '--safe-links', '--delete-excluded',
>                     '-F', source, dest])
>    print("\n")
>
>def create_snaps(newsnap, snapdest):
>    subprocess.call(['btrfs', 'subvolume', 'snapshot', newsnap, snapdest])
>    print("\n")
>
>def expire_snaps(snaplist, today, expiry_interval):
>    x = 0
>    print("\nDeleting snapshots older than", str(expiry_interval)[:7],
>          "...")
>    for i in range(0, len(snaplist)):
>        snap_date = datetime.datetime.strptime(snaplist[i][-14:],
>                                               "%y-%m-%d_%H-%M")
>        if today - snap_date >= expiry_interval:
>            subprocess.call(['btrfs', 'subvolume', 'delete', snaplist[i]])
>            x += 1
>    if x == 0:
>        print("... No snapshots older than", str(expiry_interval)[:7],
>              "found - nothing to do.")
>    else:
>        print("...", x, "snapshot(s) deleted.")
>    print("\n")
>
>def main():
>    print("Backing up ", src_path, "/Documents\n", sep='')
>    do_sync(doc_srcpath, doc_syncpath)
>    create_snaps(doc_syncpath, doc_snappath)
>    print("Documents backup completed.")
>    expire_snaps(doc_snaplist, today, doc_retain)

At this point main becomes something like 

    jobs = initialize()
    for job in jobs:
        do_sync(job.srcpath, job.syncpath)
        create_snaps(job.syncpath, job.snappath)
        expire_snaps(job.snaplist, ...
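To make the shape of that loop concrete, here is a small dry-run sketch. The namedtuple is only a stand-in for the Job class, run_jobs is a hypothetical helper name, and the three worker functions are passed in as arguments so the example can record calls instead of invoking rsync or btrfs. The expire_snaps arguments follow the (snaplist, today, expiry_interval) signature from the quoted script.

```python
import datetime
from collections import namedtuple

# Minimal stand-in for the Job class described in this message.
Job = namedtuple("Job", "retain srcpath syncpath snappath snaplist")


def run_jobs(jobs, today, do_sync, create_snaps, expire_snaps):
    """Run the same three steps for every job, in list order."""
    for job in jobs:
        do_sync(job.srcpath, job.syncpath)
        create_snaps(job.syncpath, job.snappath)
        expire_snaps(job.snaplist, today, job.retain)


# Dry run: record each call rather than touching the filesystem.
calls = []
jobs = [
    Job(datetime.timedelta(days=90), "/home/bob/Documents",
        "/home/bob/A3/documents", "/home/bob/A3/docsnaps/15-02-05_08-27", []),
    Job(datetime.timedelta(days=30), "/etc",
        "/home/bob/A3/etc", "/home/bob/A3/etcsnaps/15-02-05_08-27", []),
]
run_jobs(
    jobs,
    datetime.datetime(2015, 2, 5, 8, 27),
    do_sync=lambda src, dst: calls.append(("sync", src, dst)),
    create_snaps=lambda new, dest: calls.append(("snap", new, dest)),
    expire_snaps=lambda snaps, today, keep: calls.append(("expire", keep.days)),
)
for call in calls:
    print(call)
```

Because every job carries its own matching paths and retention period, there is no index bookkeeping and no way to pair the wrong source with the wrong destination.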
>
>    print("Backing up ", src_path, "/Pictures\n", sep='')
>    do_sync(pic_srcpath, pic_syncpath)
>    create_snaps(pic_syncpath, pic_snappath)
>    print("Pictures backup completed.")
>    expire_snaps(pic_snaplist, today, pic_retain)
>
>    print("Backing up Miscellaneous files\n")      
>    do_sync(misc_srcpath, misc_syncpath)
>    create_snaps(misc_syncpath, misc_snappath)
>    print("Miscellaneous backup completed.")
>    expire_snaps(misc_snaplist, today, misc_retain)
>
>    print("Backing up /etc\n")
>    do_sync(etc_srcpath, etc_syncpath)
>    create_snaps(etc_syncpath, etc_snappath)
>    print("Backup of /etc completed.")
>    expire_snaps(etc_snaplist, today, etc_retain)
>
>    print("Backing up ", src_path, "/music\n", sep='')
>    do_sync(music_srcpath, music_syncpath)
>    create_snaps(music_syncpath, music_snappath)
>    print("Music backup completed.")
>    expire_snaps(music_snaplist, today, mus_retain)
>
>    print("Backing up Web server\n")
>    do_sync(www_srcpath, www_syncpath)
>    create_snaps(www_syncpath, www_snappath)
>    print("Web server backup completed.")
>    expire_snaps(www_snaplist, today, web_retain)
>
>    print("Backing up Download repository\n")
>    do_sync(repo_srcpath, repo_syncpath)
>    create_snaps(repo_syncpath, repo_snappath)
>    print("Download repository backup completed.")
>    expire_snaps(repo_snaplist, today, rep_retain)
>
>    print("\nAll backups completed.")
>    print("\nUnmounting backup drive.\n")
>    subprocess.call(['umount', mnt_path])
>    print("\n>>> Please power down the Quad external drive enclosure <<<\n")
>
>if __name__ == "__main__":
>    main()
>**/Code**
>
>I would like to reduce all those repeated calls to do_sync() in main(),
>for example, to one by putting the *_srcpath and *_*syncpath variables
>into lists (eg. source_list and sync_list) and using a for loop to get
>the first item out of each list, then the second item, etc. Something
>like:
>
>for i in range(0, len(source_list)):
>    for j in range(0, len(sync_list)):
>        do_sync(source_list[i], sync_list[j])
>
>but this will get all the values of sync_list[j] for each value of
>source_list[i], which is not what I want.

If you're sure all the lists are the same length, you can use zip, which pairs up the first items, then the second items, and so on. But I'd recommend a list of objects.
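For the zip approach, a minimal sketch might look like the following. The list names come from the question; the three sample paths are shortened from the script, and the pairs list merely stands in for the real do_sync call.

```python
source_list = ["/home/bob/Documents", "/home/bob/Pictures", "/etc"]
sync_list = ["/home/bob/A3/documents", "/home/bob/A3/pictures", "/home/bob/A3/etc"]

# zip() silently stops at the shorter list, so check the lengths first.
assert len(source_list) == len(sync_list)

pairs = []
for source, dest in zip(source_list, sync_list):
    # The real script would call do_sync(source, dest) here.
    pairs.append((source, dest))

print(len(pairs))  # one pair per list position
```

The nested loops in the question would instead visit every combination of source and destination, nine calls for these three-item lists rather than three.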

>
>I hope this is clear enough to see my problem? I realise that the print
>statements will need some work, I'm just trying to get the
>functionality working.
>
>TIA
>
>Bob

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

