[Tutor] a few questions about my evolving program

Clayton Kirkwood crk at godblessthe.us
Wed Aug 12 05:23:00 CEST 2015


Question 1:
What is the purpose of the * in the following definition, and what does it do
to the parameters around it? It turns out you can't actually put the asterisk
in the code when calling the function, so what does it mean?
os.stat(path, *, dir_fd=None, follow_symlinks=True)
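
For reference, a minimal sketch of what a bare * does in a Python 3 function
definition: parameters listed after it can only be passed by keyword. The
function stat_like below is a hypothetical stand-in that merely mirrors the
shape of the os.stat signature; it is not the real implementation.

```python
# Hypothetical stand-in that mirrors the shape of the os.stat signature.
def stat_like(path, *, dir_fd=None, follow_symlinks=True):
    # Everything after the bare * can only be passed by keyword.
    return (path, dir_fd, follow_symlinks)

# Passing follow_symlinks by keyword is accepted.
keyword_result = stat_like("some/path", follow_symlinks=False)

# Passing the same values positionally raises TypeError.
try:
    stat_like("some/path", None, False)
    positional_accepted = True
except TypeError:
    positional_accepted = False

print(keyword_result)
print(positional_accepted)
```

The asterisk is part of the definition only. A call such as
os.stat(path, follow_symlinks=False) simply names the argument.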

Question 2:
My current code:
See "Look here" below.

#Program to find duplicated pictures in my picture directory
#Presumably, if the file exists in a subdirectory I can remove it from the
#parent picture directory
#
#Clayton Kirkwood
#01Aug15

import os
from os.path import join, getsize, splitext

target_directory = "/users/Clayton/Pictures"      #directory we are checking
main_dir = "/users/Clayton/Pictures"        #directory at the top of the tree
master_directory_file_list = {}
target_filename_size = {}
duplicate_files = 0
#target_directory_file_list = []

#create directory lists of good filenames
for dir_path, directories, filenames in os.walk(main_dir):
#    print("filenames = ", filenames, "\n")
    good_filenames = []         #new list of good filenames for each directory
    for filename in filenames:
        ext = ''
        prefix, ext = splitext(filename)
        if ext and ext[1:].lower() in ('jpg', 'png', 'avi', 'mp4', 'mov', 'bmp'):
            good_filenames.append(filename)
    master_directory_file_list[dir_path] = good_filenames       #this is a list as the value of a dict

target_directory_file_list = master_directory_file_list[target_directory]
for target_filename in target_directory_file_list:
    stat_info = os.stat(target_directory + '/' + target_filename,
                        follow_symlinks=False)
    target_filename_size[target_filename] = stat_info.st_size
    
for current_directory_path in master_directory_file_list.keys():
    if current_directory_path == target_directory:
        continue            #skip the target directory
    #time to find duplicates in subdirectories and remove them from top directory
#        print("sub-directory:  ",current_directory_path, ":", directory_file_list[current_directory_path],":\n")
    current_file_list = master_directory_file_list[current_directory_path]
#        print(file_list)

    for current_filename in current_file_list:
#        print( "looking at file  ", filename, "  in top_directory_file_list:   ", top_directory_file_list )
#        print( "and in current_directory_path:  ",  current_directory_path)

Look here:

        if current_filename in target_directory_file_list:    #top_directory_file_list
That's it:<)) Go down to the bottom now:<))

            current_stat_info = os.stat(current_directory_path + '/' +
                                        current_filename, follow_symlinks=False)
            current_file_size = current_stat_info.st_size
            if current_file_size == target_filename_size[current_filename]:
                #the filename is a duplicate and the size is a duplicate: they are the same file
                print( "file ", current_filename, "size: ", current_file_size,
                       " found in both current_directory_path ", current_directory_path,
                       " and ", target_directory, "\n")
                duplicate_files += 1

            else:
                print( "file ", current_filename, " not a duplicate\n")

current_filename = 'IMG00060.jpg'

target_directory_file_list = ['2010-11-02 15.58.30.jpg',
'2010-11-02 15.58.45.jpg', '2010-11-25 09.42.59.jpg',
'2011-03-19 19.32.09.jpg', '2011-05-28 17.13.38.jpg',
'2011-05-28 17.26.37.jpg', '2012-02-02 20.16.46.jpg', '218.JPG',
'honda accident 001.jpg', 'honda accident 002.jpg', 'honda accident 003.jpg',
'honda accident 004.jpg', 'honda accident 005.jpg', 'honda accident 006.jpg',
'honda accident 007.jpg', 'Image (1).jpg', 'Image.jpg', 'IMG.jpg',
'IMG00003.jpg', 'IMG00040.jpg', 'IMG00058.jpg', 'IMG_0003.jpg',
'IMG_0004.jpg', 'IMG_0005.jpg', 'IMG_0007.jpg', 'IMG_0008.jpg',
'IMG_0009.jpg', 'IMG_0010.jpg', 'Mak diploma handshake.jpg',
'New Picture.bmp', 'temp 121.jpg', 'temp 122.jpg', 'temp 220.jpg',
'temp 320.jpg', 'temp 321.jpg', 'temp 322.jpg', 'temp 323.jpg',
'temp 324.jpg', 'temp 325.jpg', 'temp 326.jpg', 'temp 327.jpg',
'temp 328.jpg', 'temp 329.jpg', 'temp 330.jpg', 'temp 331.jpg',
'temp 332.jpg', 'temp 333.jpg', 'temp 334.jpg', 'temp 335.jpg',
'temp 336.jpg', 'temp 337.jpg', 'temp 338.jpg', 'temp 339.jpg',
'temp 340.jpg', 'temp 341.jpg', 'temp 342.jpg', 'temp 343.jpg']

As you can see, current_filename does not exist in
target_directory_file_list. Yet I fall through to the next line. Yes, the
indents are all fine: I wouldn't have gotten to running code otherwise. I
turned my head upside down and still couldn't see why it doesn't work or what
I am missing.
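
One way to narrow this down is to run the membership test on its own, outside
the loop. The sketch below uses a shortened subset of the list printed above
(sample_list is a hypothetical name) and only shows that `in` on a list does
an exact, case-sensitive string comparison; it does not by itself explain the
behavior in the full program.

```python
current_filename = 'IMG00060.jpg'

# Shortened subset of the list printed above, for illustration only.
sample_list = ['IMG00003.jpg', 'IMG00040.jpg', 'IMG00058.jpg']

# Membership on a list compares each element for equality with the string.
is_member = current_filename in sample_list

# repr() makes stray whitespace or case differences visible.
print(repr(current_filename), is_member)
```

Printing repr(current_filename) and the result of the test immediately before
the if statement in the real loop would show which values are actually being
compared at that point.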

TIA,

Clayton



