[Tutor] improvements on a renaming script

Cameron Simpson cs at zip.com.au
Mon Mar 10 02:08:00 CET 2014


On 09Mar2014 15:22, street.sweeper at mailworks.org <street.sweeper at mailworks.org> wrote:
> A bit of background, I had some slides scanned and a 3-character
> slice of the file name indicates what roll of film it was.
> This is recorded in a tab-separated file called fileNames.tab.
> Its content looks something like:
> 
> p01     200511_autumn_leaves
> p02     200603_apple_plum_cherry_blossoms
> 
> The original file names looked like:
> 
> 1p01_abc_0001.jpg
> 1p02_abc_0005.jpg
> 
> The renamed files are:
> 
> 200511_autumn_leaves_-_001.jpeg
> 200603_apple_plum_cherry_blossoms_-_005.jpeg
> 
> The script below works and has done what I wanted, but I have a
> few questions:
> 
> - In the get_long_names() function, the for/if thing is reading
> the whole fileNames.tab file every time, isn't it?  In reality,
> the file was only a few dozen lines long, so I suppose it doesn't
> matter, but is there a better way to do this?

Read it once, into a dictionary.

I'd rename "get_long_name" to "get_long_names", and have it create
and return a dictionary with keys being the glnAbbrev value and
values being the long name.

So start with:

  long_names = {}

Fill it out by saving the abbrev and long_name for each row, and return
"long_names" at the end of the function.

Then call it once at the start of your program, and then just look things up
directly in the dictionary instead of calling "get_long_name()".
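
As a rough sketch of that idea (assuming Python 3, and that each line of
fileNames.tab really is "abbrev<TAB>long name"; the "path" parameter is my
own invention, not something from your script):

```python
import csv

def get_long_names(path):
    """Read the tab-separated file once; return {abbrev: long_name}."""
    long_names = {}
    with open(path, newline="") as fp:
        for row in csv.reader(fp, delimiter="\t"):
            # Skip blank or malformed lines rather than crash on them.
            if len(row) >= 2:
                abbrev, long_name = row[0], row[1]
                long_names[abbrev] = long_name
    return long_names
```

After that, long_names["p01"] is a plain dictionary lookup and the file is
never reread.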

> - Really, I wanted to create a new sequence number at the end of
> each file name, but I thought this would be difficult.  In order
> for it to count from 01 to whatever the last file is per set p01,
> p02, etc, it would have to be aware of the set name and how many
> files are in it.  So I settled for getting the last 3 digits of
> the original file name using splitext().  The strings were unique,
> so it worked out.  However, I can see this being useful in other
> places, so I was wondering if there is a good way to do this.
> Is there a term or phrase I can search on?

Nothing specific comes to mind.

When I do this kind of thing I tend to make an opening pass over
os.listdir() pulling out all the names and noting whatever is
relevant. In your case you might maintain a dictionary of the "base"
filename key (i.e. the filename without the trailing sequence number)
and the maximum sequence number seen for that file.

Then I'd have a short function which was passed this dictionary and
a "base" filename, and returned a new sequence number, being the
first sequence number after the current maximum from the dictionary
for which the constructed new filename did not exist.

Then update the number in the dictionary, probably inside that function.
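
A minimal sketch of those two steps, assuming the files follow your
"<base>_-_<number>.jpeg" pattern; the names scan_max_seq and next_seq and
the 3-digit padding are my own choices for illustration:

```python
import os
import re

def scan_max_seq(dirpath):
    """One pass over the directory: map each base name to the
    highest sequence number seen for it."""
    max_seq = {}
    for name in os.listdir(dirpath):
        m = re.match(r"^(.*)_-_(\d+)\.jpeg$", name)
        if m:
            base, seq = m.group(1), int(m.group(2))
            max_seq[base] = max(max_seq.get(base, 0), seq)
    return max_seq

def next_seq(max_seq, base, dirpath):
    """Return the first sequence number after the recorded maximum
    whose filename does not exist yet, and record it."""
    seq = max_seq.get(base, 0) + 1
    while os.path.exists(os.path.join(dirpath, "%s_-_%03d.jpeg" % (base, seq))):
        seq += 1
    max_seq[base] = seq
    return seq
```

You would call scan_max_seq() once on the output directory, then
next_seq() for each file as you rename it.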

> - I'd be interested to read any other comments on the code.
> I'm new to python and I have only a bit of computer science study,
> quite some time ago.

My personal habit is to put the main program logic at the top.

I know you can't just move it because you rely on functions which
must be defined first.

However, you can do this:

    def main(argv):
        ... main program logic here ...
        return <your-exit-status,-usually-0>

and put:

    sys.exit(main(sys.argv))

at the bottom of the program.

This has the advantage of having the main program logic at the top
where it is easy to find.

> # rename
> for f in os.listdir(indir):
> 	if f.endswith(".jpg"):
> 		os.rename(
> 			os.path.join(indir,f),os.path.join(
> 				outdir,
> 				get_long_name(get_slice(f))+"_-_"+get_bn_seq(f)+".jpeg")
> 				)

I'd precede the rename() by computing:

  oldname = os.path.join(indir, f)
  newname = os.path.join(outdir,
                         get_long_name(get_slice(f))
                         + "_-_" + get_bn_seq(f) + ".jpeg")

and just pass oldname, newname to os.rename().
Easier to read and debug.
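
Putting that together with a dictionary lookup might look something like
the sketch below. Your get_slice() and get_bn_seq() weren't quoted, so the
versions here are hypothetical stand-ins guessed from your example
filenames, and rename_pair()/rename_all() are names I made up:

```python
import os

def get_slice(f):
    # Hypothetical stand-in: assumes the roll code is characters 1-3,
    # e.g. "p01" in "1p01_abc_0001.jpg".
    return f[1:4]

def get_bn_seq(f):
    # Hypothetical stand-in: last 3 characters of the name without
    # its extension, e.g. "001" from "1p01_abc_0001.jpg".
    return os.path.splitext(f)[0][-3:]

def rename_pair(f, indir, outdir, long_names):
    """Compute (oldname, newname) for one file without touching the disk."""
    oldname = os.path.join(indir, f)
    newname = os.path.join(
        outdir,
        long_names[get_slice(f)] + "_-_" + get_bn_seq(f) + ".jpeg",
    )
    return oldname, newname

def rename_all(indir, outdir, long_names):
    """Rename every .jpg in indir into outdir using the computed names."""
    for f in os.listdir(indir):
        if f.endswith(".jpg"):
            oldname, newname = rename_pair(f, indir, outdir, long_names)
            os.rename(oldname, newname)
```

Keeping the name computation separate from os.rename() means you can
print the pairs first and check them before anything is moved.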

Cheers,
-- 
Cameron Simpson <cs at zip.com.au>

Patriotism means to stand by the country. It does not mean to stand by the
President.	- Theodore Roosevelt

