[Tutor] a few question about my evolving program

Clayton Kirkwood crk at godblessthe.us
Wed Aug 12 20:04:46 CEST 2015



> -----Original Message-----
> From: Tutor [mailto:tutor-bounces+crk=godblessthe.us at python.org] On
> Behalf Of Steven D'Aprano
> Sent: Wednesday, August 12, 2015 6:43 AM
> To: tutor at python.org
> Subject: Re: [Tutor] a few question about my evolving program
> 
> On Tue, Aug 11, 2015 at 08:23:00PM -0700, Clayton Kirkwood wrote:
> 
> > Question 2:
> > My current code:
> > See "Look here" below.
> 
> 
> There's an art to picking good variable names, neither too short nor too
long,
> just descriptive enough without being too verbose.
> 
> Or to put it another way...
> 
> The existence of a systematic application of knowledge or skill in
effecting a
> desired result in such a way as to be in accordance with the actual state
of
> reality is a factual statement when applied to the technique of making a
> agreeable selection between hypothetical candidate appellations for
> variables.
> 
> *wink*
> 
> I'm afraid that reading your code gives me a headache. You have
> master_directory_file_list (which is a dict, not a list) and
directory_file_list;
> also target_filename_size (a dict, not a size), current_directory_path,
> target_directory, current_file_list, file_list, and file_file_path_list,
although I
> admit that my head is spinning at the moment and that last one might be
> inaccurate :-)
> 
> I don't know how you can keep track of all those so-very-similar variable
> names, especially since some of them lie about what they are!
> (Claiming to be a list when they are actually dicts, etc.) Better names
will
> help.
> 
> It is rare that stating what *type* a variable is will be helpful. E.g.
> you normally wouldn't say "count_int" or "width_float", rather "count"
> and "width".
> 
> You need to find a balance between variable names which are too generic
> and non-descriptive, and those which are too verbose and detailed. That
will
> come with experience, and from reading other people's code to see what
> works and what doesn't work.
> 
> It will also help if you can break your code up into small, self-contained
> functions which can be digested by the reader in isolation. For example,
if I
> were writing your code, I might do something like this:
> 
> 
> def name_is_seen_before(filename, already_seen):
>     """Returns whether or not filename has already been seen.
> 
>     If the filename has not been seen before, it is added to the
>     set of already seen filenames, and False is returned; otherwise
>     True is returned.
>     """
>     if filename in already_seen:
>         return True
>     already_seen.add(filename)
>     return False
> 
> 
> Then use it something like this:
> 
> already_seen = set()
> duplicates = set()
> for name in bunch_of_names:
>     if name_is_seen_before(name, already_seen):
>          duplicates.add(name)
> 
> 
> Obviously you have to get the bunch_of_names somehow first. To do that,
> you can also simplify the process of iterating over files with a filter
that
> discards the files you don't care about:
> 
> def is_media_file(filename):
>     extensions = ['jpg', 'gif', 'png']  # etc.
>     base, ext = os.path.splitext(filename)
>     return ext[1:].lower() in extensions
> 
> def filter_media(filenames):
>     media_files = []
>     for name in filenames:
>         if is_media_file(name):
>             media_files.append(name)
>     return media_files
> 
> 
> The advantage of this is that you can write these small functions
> independently of the rest of your loop, you can test them easily and
satisfy
> yourself that they work correctly, and then forget all the internal
details of
> how they work when you go to use them:
> 
> 
> for dir_path, directories, filenames in os.walk(main_dir):
>     filenames = filter_media(filenames)
>     # ... continue processing ...
> 
> 
> And you don't need to care that variable names are duplicated in different
> functions, because each function is self-contained. There's no confusion
> between variable "name" in one function and "name" in another, so there's
> no need to try to come up with distinctive names for them both.
> 
> The point being, the *details* of how the media files are filtered are not
> important. You can "look inside" the filter_media() function when you care
> about the details, and when you don't, you can just treat it as a
black-box
> that takes a list of file names and returns only those which are media
files.
> 
> Try redesigning your code to be a bit more hierarchical in this way, and
see if
> that makes it easier for you to solve the problem.

Thanks for the advice. I had much shorter names but, while still learning
and comparing two files from perhaps different directories, it was less
confusing for me to spell it out. Be patient, I am still learning, one step
at a time. I'm sure you don't remember previous code, but I have finally
been able to get rid of parentheses in for and if statements. It goes
against everything that I used to do. I have already rewritten sections of
code as I have become more experienced. So I am getting there....
Crk

> 
> 
> 
> --
> Steve
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor



More information about the Tutor mailing list