[Tutor] How to list/process files with identical character strings

Wed Jun 25 09:35:03 CEST 2014

On 25.06.2014 00:55, Alex Kleider wrote:
>
> I was surprised that the use of dictionaries was suggested, especially
> since we were told there were many many files.
>

The OP was talking about several thousand files, which is, of
course, too many for manual processing, but is far from an impressive
number of elements for a Python dictionary on any modern computer.
Dictionaries are fast and efficient, and their memory consumption is a
factor you have to think about only in extreme cases (and this is
definitely not one of them). What is more, your sequential approach of
always comparing a pair of elements hides the fact that you will still
have all the filenames in memory as a list (at least this is what
os.listdir would return), and the difference between that and the
proposed dictionary is not that large.
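
To put rough numbers on this, here is a small sketch that builds a list
of a few thousand made-up filenames (standing in for what os.listdir
would return) and a dictionary grouping them, then prints the size of
both containers. The naming scheme (name_partN.txt) and the grouping key
(everything before the last underscore) are only assumptions for
illustration, and the exact byte counts depend on your Python version
and platform.

```python
import sys

# Hypothetical file names standing in for the result of os.listdir().
names = ["sample{:04d}_part{}.txt".format(i, p)
         for i in range(2500) for p in (1, 2)]

# Group the names by the text before the last underscore
# (an assumed naming scheme, not necessarily the OP's).
groups = {}
for name in names:
    key = name.rsplit("_", 1)[0]
    groups.setdefault(key, []).append(name)

# sys.getsizeof reports the container itself, not the strings it refers to;
# the filename strings are shared between the list and the dictionary.
list_bytes = sys.getsizeof(names)
dict_bytes = sys.getsizeof(groups) + sum(sys.getsizeof(v) for v in groups.values())

print(len(names), "names, list container:", list_bytes, "bytes")
print(len(groups), "groups, dict plus value lists:", dict_bytes, "bytes")
```

Both figures should come out well below a megabyte, which is negligible
next to the memory of any current machine.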

What's more important, in my opinion, is that while the two approaches
may look equally potent for the given example, the dictionary provides
more flexibility, i.e., the code is easier to adjust to new problems.
Think of the aforementioned situation in which a file could come in
three parts instead of two. While your suggestion would have to be
rewritten almost from scratch, very few changes would be required to
the dictionary-based code.
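
As a minimal sketch of what I mean (not the code posted earlier in the
thread), the following groups filenames by a shared prefix. The function
name group_by_prefix, the "_" separator and the example filenames are
all hypothetical; the point is only that nothing in the code cares
whether a group ends up with two members or three.

```python
from collections import defaultdict

def group_by_prefix(filenames, sep="_"):
    """Map each shared prefix to the list of filenames that start with it.

    The prefix is taken to be everything before the last `sep`
    (an assumption about the naming scheme).
    """
    groups = defaultdict(list)
    for name in filenames:
        prefix = name.rsplit(sep, 1)[0]
        groups[prefix].append(name)
    return dict(groups)

# Hypothetical example: "a" comes in two parts, "b" in three.
files = ["a_1.txt", "a_2.txt", "b_1.txt", "b_2.txt", "b_3.txt"]
groups = group_by_prefix(files)

for prefix, members in sorted(groups.items()):
    print(prefix, len(members), members)
```

With real data you would pass in os.listdir(some_directory) instead of
the hard-coded list, and then process each group in the loop.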

Best,
Wolfgang