[Tutor] How to list/process files with identical character strings

Thu Jun 26 06:47:07 CEST 2014

On 2014-06-25 00:35, Wolfgang Maier wrote:
> On 25.06.2014 00:55, Alex Kleider wrote:
>> 
>> I was surprised that the use of dictionaries was suggested, especially
>> since we were told there were many many files.
>> 
> 
> The OP was talking about several thousands of files, which is, of
> course, too many for manual processing, but is far from an impressive
> number of elements for a Python dictionary on any modern computer.
> Dictionaries are fast and efficient and their memory consumption is a
> factor you will have to think about only in extreme cases (and this is
> definitely not one of them). What is more, your sequential approach of
> always comparing a pair of elements hides the fact that you will still
> have the filenames in memory as a list (at least this is what
> os.listdir would return) and the difference between that and the
> proposed dictionary is not that huge.
> 
> What's more important in my opinion is that while the two approaches
> may look equally potent for the given example, the dictionary provides
> more flexibility, i.e., the code is easier to adjust to new problems.
> Think of the aforementioned situation in which you could also have
> three parts of a file instead of two. While your suggestion would have
> to be rewritten almost from scratch, very few changes would be
> required to the dictionary-based code.
> 
> Best,
> Wolfgang

Thanks for elucidating this.  I didn't know that "several thousand" 
would still be considered a small number.  If this is the case, then 
certainly your points are well taken.
Gratefully,
alex
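
For reference, here is a minimal sketch of the dictionary-based grouping
discussed above. The file naming scheme ("<id>_<part>.<ext>"), the helper
names group_by_key and shared_prefix, and the sample filenames are all
assumptions made for illustration; they are not taken from the original
poster's data. In real use the list of names would probably come from
os.listdir.

```python
from collections import defaultdict


def group_by_key(filenames, key):
    """Return a dict mapping key(name) to the list of names sharing that key."""
    groups = defaultdict(list)
    for name in filenames:
        groups[key(name)].append(name)
    return dict(groups)


def shared_prefix(name):
    # Assumed (hypothetical) scheme: "<id>_<part>.<ext>".
    # Everything before the last underscore is the shared string.
    return name.rsplit("_", 1)[0]


# Made-up sample names; "b" has three parts, "a" only two.
names = ["a_1.txt", "a_2.txt", "b_1.txt", "b_2.txt", "b_3.txt"]
groups = group_by_key(names, shared_prefix)

for prefix, members in sorted(groups.items()):
    print(prefix, members)
```

Note that a third part per file needs no code change here: the list stored
under each key simply grows by one entry. Adapting to a different naming
scheme should only require a different key function.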