Arrange files according to a text file

Ric at rdo Ric at rdo
Sat Aug 27 19:18:54 EDT 2011


Thank you so much. The code worked perfectly. 

This is what I tried, using Emile's code. The only time it picked the
wrong name from the list was when the file was named like this:

Data Mark Stone.doc

How can I fix this? I hope I am not asking too much.


import os
from difflib import SequenceMatcher as SM

path = r'D:\Files'
txt_names = []

# Read the "Last, First" names, one per line.
with open(r'D:/python/log1.txt') as f:
    for txt_name in f:
        txt_names.append(txt_name.strip())

def ignore(x):
    # Treat spaces, commas and periods as junk when matching.
    return x in ' ,.'

for filename in os.listdir(path):
    # Score the file name against every name and keep the best match.
    ratios = [SM(ignore, filename, txt_name).ratio() for txt_name in txt_names]
    best = max(ratios)
    owner = txt_names[ratios.index(best)]
    print("%s : %s" % (filename, owner))
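
One possible direction for the "Data Mark Stone.doc" case, offered only as a rough sketch: SequenceMatcher compares characters in order, so "Mark Stone" in a file name can probably line up with only one of the two words in "Stone, Mark", and a short file name leaves little else to score on. Comparing whole words regardless of order avoids that. The helper names tokens and find_owner are made up for illustration, and the sketch assumes both parts of the name appear in the file name as whole, correctly spelled words.

```python
import os
import re

def tokens(text):
    """Return the lower-cased alphabetic words in text as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def find_owner(filename, names):
    """Return the first name whose words all occur in filename, else None."""
    stem = os.path.splitext(filename)[0]  # drop the extension
    words = tokens(stem)
    for name in names:
        name_words = tokens(name)
        # Subset test: word order in the file name does not matter.
        if name_words and name_words <= words:
            return name
    return None

# Sample data taken from the thread.
names = ["Adler, Jack", "Smith, John", "Smith, Sally", "Stone, Mark"]
filenames = [
    "Data Mark Stone.doc",
    "Smith, John - 02-15-75 - business files.doc",
    "Random Data - Adler Jack - expenses.xls",
]

for filename in filenames:
    print("%s : %s" % (filename, find_owner(filename, names)))
```

Unlike the ratio approach, this returns None when no name fits, so misspelled names would still need the SequenceMatcher scoring as a fallback.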





On Sat, 27 Aug 2011 14:08:17 -0700, Emile van Sebille <emile at fenx.com>
wrote:

>On 8/27/2011 1:15 PM Ric at rdo.python.org said...
>>
>> Hello Emile,
>>
>> Thank you for the code below. I have not encountered SequenceMatcher
>> before and will have to take a closer look at it.
>>
>> My question: would it work for a text file with a list of about 25k
>> names and a directory with, say, 100 files inside?
>
>Sure.
>
>Emile
>
>
>>
>> Thank you once again.
>>
>>
>> On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebille<emile at fenx.com>
>> wrote:
>>
>>> On 8/27/2011 10:03 AM Ric at rdo.python.org said...
>>>> Hello,
>>>>
>>>> What would be the best way to accomplish this task?
>>>
>>> I'd do something like:
>>>
>>>
>>> usernames = """Adler, Jack
>>> Smith, John
>>> Smith, Sally
>>> Stone, Mark""".split('\n')
>>>
>>> filenames = """Smith, John - 02-15-75 - business files.doc
>>> Random Data - Adler Jack - expenses.xls
>>> More Data Mark Stone files list.doc""".split('\n')
>>>
>>> from difflib import SequenceMatcher as SM
>>>
>>>
>>> def ignore(x):
>>>      return x in ' ,.'
>>>
>>>
>>> for filename in filenames:
>>>      ratios = [SM(ignore,filename,username).ratio() for username in usernames]
>>>      best = max(ratios)
>>>      owner = usernames[ratios.index(best)]
>>>      print filename,":",owner
>>>
>>>
>>> Emile
>>>
>>>
>>>
>>>> I have many files in separate directories. Each file name
>>>> contains a person's name, but never in the same spot.
>>>> I need to find that name, which is listed in a large
>>>> text file in the following format: last name, comma,
>>>> then first name. Last names can be duplicated.
>>>>
>>>> Adler, Jack
>>>> Smith, John
>>>> Smith, Sally
>>>> Stone, Mark
>>>> etc.
>>>>
>>>>
>>>> The file names don't necessarily follow any standard
>>>> format.
>>>>
>>>> Smith, John - 02-15-75 - business files.doc
>>>> Random Data - Adler Jack - expenses.xls
>>>> More Data Mark Stone files list.doc
>>>> etc
>>>>
>>>> I need some way to pull the name from the file name, find it in the
>>>> text list, create a directory named after the list entry (e.g.
>>>> "Smith, John"), and move all files bearing that client's name into
>>>> that directory.
>>>
>
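
The last step in the original question (create a directory named after the list entry and move the file into it) is not covered by the code in this thread. A minimal sketch of it, assuming the owner has already been determined and that the new directories should sit alongside the files, might look like the following. The function name move_to_owner_dir and its parameters are hypothetical.

```python
import os
import shutil

def move_to_owner_dir(root, filename, owner):
    """Move root/filename into root/owner/, creating the folder if needed."""
    target_dir = os.path.join(root, owner)
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)
    destination = os.path.join(target_dir, filename)
    # Give shutil.move the full destination path, not just the folder.
    shutil.move(os.path.join(root, filename), destination)
    return destination
```

Called as move_to_owner_dir(path, filename, owner) inside the loop, this would file each document under a folder such as "Smith, John". It is worth trying on a copy of the directory first, since a wrong match moves the file to the wrong folder.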


