Arrange files according to a text file

Ric at rdo Ric at rdo
Sun Aug 28 02:21:19 CEST 2011


On Sun, 28 Aug 2011 00:48:20 +0100, MRAB <python at mrabarnett.plus.com>
wrote:

>On 28/08/2011 00:18, Ric at rdo.python.org wrote:
>> Thank you so much. The code worked perfectly.
>>
>> This is what I tried using Emile's code. The only time it picked the
>> wrong name from the list was when the file was named like this:
>>
>> Data Mark Stone.doc
>>
>> How can I fix this? I hope I am not asking too much.
>>
>Have you tried the alternative word orders, "Mark Stone" as well as
>"Stone, Mark", picking whichever name has the best ratio for either?
>>

Yes, I tried, and the result was the same. I will try to work
something out. Thank you.
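
For reference, the alternative-word-order idea can be sketched roughly
as below: each filename is scored against both "Stone, Mark" and
"Mark Stone", and the better of the two ratios counts for that name.
The helper names name_variants and best_owner are made up for
illustration, the junk function is the one from Emile's code, and the
sketch assumes Python 3. It only shows the technique; which owner it
picks for a given filename still depends on how the ratios fall.

```python
from difflib import SequenceMatcher as SM


def ignore(x):
    # Same junk test as in Emile's code: skip spaces, commas and dots.
    return x in ' ,.'


def name_variants(name):
    # "Stone, Mark" -> ["Stone, Mark", "Mark Stone"]; a name without a
    # comma is returned unchanged as the only variant.
    last, sep, first = name.partition(',')
    if not sep:
        return [name]
    return [name, first.strip() + ' ' + last.strip()]


def best_owner(filename, names):
    # Score each name by its best ratio over both word orders, then
    # return the highest-scoring name.
    def score(name):
        return max(SM(ignore, filename, variant).ratio()
                   for variant in name_variants(name))
    return max(names, key=score)


if __name__ == '__main__':
    names = ['Adler, Jack', 'Smith, John', 'Smith, Sally', 'Stone, Mark']
    for filename in ['Data Mark Stone.doc',
                     'Random Data - Adler Jack - expenses.xls']:
        print(filename, ':', best_owner(filename, names))
```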
 
>> import os
>> from difflib import SequenceMatcher as SM
>>
>> path = r'D:\Files '
>>
>> # Read the "Last, First" names, one per line.
>> with open(r'D:/python/log1.txt') as f:
>>     txt_names = [line.strip() for line in f]
>>
>> def ignore(x):
>>     # Treat spaces, commas and dots as junk when matching.
>>     return x in ' ,.'
>>
>> for filename in os.listdir(path):
>>     ratios = [SM(ignore, filename, txt_name).ratio()
>>               for txt_name in txt_names]
>>     best = max(ratios)
>>     owner = txt_names[ratios.index(best)]
>>     print(filename, ":", owner)
>>
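
One possible way around the extra-word problem, which is not something
proposed in this thread: instead of a similarity ratio, split the
filename into words and accept a name only when every word of the name
appears among them. The function names words and owner_by_words are
hypothetical, and the sketch assumes Python 3 and list entries of the
form "Last, First" made of plain ASCII letters.

```python
import re


def words(text):
    # Lower-cased alphabetic words, e.g. "Data Mark Stone.doc"
    # -> {"data", "mark", "stone", "doc"}.
    return set(re.findall(r'[a-z]+', text.lower()))


def owner_by_words(filename, names):
    # Return the first "Last, First" entry whose words all occur in
    # the filename, or None if no entry matches.
    file_words = words(filename)
    for name in names:
        name_words = words(name)
        if name_words and name_words <= file_words:
            return name
    return None
```

This trades fuzziness for exactness: a misspelled name in a filename
will not match at all, and accented or hyphenated names would need a
different word pattern.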
>>
>>
>>
>>
>> On Sat, 27 Aug 2011 14:08:17 -0700, Emile van Sebille<emile at fenx.com>
>> wrote:
>>
>>> On 8/27/2011 1:15 PM Ric at rdo.python.org said...
>>>>
>>>> Hello Emile ,
>>>>
>>>> Thank you for the code below. I have not encountered SequenceMatcher
>>>> before and will have to take a closer look at it.
>>>>
>>>> My question: would it work for a text file listing about 25k names
>>>> and a directory with, say, 100 files inside?
>>>
>>> Sure.
>>>
>>> Emile
>>>
>>>
>>>>
>>>> Thank you once again.
>>>>
>>>>
>>>> On Sat, 27 Aug 2011 11:06:22 -0700, Emile van Sebille<emile at fenx.com>
>>>> wrote:
>>>>
>>>>> On 8/27/2011 10:03 AM Ric at rdo.python.org said...
>>>>>> Hello,
>>>>>>
>>>>>> What would be the best way to accomplish this task?
>>>>>
>>>>> I'd do something like:
>>>>>
>>>>>
>>>>> usernames = """Adler, Jack
>>>>> Smith, John
>>>>> Smith, Sally
>>>>> Stone, Mark""".split('\n')
>>>>>
>>>>> filenames = """Smith, John - 02-15-75 - business files.doc
>>>>> Random Data - Adler Jack - expenses.xls
>>>>> More Data Mark Stone files list.doc""".split('\n')
>>>>>
>>>>> from difflib import SequenceMatcher as SM
>>>>>
>>>>>
>>>>> def ignore(x):
>>>>>     # Treat spaces, commas and dots as junk when matching.
>>>>>     return x in ' ,.'
>>>>>
>>>>>
>>>>> for filename in filenames:
>>>>>     ratios = [SM(ignore, filename, username).ratio()
>>>>>               for username in usernames]
>>>>>     best = max(ratios)
>>>>>     owner = usernames[ratios.index(best)]
>>>>>     print(filename, ":", owner)
>>>>>
>>>>>
>>>>> Emile
>>>>>
>>>>>
>>>>>
>>>>>> I have many files in separate directories. Each file name
>>>>>> contains a person's name, but never in the same spot.
>>>>>> I need to find that name, which is listed in a large
>>>>>> text file in the following format: last name, comma,
>>>>>> then first name. Last names can be duplicated.
>>>>>>
>>>>>> Adler, Jack
>>>>>> Smith, John
>>>>>> Smith, Sally
>>>>>> Stone, Mark
>>>>>> etc.
>>>>>>
>>>>>>
>>>>>> The file names don't necessarily follow any standard
>>>>>> format.
>>>>>>
>>>>>> Smith, John - 02-15-75 - business files.doc
>>>>>> Random Data - Adler Jack - expenses.xls
>>>>>> More Data Mark Stone files list.doc
>>>>>> etc
>>>>>>
>>>>>> I need some way to pull the name from the file name, find it in the
>>>>>> text list, then create a directory based on the name on the list
>>>>>> ("Smith, John") and move all files named with the client's name into
>>>>>> that directory.
>>>>>
>>>
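
The last step of the original request (create a directory named after
the matched list entry and move the file into it) is not covered by
the code in this thread. A minimal sketch, assuming Python 3 and that
the owner has already been determined by one of the matching
approaches; move_to_owner_dir is a hypothetical helper name, and root
stands for whichever directory holds the files.

```python
import os
import shutil


def move_to_owner_dir(root, filename, owner):
    # Create <root>/<owner> if it does not exist yet and move
    # <root>/<filename> into it. Returns the new path of the file.
    target_dir = os.path.join(root, owner)
    os.makedirs(target_dir, exist_ok=True)
    destination = os.path.join(target_dir, filename)
    shutil.move(os.path.join(root, filename), destination)
    return destination
```

The sketch does not check whether a file of the same name is already
present in the owner's directory, so that case would need handling
before running it on real data.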


