[Tutor] a quick Q: how to use for loop to read a series of files with .doc end

Fri Oct 7 16:18:29 CEST 2011

On Fri, Oct 7, 2011 at 4:08 PM, lina <lina.lastname at gmail.com> wrote:

>
>
> On Fri, Oct 7, 2011 at 3:39 PM, lina <lina.lastname at gmail.com> wrote:
>
>>
>>
>> On Fri, Oct 7, 2011 at 9:50 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>>
>>> lina wrote:
>>>
>>>> May I ask a further question:
>>>>
>>>> >>> a
>>>> {'B': [4, 5, 6], 'E': {1, 2, 3}}
>>>>
>>>
>>> Why is a['B'] a list and a['E'] a set?
>>>
>>>
>>>
>>>
>>>> How can I get the value of
>>>> set(a['E'])+set(a['B'])
>>>>
>>>> I mean, get a new dict 'B+E':[5,7,9]
>>>>
>>>
>>>
>>> You are mixing different things into one question, as if I had asked:
>>>
>>> "How do I make a hard boiled egg? I mean, get a potato salad."
>>>
>>> You must ask a clear question to get a clear answer.
>>>
>>>
>>>
>>> To answer your first question, what do you mean by adding two sets? I can
>>> take the *union* of two sets (anything in either one OR the other):
>>>
>>> >>> a['E'] | set(a['B'])  # one is already a set, no need to convert
>>> {1, 2, 3, 4, 5, 6}
>>>
>>>
>>> or I can take the *intersection* of the two sets (anything in both one
>>> AND the other):
>>>
>>> >>> a['E'] & set(a['B'])
>>> set()
>>>
>>> There are no items in common between the two, so nothing in the
>>> intersection.
>>>
>>>
>>> To get the result you are asking for:
>>>
>>> [5, 7, 9]
>>>
>>> makes no sense. How do you expect to get a *list* by combining two
>>> *sets*? They are different things. Lists have order, sets do not:
>>>
>>> >>> [1, 2, 3] == [3, 2, 1]
>>> False
>>> >>> {1, 2, 3} == {3, 2, 1}
>>> True
>>>
>>>
>>> A list is a sequence of values in order, a set is like a jumble of values
>>> tossed in a bag.
>>>
>>> My *guess* is that you don't care about sets at all, you want two lists:
>>>
>>
>> Thanks, I had not realized how different lists and sets are.
>> I was not paying much attention to these concepts before.
>>
>>>
>>>
>>> [1, 2, 3]
>>> [4, 5, 6]
>>>
>>>
>>> and you want to add them item by item to get another list:
>>>
>>> [5, 7, 9]
>>>
>>>
>>> Have I guessed correctly?
>>>
>>>
>>> If so, here's the hard way to do it:
>>>
>>>
>>> first_list = [1, 2, 3]
>>> second_list = [4, 5, 6]
>>> result = []
>>> for i in range(3):
>>>    a = first_list[i]
>>>    b = second_list[i]
>>>    result.append(a + b)
>>>
>>> print(result)
>>>
>>>
>>> Walking along two lists in lock-step like that is so common that Python
>>> has a dedicated function specially for it: zip.
>>>
>>> result = []
>>> for a,b in zip(first_list, second_list):
>>>    result.append(a+b)
>>>
>>>
>>> which can be simplified further to a list comprehension:
>>>
>>> result = [a+b for a,b in zip(first_list, second_list)]
>>>
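Applied to the dictionary from the question, the same idea might look like the sketch below. It assumes both values are lists of equal length (a['E'] is written as a list here, not a set, so that order is meaningful); the key name 'B+E' is taken from the question.

```python
# Sketch: element-wise sum of two lists stored in a dict.
# Assumption: a['E'] is a list rather than a set, so its order is well defined.
a = {'B': [4, 5, 6], 'E': [1, 2, 3]}

# zip pairs up the items position by position; the comprehension adds each pair.
a['B+E'] = [x + y for x, y in zip(a['E'], a['B'])]

print(a['B+E'])  # [5, 7, 9]
```

Note that zip stops at the shorter list, so lists of unequal length would be silently truncated.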
>>>
>> Thanks. But why is the output doubled, more than I want?
>>
>> #!/bin/python3
>>
>> import os.path
>>
>> TOKENS="BE"
>>
>> LINESTOSKIP=0
>> INFILEEXT=".xpm"
>> OUTFILEEXT=".txt"
>>
>> def dofiles(topdirectory):
>>     for filename in os.listdir(topdirectory):
>>         processfile(filename)
>>
>> def processfile(infilename):
>>     results={}
>>
>>     base, ext =os.path.splitext(infilename)
>>     if ext == INFILEEXT:
>>         text = fetchonefiledata(infilename)
>>         numcolumns=len(text[0])
>>         for ch in TOKENS:
>>             results[ch] = [0]*numcolumns
>>         for line in text:
>>             line = line.strip()
>>         for col, ch in enumerate(line):
>>             if ch in TOKENS:
>>                 results[ch][col]+=1
>>         for k,v in results.items():
>>
> My mistake: the "for k,v in results.items()" line here should be removed.

>>             print(results)
>>             summary=[]
>>             for a,b in zip(results['E'],results['B']):
>>                 summary.append(a+b)
>>         writeonefiledata(base+OUTFILEEXT,summary)
>>
>>
>> def fetchonefiledata(inname):
>>     infile = open(inname)
>>     text = infile.readlines()
>>     return text[LINESTOSKIP:]
>>
>> def writeonefiledata(outname,summary):
>>
>>     outfile = open(outname,"w")
>>     for elem in summary:
>>
> Another mistake here: I shouldn't have used "for elem in summary".

>>         outfile.write(str(summary))
>>
>>
>>
>> if __name__=="__main__":
>>     dofiles(".")
>>
>>
>>  $ python3 counter-vertically-v2.py
>> {'B': [0, 0, 0, 0, 0, 0], 'E': [1, 0, 1, 0, 1, 0]}
>> {'B': [0, 0, 0, 0, 0, 0], 'E': [1, 0, 1, 0, 1, 0]}
>>
>> $ more try.txt
>> [1, 0, 1, 0, 1, 0][1, 0, 1, 0, 1, 0][1, 0, 1, 0, 1, 0][1, 0, 1, 0, 1, 0][1, 0, 1, 0, 1, 0][1, 0, 1, 0, 1, 0]
>>
>> $ more try.xpm
>> aaEbb
>> aEEbb
>> EaEbb
>> EaEbE
>>
>> Thanks,
>>
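A note on the repeated lists in try.txt: writeonefiledata loops over summary but writes str(summary) on every pass, so the whole list is written once per element (six times here). Below is a minimal sketch of a version that writes the list once. The temporary output path is only an assumption for illustration.

```python
import os
import tempfile

def writeonefiledata(outname, summary):
    # Write the whole list exactly once; 'with' also closes the file.
    with open(outname, "w") as outfile:
        outfile.write(str(summary) + "\n")

summary = [1, 0, 1, 0, 1, 0]
# Hypothetical location, used only so the sketch does not touch real files.
outpath = os.path.join(tempfile.mkdtemp(), "try.txt")
writeonefiledata(outpath, summary)

with open(outpath) as f:
    written = f.read()
print(written)
```

If one number per line is wanted instead, looping over summary and writing str(elem) on each pass would be the natural variant.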
>
> I thought a loop might be causing the doubled output, so I adjusted the
> indentation. Now it shows:
>
> $ python3 counter-vertically-v2.py
> {'B': [0, 0, 0, 0, 0, 0], 'E': [1, 0, 1, 0, 1, 0]}
> {'B': [0, 0, 0, 0, 0, 0], 'E': [1, 0, 1, 0, 1, 0]}
> [1, 0, 1, 0, 1, 0]
> Traceback (most recent call last):
>   File "counter-vertically-v2.py", line 48, in <module>
>     dofiles(".")
>   File "counter-vertically-v2.py", line 13, in dofiles
>     processfile(filename)
>   File "counter-vertically-v2.py", line 31, in processfile
>     for a,b in zip(results['E'],results['B']):
> KeyError: 'E'
>
> There are still two results, but the summary is correct. There is also a
> KeyError that I don't know how to fix.
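One likely reading of this traceback, based on the code quoted in this message: after the re-indent, the summary code runs for every file that os.listdir returns, not only for .xpm files. For any other file results is still an empty dict, so results['E'] raises KeyError. The dict is also printed twice because print(results) sits inside a loop over its two keys. Here is a small sketch of the failure and of one way to skip unwanted files. The helper name wanted and the file list are made up for illustration.

```python
import os.path

# For a file that is not .xpm, processfile never fills the dict.
results = {}
try:
    results['E']
except KeyError as err:
    print("KeyError:", err)

def wanted(filename, ext=".xpm"):
    # Hypothetical helper: True only for files with the expected extension.
    return os.path.splitext(filename)[1] == ext

# Hypothetical directory listing.
names = ["try.xpm", "counter-vertically-v2.py", "try.txt"]
selected = [n for n in names if wanted(n)]
print(selected)  # ['try.xpm']
```

Returning early from processfile when the extension does not match would have the same effect.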
>
>
> Now fixed the excessive output.

Thanks,

But in another case there seems to be a problem, because the line actually is:
"EEEEEEEEEEESEEEEEEEEEEEEEEEE~EEEEEEEEEEEE~EEEEEE~EEEEEEEEEE~EEEEEE~EEEEEEEEEEEE
EEEEEEEEEEEEEEEEEEEEE~EEEEEEEEEEEEEEEEEEEEEEEE~EEE~EEEEEEEEEEEEEEEEEEEEEEEEEEEEE
EEEEEEEEEEEEEEEEEEEEEE~EEEEEEEEEEEEEEEEEE~",

not a bare EEE or whatever; it is already wrapped in "" quotes.
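One possible way to handle such lines is to remove the surrounding quote and the trailing comma before counting, so that column 0 is the first real character. This is a sketch only: it assumes each data line is a single line in the file (the wrapping above may come from the mail client) and that the decoration is always a leading " and a trailing ", pair. The helper name clean_xpm_line and the shortened sample line are made up for illustration.

```python
def clean_xpm_line(line):
    # Hypothetical helper: drop surrounding whitespace first, then any
    # leading/trailing double quotes and commas.
    return line.strip().strip('",')

raw = '"EEESEE~EE~",\n'   # shortened, made-up sample in the same shape
cleaned = clean_xpm_line(raw)
print(cleaned)  # EEESEE~EE~
```

Without this step the leading quote occupies column 0 and shifts every count one position to the right.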

Let me think about it; advice is also welcome.

Actually, debugging is enjoyable once the frustration passes.

> #!/bin/python3
>
> import os.path
>
>
> TOKENS="BE"
> LINESTOSKIP=0
> INFILEEXT=".xpm"
> OUTFILEEXT=".txt"
>
> def dofiles(topdirectory):
>     for filename in os.listdir(topdirectory):
>         processfile(filename)
>
> def processfile(infilename):
>     results={}
>     base, ext =os.path.splitext(infilename)
>     if ext == INFILEEXT:
>         text = fetchonefiledata(infilename)
>         numcolumns=len(text[0])
>         for ch in TOKENS:
>             results[ch] = [0]*numcolumns
>         for line in text:
>             line = line.strip()
>         for col, ch in enumerate(line):
>             if ch in TOKENS:
>                 results[ch][col]+=1
>     for k,v in results.items():
>         print(results)
>     summary=[]
>     for a,b in zip(results['E'],results['B']):
>         summary.append(a+b)
>     print(summary)
>     writeonefiledata(base+OUTFILEEXT,summary)
>
> def fetchonefiledata(inname):
>     infile = open(inname)
>     text = infile.readlines()
>     return text[LINESTOSKIP:]
>
> def writeonefiledata(outname,summary):
>     outfile = open(outname,"w")
>     for elem in summary:
>         outfile.write(str(summary))
>
>
> if __name__=="__main__":
>     dofiles(".")
>
> Thanks all for your time,
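Putting the fixes discussed in this thread together, a possible cleaned-up version might look like the sketch below. It is one way to do it under a few assumptions, not the only one: count_columns is a new helper name; the column loop is nested inside the line loop so that every line is counted (in the posted code only the last line was counted, which would explain the [1, 0, 1, 0, 1, 0] output); the width is taken from the stripped lines so the trailing newline is not counted as a column; and files without the .xpm extension are skipped early.

```python
import os
import os.path

TOKENS = "BE"
LINESTOSKIP = 0
INFILEEXT = ".xpm"
OUTFILEEXT = ".txt"

def count_columns(lines):
    """Count, per column, how often each token appears."""
    rows = [line.strip() for line in lines]
    numcolumns = max((len(row) for row in rows), default=0)
    results = {ch: [0] * numcolumns for ch in TOKENS}
    for row in rows:
        # Nested inside the line loop, so every line contributes.
        for col, ch in enumerate(row):
            if ch in TOKENS:
                results[ch][col] += 1
    return results

def processfile(infilename):
    base, ext = os.path.splitext(infilename)
    if ext != INFILEEXT:
        return None  # skip other files; avoids looking up 'E' in an empty dict
    with open(infilename) as infile:
        lines = infile.readlines()[LINESTOSKIP:]
    results = count_columns(lines)
    summary = [a + b for a, b in zip(results['E'], results['B'])]
    with open(base + OUTFILEEXT, "w") as outfile:
        outfile.write(str(summary) + "\n")  # written once, not once per element
    return summary

def dofiles(topdirectory):
    for filename in os.listdir(topdirectory):
        # Join with the directory so this also works outside the current one.
        processfile(os.path.join(topdirectory, filename))

# Try the counting step on the contents of try.xpm from this thread.
sample = ["aaEbb\n", "aEEbb\n", "EaEbb\n", "EaEbE\n"]
counts = count_columns(sample)
print(counts)  # {'B': [0, 0, 0, 0, 0], 'E': [2, 1, 4, 0, 1]}
```

Calling dofiles(".") would then process every .xpm file in the current directory. Note that the lowercase b in the sample is not counted, because TOKENS only contains uppercase B and E.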
>
>
>>
>>>
>>> --
>>> Steven

-- 
Best Regards,

lina