Sorry about the confusion with the subject line. I was receiving messages in digest mode, and I copied and pasted the wrong heading into my previous message. This message uses the subject line of my initial message. I have also switched my settings for this list from digest mode to the default mode, because that is easier to manage when you are participating in threads.<div>
<br></div><div>OK, first, thanks to Emile and Bob for your help. </div><div><br></div><div>Both of you noticed that the following line of code returned a list containing one single string, rather than one string per line as would be expected from .readlines(): </div>
<div><br></div><div>open(r'/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt').readlines()</div><div><br></div><div>returns --> ['A-01,1374\rA-02,1499\rA-05,1449\rA-06,1374\rA-09, ...']</div>
<div><br></div><div>Yes, I had not noticed it but this is what I get. You guessed correctly that I am using a Mac. Just in case it might be useful, I'm also using PyDev in Eclipse (I figured since I'm learning to program, I can start using an IDE that will grow with my programming skills).</div>
<div><br></div><div>I tried your suggestion of using .split() to get around the problem but I still cannot move forward. I don't know if my implementation of your suggestion is the correct one but here's the problem I'm having. When I do the following: </div>
<div><br></div><div>-----------------</div><div>
<p class="p1"><span class="s1">fileNameCentury = open(r</span>'/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt'<span class="s1">.split(</span>'\r'<span class="s1">))<br></span>dct = {}<br>
<span class="s2">for</span> pair <span class="s2">in</span> fileNameCentury:<br> key,value = pair.split(<span class="s3">','</span>)<br> dct[key] = value<br>print<span class="s1"> dct</span></p></div><div>--------------</div>
<div><br></div><div>I get the following long error message:</div><div><br></div><div>
<p class="p1">-------------------</p><p class="p1">pydev debugger: warning: psyco not available for speedups (the debugger will still work correctly, but a bit slower)</p>
<p class="p1">pydev debugger: starting</p>
<p class="p1">Traceback (most recent call last):</p>
<p class="p2"><span class="s1"> </span><span class="s2">File "/Applications/eclipse/plugins/org.python.pydev.debug_1.6.3.2010100513/py</span>src/pydevd.py", line 1145, in <module></p>
<p class="p1"> debugger.run(setup['file'], None, None)</p>
<p class="p2"><span class="s1"> </span><span class="s2">File "/Applications/eclipse/plugins/org.python.pydev.debug_1.6.3.2010100513/py</span>src/pydevd.py", line 916, in run</p>
<p class="p1"> execfile(file, globals, locals) #execute the script</p>
<p class="p2"><span class="s1"> </span><span class="s2">File "/Volumes/DATA/Documents/workspace/GCA/src/file_name_change.py", line 2, in <module></span></p>
<p class="p1"> fileNameCentury = open(r'/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt'.split('\n'))</p>
<p class="p1"><span class="Apple-style-span" style="background-color: rgb(255, 255, 102); ">TypeError: coercing to Unicode: need string or buffer, list found</span></p></div><div>------------</div><div><br></div><div>Before reporting this problem, I did some research on the newline problems and I saw that you can set the mode in open() to 'U' to handle similar problems. So I tried the following:</div>
<div><br></div><div>
<p class="p1"><span class="s1"> >>> fileNameCentury = open(r</span>'/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt'<span class="s1">, </span>"U"<span class="s1">)</span></p>
<p class="p2"> >>> output = fileNameCentury.readlines()</p>
<p class="p2"><span class="s2"> >>> print</span> output</p><p class="p2"><br></p><p class="p2">Interestingly, this gets me closer to the solution, but with a little twist:</p><p class="p2">
</p><p class="p1">['A-01,1374\n', 'A-02,1499\n', 'A-05,1449\n', 'A-06,1374\n', 'A-09,1449\n', 'B-01,1299\n', 'B-02,1299\n', 'B-06,1349\n'...]</p><p class="p1">
That is, I now get a list, but as you can see, each string in the list ends with a newline character. This is pretty weird. Is this a general problem with Macs?</p><p></p><p class="p2"><br></p><p class="p2">
Josep M.</p></div><div><br></div><div><br></div><div><br></div><div><div>From: Emile van Sebille <<a href="mailto:emile@fenx.com">emile@fenx.com</a>></div><div>To: <a href="mailto:tutor@python.org">tutor@python.org</a></div>
<div><br></div><div>On 10/10/2010 12:35 PM Josep M. Fontana said...</div><div><snip></div><div>></div><div>> fileNameCentury = open(r</div><div>> '/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt'</div>
<div>> ).readlines()</div><div>></div><div>> Where 'FileNamesYears.txt' is the document with the following info:</div><div>></div><div>> A-01, 1278</div><div>> A-02, 1501</div><div>> ...</div>
<div>> N-09, 1384</div><div>></div><div>> I get a list of the form ['A-01,1374\rA-02,1499\rA-05,1449\rA-06,1374\rA-09,</div><div>> ...]</div><div>></div><div>> Would this be a good first step to creating a dictionary?</div>
<div><br></div><div>Hmmm... It looks like you got a single string -- is that the output from</div><div>read and not readlines? I also see you're just getting \r which is the</div><div>Mac line terminator. Are you on a Mac, or was 'FileNamesYears.txt'</div>
<div>created on a Mac? Python's readlines tries to be smart about which</div><div>line terminator to expect, so if there's a mismatch you could have</div><div>issues related to that. I would have expected you'd get something more</div>
<div>like: ['A-01,1374\r','A-02,1499\r','A-05,1449\r','A-06,1374\r','A-09, ...]</div><div><br></div><div>In any case, as you're getting a single string, you can split a string</div>
<div>into pieces, for example, print "1\r2\r3\r4\r5".split("\r"). That way</div><div>you can force creation of a list of strings following the format</div><div>"X-NN,YYYY" each of which can be further split with xxx.split(",").</div>
<div>Note as well that you can assign the results of split to variable names.</div><div> For example, ky,val = "A-01, 1278".split(",") sets ky to "A-01" and val</div><div>to " 1278" (note the leading space, which val.strip() removes). So, you should be able to create an empty dict, and for each</div>
<div>line in your file set the dict entry for that line.</div><div><br></div><div>Why don't you start there and show us what you get.</div><div><br></div><div>HTH,</div><div><br></div><div>Emile</div></div><div><br></div>
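<div>A minimal sketch of the approach Emile describes, assuming the file contents look like the output quoted above: split the text of the file (not its path) on "\r", then split each record on ",". The sample string is copied from that output, and the helper name build_dict is hypothetical, used only for illustration. With the real file, the text would come from open(path).read().</div>
<pre>
```python
# Sample data copied from the output shown in the thread: a single
# string with classic Mac "\r" line endings (the real file is longer).
raw = 'A-01,1374\rA-02,1499\rA-05,1449\rA-06,1374'


def build_dict(text):
    """Map each file name to its year, given 'NAME,YEAR' records.

    build_dict is a hypothetical helper name, not part of the thread.
    """
    dct = {}
    # Split the *contents* on "\r"; calling .split() on the path string
    # instead produces a list, which open() rejects with a TypeError.
    for record in text.split('\r'):
        record = record.strip()  # drops stray whitespace and newlines
        if not record:
            continue  # skip empty pieces, e.g. after a trailing "\r"
        key, value = record.split(',')
        dct[key.strip()] = value.strip()
    return dct


dct = build_dict(raw)
print(dct['A-01'])  # prints 1374
```
</pre>
<div>Calling strip() on each piece also takes care of the trailing "\n" that readlines() leaves on every line when the file is opened in "U" mode, so the same loop should work on that list as well.</div>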