[Tutor] how to extract data only after a certain condition is met

Emile van Sebille emile at fenx.com
Sun Oct 10 22:54:16 CEST 2010


On 10/10/2010 12:35 PM Josep M. Fontana said...
<snip>
> OK. Let's start with -b- . My first problem is that I don't really know how
> to go about building a dictionary from the file with the comma separated
> values. I've discovered that if I use a file method called 'readlines' I can
> create a list whose elements would be each of the lines contained in the
> document with all the codes followed by comma followed by the year. Thus if
> I do:
>
> fileNameCentury = open(r
> '/Volumes/DATA/Documents/workspace/GCA/CORPUS_TEXT_LATIN_1/FileNamesYears.txt'
> ).readlines()
>
> Where 'FileNamesYears.txt' is the document with the following info:
>
> A-01, 1278
> A-02, 1501
> ...
> N-09, 1384
>
> I get a list of the form ['A-01,1374\rA-02,1499\rA-05,1449\rA-06,1374\rA-09,
> ...]
>
> Would this be a good first step to creating a dictionary?

Hmmm... It looks like you got a single string -- is that the output from 
read and not readlines?  I also see you're just getting \r, which is the 
classic Mac line terminator.  Are you on a Mac, or was 'FileNamesYears.txt' 
created on a Mac?  Python's readlines tries to be smart about which 
line terminator to expect, so if there's a mismatch you could have 
issues related to that.  I would have expected you'd get something more 
like: ['A-01,1374\r','A-02,1499\r','A-05,1449\r','A-06,1374\r','A-09, ...]
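
If the terminators do turn out to be the problem, one way to sidestep them 
is str.splitlines, which accepts \r, \n and \r\n alike.  A rough sketch, 
where the sample string and the names raw and lines are just made up for 
illustration and stand in for whatever your file really contains:

```python
# Made-up sample in the shape shown in the question: records separated
# by a bare carriage return ("\r"), with no "\n" anywhere.
raw = "A-01,1374\rA-02,1499\rA-05,1449"

# str.splitlines() treats "\r", "\n" and "\r\n" all as line boundaries
# and drops the terminators from the pieces it returns.
lines = raw.splitlines()

print(lines)
```

That should give you one list element per record, without the trailing \r 
on each one.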

In any case, as you're getting a single string, you can split a string 
into pieces, for example, print "1\r2\r3\r4\r5".split("\r").  That way 
you can force creation of a list of strings following the format 
"X-NN,YYYY", each of which can be further split with xxx.split(","). 
Note as well that you can assign the results of split to variable names. 
  For example, ky,val = "A-01, 1278".split(",") sets ky to "A-01" and val 
to " 1278" (note the leading space, which val.strip() removes).  So, you 
should be able to create an empty dict, and for each line in your file 
set the dict entry for that line.
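
Putting those pieces together might look roughly like the sketch below. 
It assumes every non-empty line has exactly one comma and that the year 
is a plain integer; the sample string and the names raw and years_by_code 
are invented for the example, not taken from your file:

```python
# Made-up sample using the "code, year" layout from the question, with
# "\r" between records as in the output that was posted.
raw = "A-01, 1278\rA-02, 1501\rN-09, 1384"

years_by_code = {}                      # start with an empty dict
for line in raw.split("\r"):
    line = line.strip()
    if not line:                        # skip blanks, e.g. after a trailing "\r"
        continue
    code, year = line.split(",")        # assumes exactly one comma per line
    # strip() removes the space that follows the comma in "A-01, 1278"
    years_by_code[code.strip()] = int(year.strip())

print(years_by_code["A-02"])            # prints 1501
```

Converting the year with int is optional; leave it as a string if that is 
all you need for the lookup.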

Why don't you start there and show us what you get.

HTH,

Emile


