[Tutor] files - strings - lists

Kent Johnson kent37 at tds.net
Wed Nov 23 17:03:50 CET 2005


Andrzej Kolinski wrote:
> 
> I want to create a program that uses data from text files, makes 
> appropriate calculations and produces report. First I need to find out 
> what is the right way to retrieve appropriate information from an input 
> file. This is a typical format of the input file:
> 
> 1 Polonijna Liga Mistrzow        |from the heading portion
> 26 wrzesnia 2005                        |only
>  6 12 *6* *4* 1                                |'6' and '4' will be needed
>  0 1 0
> *Bohossian* - *Kolinski*                |all names and
> 1                                        |all scores
>       1.000 9 *13* 19                |(3rd column -
>       2.000 2 *4* 16                |'13', '4', '8', '6'
>       1.000 10 *8* 17                |will be needed
>       0.000 8 *6* 17                |
> *Szadkowska* - *Szczurek                *|
> 2                                        |same here
>       0.000 11 *16* 20                |
>       3.000 1 *-4* 14                |
>       3.500 3 *-7* 13
>       2.500 10 *13* 19          
> ..................
> 
>  1 1                                        |skip the rest
>  1 1 1                                |(at least for now)

It's pretty simple to make an ad-hoc reader for this data. A couple of things you need:

- You can get individual lines from a file by treating it as an iterator. Instead of the usual
  f = open('data.txt')
  for line in f:
you can call f.next() to get a single line. This makes it easy to skip lines or process lines differently.

The call to f.next() will raise StopIteration when there are no more lines.

- You can use split() to break a line into fields, then subscripting to pull out the data you want:
 >>> line = '      1.000 9 13 19'
 >>> line.split()
['1.000', '9', '13', '19']
 >>> line.split()[2]
'13'
 >>> int(line.split()[2])
13
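
Both tools can be tried together without a file on disk. The sketch below is written for Python 3, where the built-in next(f) takes the place of f.next(). io.StringIO stands in for an open file, and the two header lines in the sample text are made up for illustration.

```python
import io

# io.StringIO stands in for open('data.txt'); the header text is illustrative only
f = io.StringIO("header one\nheader two\n      1.000 9 13 19\n")

next(f)                      # skip the first header line
next(f)                      # skip the second header line

fields = next(f).split()     # split() with no argument ignores the leading spaces
third = int(fields[2])       # third column, converted to an integer

try:
    next(f)                  # nothing is left to read
    exhausted = False
except StopIteration:
    exhausted = True

print(fields, third, exhausted)
```

Because each call consumes exactly one line, skipping a line is just a call whose result you ignore.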


With these tools the solution is pretty simple. I pull the data from a string, but the same code will work with a file as well. I save the results in a dictionary which maps each name to a list of scores.

data = '''1 Polonijna Liga Mistrzow
26 wrzesnia 2005
 6 12 6 4 1
 0 1 0
Bohossian - Kolinski
1 
      1.000 9 13 19
      2.000 2 4 16
      1.000 10 8 17
      0.000 8 6 17
Szadkowska - Szczurek
2
      0.000 11 16 20
      3.000 1 -4 14
      3.500 3 -7 13
      2.500 10 13 19          
'''.split('\n')

#lines = open('data.txt')   # to get the data from a real file

lines = iter(data)  # getting data from a string, you don't need this when reading a file

lines.next()    # skip two headers
lines.next()

header = lines.next().split()
six = int(header[2])
four = int(header[3])
print six, four

lines.next()

allScores = {} # accumulate scores into a dictionary whose key is the name

# Now we can process the names and scores in a loop
try:    # you don't say how you know the end of the names, I just run to the end of data
    while True:
        name = lines.next().strip()

        lines.next()    # skip line after name
        scores = [ int(lines.next().split()[2]) for i in range(4) ]

        allScores[name] = scores
    
except StopIteration: # no more lines
    pass
    
for name, scores in allScores.items():
    print name, scores
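
For anyone running this under Python 3, here is a rough equivalent wrapped in a function. It is only a sketch: read_scores and scores_per_pair are names invented for this example, and the default of four score lines per pair is an assumption taken from the sample data, not something the file format is known to guarantee.

```python
def read_scores(lines, scores_per_pair=4):
    """Return (six, four, all_scores) from an iterable of report lines.

    read_scores and scores_per_pair are hypothetical names; the default
    of 4 score lines per pair matches the sample data only.
    """
    lines = iter(lines)
    next(lines)                      # skip two headers
    next(lines)
    header = next(lines).split()
    six, four = int(header[2]), int(header[3])
    next(lines)                      # skip the '0 1 0' line

    all_scores = {}                  # maps each name line to its list of scores
    while True:
        try:
            name = next(lines).strip()
            next(lines)              # skip the line after the name
            scores = [int(next(lines).split()[2])
                      for _ in range(scores_per_pair)]
        except StopIteration:        # ran out of lines, stop reading
            break
        all_scores[name] = scores
    return six, four, all_scores


sample = """1 Polonijna Liga Mistrzow
26 wrzesnia 2005
 6 12 6 4 1
 0 1 0
Bohossian - Kolinski
1
      1.000 9 13 19
      2.000 2 4 16
      1.000 10 8 17
      0.000 8 6 17
Szadkowska - Szczurek
2
      0.000 11 16 20
      3.000 1 -4 14
      3.500 3 -7 13
      2.500 10 13 19
"""

six, four, all_scores = read_scores(sample.splitlines())
print(six, four)
for name, scores in all_scores.items():
    print(name, scores)
```

An open file can be passed in place of sample.splitlines(), since a file object is also an iterable of lines. As in the code above, a record that is cut off by the end of the data is dropped rather than stored.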


