[Tutor] creating variable names from string slices

Jeff Shannon jeff@ccvcorp.com
Tue Apr 1 13:24:01 2003

noc wrote:

>I'm telnetting to a mainframe, and capturing a screendump of data that's
>neatly column formatted:
>What I want to do is define a function that gets passed each line, and
>returns a dictionary with the first element as the name, and the subsequent
>elements as the key:value pairs:

Something like this, perhaps?

First, we get the imported data into a list of lines:

 >>> import pprint
 >>> captured = """Betty     Blonde    Student
... Veronica  Brunette  Student
... Ginger    Redhead   Castaway
... Maryann   Brunette  Castaway"""
 >>> lines = captured.split('\n')
 >>> pprint.pprint(lines)
['Betty     Blonde    Student',
 'Veronica  Brunette  Student',
 'Ginger    Redhead   Castaway',
 'Maryann   Brunette  Castaway']

Now, we write a function that deals with a single line.  Since we know 
the structure of the line (a name, followed by two more columns, all 
separated by whitespace), we can take advantage of that fact.

 >>> def processline(line):
...     fields = ['haircolor', 'role']
...     line = line.split()
...     fielddata = zip(fields, line[1:])
...     return line[0], dict(fielddata)

This function returns a tuple of a name (the first column of input) and 
a dictionary that pairs each column name with the value from the 
corresponding remaining column.  (If we wanted to get fancy, we might build 
the list of fields from the first line of our input, reading column 
headers, but for now I'm just using a statically defined list of fields.)
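
For the "fancy" variant, here is a rough sketch of how the field list might be read from a header row.  It assumes the capture actually starts with a line of column headers, which the sample data above does not have; the header text and the make_processor name are made up for illustration.  It is written in Python 3 syntax, unlike the session above.

```python
def make_processor(header_line):
    """Return a line parser whose field names come from a header row."""
    # Skip the first header (the name column); lowercase the rest for use as keys.
    fields = [h.lower() for h in header_line.split()[1:]]

    def processline(line):
        columns = line.split()
        return columns[0], dict(zip(fields, columns[1:]))

    return processline


# Hypothetical capture whose first row names the columns.
captured = """Name      Haircolor Role
Betty     Blonde    Student
Ginger    Redhead   Castaway"""

lines = captured.split('\n')
processline = make_processor(lines[0])
data = dict(processline(line) for line in lines[1:])
print(data['Ginger'])   # {'haircolor': 'Redhead', 'role': 'Castaway'}
```

Note that zip() stops at the shorter of its inputs, so a line with extra or missing columns is silently truncated rather than rejected.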

Now all we have to do is run each line through this function, and store 
the results.  

 >>> data = {}
 >>> for line in lines:
...     name, info = processline(line)
...     data[name] = info
 >>> pprint.pprint(data)
{'Betty': {'haircolor': 'Blonde', 'role': 'Student'},
 'Ginger': {'haircolor': 'Redhead', 'role': 'Castaway'},
 'Maryann': {'haircolor': 'Brunette', 'role': 'Castaway'},
 'Veronica': {'haircolor': 'Brunette', 'role': 'Student'}}

Looks about right to me!
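
One caveat: split() only works while no field is blank and no field contains a space.  If the mainframe screen really is fixed-width, slicing by column position may be safer.  The sketch below assumes 10-character columns, which matches the sample data but is only a guess about the real screen; COLUMNS and processline_fixed are illustrative names, and the code uses Python 3 syntax.

```python
# Assumed layout: (field name, start, end); None means "to the end of the line".
COLUMNS = [('name', 0, 10), ('haircolor', 10, 20), ('role', 20, None)]


def processline_fixed(line):
    """Parse one fixed-width line by slicing rather than splitting on whitespace."""
    values = {field: line[start:end].strip() for field, start, end in COLUMNS}
    name = values.pop('name')
    return name, values


print(processline_fixed('Betty     Blonde    Student'))
# ('Betty', {'haircolor': 'Blonde', 'role': 'Student'})

# Made-up lines that split() would mishandle: a name with a space in it,
# and a blank middle column.  ljust() pads each field to the column width.
spaced = 'Mary Ann'.ljust(10) + 'Brunette'.ljust(10) + 'Castaway'
blank = 'Gilligan'.ljust(20) + 'Castaway'
print(processline_fixed(spaced))
print(processline_fixed(blank))
```

The blank column comes back as an empty string instead of shifting 'Castaway' into the haircolor slot.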

Depending on how you're using this data, it might also be practical to 
define a class, and make each line into an instance of that class, with 
attributes named 'haircolor' and 'role'.

 >>> class Girl:
...     def __init__(self, name, haircolor, role):
...         self.name = name
...         self.haircolor = haircolor
...         self.role = role
 >>> def processline(line):
...     instance = Girl(*line.split())
...     return instance.name, instance

Note the use of '*line.split()' to feed arguments to Girl.  This has the 
effect of passing each element of the list returned by line.split() as 
an individual parameter, rather than as an aggregated list.  Now we can 
process each line, and do whatever with the resulting objects.

 >>> data = {}
 >>> for line in lines:
...     name, info = processline(line)
...     data[name] = info
 >>> pprint.pprint(data)
{'Betty': <__main__.Girl instance at 0x01608918>,
 'Ginger': <__main__.Girl instance at 0x0162D710>,
 'Maryann': <__main__.Girl instance at 0x01622690>,
 'Veronica': <__main__.Girl instance at 0x015EF3C8>}
 >>> for item in data.values():
...     print item.name, item.role
Veronica Student
Betty Student
Ginger Castaway
Maryann Castaway
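
To make the '*line.split()' call above more concrete, here is a small sketch in Python 3 syntax.  The __repr__ method is my addition so that the objects print readably; it is not part of the class shown earlier, and the two-column line is a made-up example of bad input.

```python
class Girl:
    def __init__(self, name, haircolor, role):
        self.name = name
        self.haircolor = haircolor
        self.role = role

    def __repr__(self):
        # Added for illustration only, so instances print readably.
        return 'Girl(%r, %r, %r)' % (self.name, self.haircolor, self.role)


parts = 'Ginger    Redhead   Castaway'.split()

# These two calls are equivalent: * spreads the list across the parameters.
a = Girl(*parts)
b = Girl(parts[0], parts[1], parts[2])
print(a)   # Girl('Ginger', 'Redhead', 'Castaway')
print(b)   # Girl('Ginger', 'Redhead', 'Castaway')

# A line with the wrong number of columns raises TypeError at the call.
try:
    Girl(*'Gilligan  Castaway'.split())
except TypeError as err:
    print('bad line:', err)
```

Unlike the zip() version, the class-based version complains loudly when a line does not have exactly three columns, which may or may not be what you want for screen-scraped data.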

Jeff Shannon
Credit International