[Tutor] Dictionaries and multiple keys/values

Dave Angel davea at davea.name
Tue Mar 26 10:44:13 CET 2013


On 03/26/2013 12:36 AM, Robert Sjoblom wrote:
> Hi again, Tutor List.
>
> I am trying to figure out a problem I've run into. Let me first say
> that this is an assignment, so please don't give me any answers, but
> just nudge me in the general direction. So the task is this: from a
> text file, populate three different dictionaries with various
> information. The text file is structured like so:
> Georgie Porgie
> 87%
> $$$
> Canadian, Pub Food
>
> So name, rating, price range, and food offered. After food offered
> follows a blank line before the next restaurant is listed.
>

There are a number of things about the input file that you haven't 
specified, and it's useful to create a running description of the 
assumptions you're making about it.  That way, if one of those 
assumptions turns out to not always be true, you at least have a clue as 
to what might be wrong.

And in real-life problems, you might want to add code to test every one 
of those assumptions, and exit with a clean message when the data 
doesn't meet them.

Examples of such assumptions:

1) the "name" line is unique;  no two records have the same name
2) the rating is always exactly two digits followed by a percent sign, 
even if it's less than 10%.
3) white space may occur before and after the dollar signs on the 
price-range field, but never on the rating or name lines
4) there will be exactly 5 lines for every record, including the last 
one in the file.
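
As a rough sketch of what testing those assumptions could look like: the 
function name check_record, the VALID_PRICES tuple and the message texts 
are all made up for illustration, and it assumes each record arrives as a 
list of five strings with the newlines already removed.

```python
import re

# Price ranges taken from the keys of price_to_names in the assignment.
VALID_PRICES = ("$", "$$", "$$$", "$$$$")

def check_record(lines, seen_names):
    """Return a list of problems found in one record (hypothetical helper)."""
    # Assumption 4: exactly 5 lines per record, the fifth being the blank one.
    if len(lines) != 5:
        return ["expected 5 lines, got %d" % len(lines)]
    name, rating, price, cuisines, blank = lines
    problems = []
    # Assumption 1: no two records share a name.
    if name in seen_names:
        problems.append("duplicate name: %r" % name)
    # Assumption 3: no stray white space on the name line.
    if name != name.strip():
        problems.append("white space around name: %r" % name)
    # Assumption 2: exactly two digits followed by a percent sign.
    if not re.fullmatch(r"\d\d%", rating):
        problems.append("bad rating: %r" % rating)
    # Assumption 3: white space is tolerated around the dollar signs.
    if price.strip() not in VALID_PRICES:
        problems.append("bad price range: %r" % price)
    if blank.strip():
        problems.append("separator line is not blank: %r" % blank)
    return problems
```

The caller would add each name to seen_names after checking it, and could 
print the problems and exit as soon as the returned list is non-empty.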

> The three dictionaries are:
> name_to_rating = {}
> price_to_names = {'$': [], '$$': [], '$$$': [], '$$$$': []}
> cuisine_to_names = {}
>
> Now I've poked at this for a while now, and one idea I had, which I
> worked on for quite a while, was that since the restaurants all start
> at index 0, 5, 10 and so on, I could structure a while loop like this:
> with open('textfile.txt') as mdf:
>    file_length = len(mdf.readlines())-1
>    mdf.seek(0)
>    data = mdf.readlines()
>
>    i = 0
>    while file_length > 0:
>      name_to_rating[data[i]] = int(data[i+1][:2])
>      price_to_names[data[i+2].strip()].append(data[i].strip())
>      # here's the cuisine_to_names part
>      i += 5
>      file_length -= 5
>
> And while this works, for the two first dictionaries,  it seems really
> cumbersome -- especially that second expression -- and very, very
> brittle. However, even if I was happy with that, I can't figure out
> what to do in the situation where:
> data[i+3] = 'Canadian, Pub Food' #should be two items, is currently a string.
> My problem is that I'm... stupid. I can split the entry into a list
> with two items, but even so I don't know how to add the key: value
> pair to the dictionary so that the value is a list, which I then later
> can append things to.
>

Nothing stupid about that.  Your only shortcoming is assuming it should 
be a single line doing the assignment.  Once you use 
cuisines.split(something) to make a list of cuisines, you then need to 
loop over them.  And if the cuisine doesn't already exist, you need to 
create the item, while if it does, you need to append to the item.
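
One way that loop might look, as a sketch only: the variable names name 
and cuisines are placeholders of mine, holding the already-stripped name 
line and cuisine line of a single record, and the comma is assumed to be 
the only separator.

```python
cuisine_to_names = {}

# Stand-ins for the values taken from one record of the file.
name = "Georgie Porgie"
cuisines = "Canadian, Pub Food"

for cuisine in cuisines.split(","):
    cuisine = cuisine.strip()          # drop the space left after the comma
    if cuisine not in cuisine_to_names:
        # First time this cuisine is seen: create the item as an empty list.
        cuisine_to_names[cuisine] = []
    # The list now exists either way, so appending is safe.
    cuisine_to_names[cuisine].append(name)
```

The same test-then-append pattern works no matter how many records later 
mention the same cuisine, which is why no list has to exist beforehand.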

> I'm sorry, this sounds terribly confused, I know. I had another idea
> to feed each line to a function, because no restaurant name has a
> comma in it, and food offered always has a comma in it if the
> restaurant offers more than one kind. But again, this seems really
> brittle.
>
> I guess we can't use objects (for some reason), but that doesn't
> really matter because if I can't extract the data into dictionaries I
> wouldn't have much use of an object either way. So yeah, my two
> questions are these:
> is there a better way to move through the text file other than a
> really convoluted expression? And how do I add more than one value to
> a key in a dictionary, if the values are added at different times and
> there's no list created in the dictionary to begin with?
>
> (I briefly thought about initializing empty lists for each food type in
> the dictionary and go with my horrible expressions, but that seems
> like a cheap way out of a problem I'd rather tackle in a good way to
> begin with)
>
> Much thanks in advance.
>

First thing I'd do to make those lines clearer is to assign temp names 
to each of those fields.  For example, if you say
     name =

then the other places that use name can be much more readable. 
Likewise, if a particular name needs to be stripped or split before 
being assigned, it's in one common place.

So the loop would start with four assignments, capturing usable versions 
of those four lines.  Then you'd have 3 assignments, updating the three 
dictionaries from those four names.  And since one of those assignments 
would update multiple dictionary items, it would actually be a loop.

You mention objects, which is one way to make things easier.  But you 
didn't mention functions.  I think it'd be an improvement if each 
dictionary had a function created to do its updating.  Then the loop 
that you're writing here would be four assignments, followed by 3 
function calls.
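
A sketch of that shape, under some assumptions: the three function names 
are invented here, the data list stands in for what readlines() would 
return, and the second record is made up so the loop has something to 
repeat on.  It also uses dict.setdefault() as a shorter alternative to an 
explicit "if key not in dict" test.

```python
def update_ratings(name_to_rating, name, rating):
    # "87%" -> 87
    name_to_rating[name] = int(rating.rstrip("%"))

def update_prices(price_to_names, name, price):
    price_to_names[price].append(name)

def update_cuisines(cuisine_to_names, name, cuisines):
    for cuisine in cuisines.split(","):
        # setdefault returns the existing list, or stores and returns a new one
        cuisine_to_names.setdefault(cuisine.strip(), []).append(name)

name_to_rating = {}
price_to_names = {'$': [], '$$': [], '$$$': [], '$$$$': []}
cuisine_to_names = {}

# Stand-in for mdf.readlines(); the second record is invented test data.
data = ["Georgie Porgie\n", "87%\n", "$$$\n", "Canadian, Pub Food\n", "\n",
        "Second Place\n", "82%\n", "$\n", "Thai\n", "\n"]

for i in range(0, len(data), 5):
    # Four assignments: strip once, in one place.
    name = data[i].strip()
    rating = data[i + 1].strip()
    price = data[i + 2].strip()
    cuisines = data[i + 3].strip()
    # Three function calls, one per dictionary.
    update_ratings(name_to_rating, name, rating)
    update_prices(price_to_names, name, price)
    update_cuisines(cuisine_to_names, name, cuisines)
```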

Finally, you ask if there's a better way than readlines().  I don't 
think there's any harm in doing it this way, though it could take a lot 
of memory if the file is really large.  But why not do a readline() for 
each individual variable?  Then all the bookkeeping of i+3 etc goes away.
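
Roughly like this, as a sketch: io.StringIO is used only so the example 
runs without a file on disk (an object from open('textfile.txt') has the 
same readline() method), the second record is invented, and the records 
list is just a placeholder for whatever you do with the four values.

```python
import io

# Stand-in for the real file; the second record is made-up test data.
sample = ("Georgie Porgie\n87%\n$$$\nCanadian, Pub Food\n\n"
          "Second Place\n82%\n$\nThai\n")
mdf = io.StringIO(sample)

records = []
while True:
    name = mdf.readline()
    if not name:
        # readline() returns an empty string only at end of file.
        break
    name = name.strip()
    rating = mdf.readline().strip()
    price = mdf.readline().strip()
    cuisines = mdf.readline().strip()
    mdf.readline()   # discard the blank line between records
    records.append((name, rating, price, cuisines))
```

Note that this leans on the assumption that every record has the same 
number of lines; an extra blank line somewhere would shift everything 
after it.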



-- 
DaveA

