Data type ideas
John Roth
johnroth at ameritech.net
Sat Mar 30 07:50:02 EST 2002
This is a sorting problem. If you can't do it in
memory (and I suspect that you probably can't),
look at the Unix sort command.
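A minimal sketch of the sort-based approach, assuming a line whose first field is empty continues the previous person. The names flatten_pairs and group_sorted are made up for illustration. Here sorted() stands in for the external sort step; for a file too large for memory, the flattened "group<TAB>person" lines could be written out and run through the Unix sort command instead.

```python
import itertools

def flatten_pairs(lines):
    """Yield (group, person) pairs from tab-delimited lines.

    Assumption: a line with an empty first field belongs to the
    person named on the most recent line that had one.
    """
    person = None
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        name, _, group = line.partition("\t")
        if name.strip():
            person = name.strip()
        yield group.strip(), person

def group_sorted(pairs):
    """Collect people per group; pairs must already be sorted by group."""
    for group, items in itertools.groupby(pairs, key=lambda pair: pair[0]):
        yield group, [person for _, person in items]

sample = [
    "Person 1\tGroup A\n",
    "\tGroup B\n",
    "Person 2\tGroup B\n",
    "Person 3\tGroup A\n",
    "\tGroup C\n",
]

# sorted() plays the role of the external sort here.
result = list(group_sorted(sorted(flatten_pairs(sample))))
for group, people in result:
    print(group, people)
```

Once the pairs are sorted by group, only one group's people need to be held in memory at a time.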
John Roth
"Joel Ricker" <joejava at dragoncat.net> wrote in message
news:mailman.1017469191.15784.python-list at python.org...
Hi all, got a new problem :)
I have a tab-delimited file of people plus a list of groups they belong
to, like so:
Person 1    Group A
            Group B
Person 2    Group B
Person 3    Group A
            Group C
So basically a person can be part of one or more groups. I'm looking to
process this list so that I can take each group and examine the list of
people in it. Basically turn the list into:
Group A     Person 1
            Person 3
Group B     Person 1
            Person 2
Group C     Person 3
The drawback to all this is that the file I'm working with is pretty big:
about 40 megs. Most of the file is extraneous data that I have weeded
out with regular expressions, but it is still a large data file.
My first (naive) approach was to just create a Dict using the name
of the group as the key and a list of people as the value. I learned
that, due to the overhead, that was going to take a lot of memory and
processing time.
It would look something like this:
{"Group A" : ["Person 1", "Person 3"],
"Group B" : ["Person 1", "Person 2"],
"Group C" : ["Person 3"]}
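For reference, a dict like the one above can be built in a single pass over the file. This is only a sketch, assuming the layout shown earlier (a line with an empty first field continues the previous person); build_groups is a hypothetical helper name, and collections.defaultdict is used in place of a plain dict.

```python
from collections import defaultdict

def build_groups(lines):
    """Map each group name to the list of people in it (one pass)."""
    groups = defaultdict(list)
    person = None
    for line in lines:
        name, _, group = line.rstrip("\n").partition("\t")
        if name.strip():
            # Assumption: a non-empty first field starts a new person.
            person = name.strip()
        if group.strip():
            groups[group.strip()].append(person)
    return dict(groups)

sample = [
    "Person 1\tGroup A\n",
    "\tGroup B\n",
    "Person 2\tGroup B\n",
    "Person 3\tGroup A\n",
    "\tGroup C\n",
]

groups = build_groups(sample)
for name in sorted(groups):
    print(name, groups[name])
```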
My next idea was: what about references? Maybe create a list of people
and a Dict as above, with a list of references to the list of people.
But as I learned, you can't do references to simple data objects (like a
subscript of a list). I could be wrong, but that's what I gathered. I
tried using a list of integers for the value of the Group Dict,
"pointing" to the list of People:
{"Group A" : [0, 2],
"Group B" : [0, 1],
"Group C" : [2] }
["Person 1", "Person 2", "Person 3"]
This helped a little, but obviously not much, since it isn't much of a
change from what I had before.
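On the question of references: a Python list holds references to objects rather than copies, so putting the same string object into several lists does not duplicate the string. The small sketch below, using the example data from this post, checks that with the is operator; the variable names are only illustrative.

```python
people = ["Person 1", "Person 2", "Person 3"]

# Each group list stores references to the string objects in people.
groups = {
    "Group A": [people[0], people[2]],
    "Group B": [people[0], people[1]],
    "Group C": [people[2]],
}

# The same object is reachable from both group lists and from people.
shared = groups["Group A"][0] is groups["Group B"][0] is people[0]
print(shared)
```

So a Dict of lists of names costs about one reference per membership, as long as each name is created once and reused, which is roughly what the integer-index version costs as well.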
So what next? Any ideas that I can use?
Thanks
Joel