[python] how to ensure items in a list or dict are bound to an integer-type ID with "UUID-like" meaning?

kee chen keekychen.shared at gmail.com
Mon Jun 28 23:03:53 EDT 2010


Dear all,

I have 2 lists stored in 2 text files, and they may contain duplicate
records. The raw data looks like this:
lfruit                                  lcountry
======                                  =========
orange                                  japan
pear                                    china
orange                                  china
apple                                   american
cherry                                  india
lemon                                   china
lemon                                   japan
strawberry                              korea
banana                                  thailand
                                        australia
Basically, what I want is:
 1. all duplicate records removed, and
 2. each unique item bound to a unique integer ID, something like a
PK in a database; no sorting is needed.
Before you answer, please also read what follows.

lfruit                                  lcountry
======                                  =========
1    orange                           1  japan
2    pear                             2  china
3    apple                            3  american
4    cherry                           4  india
5    lemon                            5  taiwan
6    strawberry                       6  korea
7    banana                           7  thailand
                                      8  australia
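
To make the target above concrete, here is a minimal sketch of removing
duplicates while numbering items in first-seen order. The function name
assign_ids and the in-memory list raw_fruit are hypothetical; reading the
two text files is left out.

```python
def assign_ids(items, start=1):
    """Map each distinct, non-empty item to an integer ID in first-seen order."""
    ids = {}
    for item in items:
        item = item.strip()
        # Skip blanks and anything already numbered (this removes duplicates).
        if item and item not in ids:
            ids[item] = start + len(ids)
    return ids


# Hypothetical in-memory copy of the lfruit column from the raw data.
raw_fruit = ["orange", "pear", "orange", "apple", "cherry",
             "lemon", "lemon", "strawberry", "banana"]

fruit_ids = assign_ids(raw_fruit)
print(fruit_ids["orange"], fruit_ids["lemon"], fruit_ids["banana"])  # 1 5 7
```

This only covers the one-time numbering; it does not by itself keep IDs
fixed once items are added or deleted.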

Q1: items in the lists above may need to be added or deleted later. How can
I make the lists easy to extend, and how can I make sure each item keeps a
sequential, unique, fixed, INTEGER-type ID?
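
One possible shape for Q1, as a sketch only: keep a counter that only ever
increases, so a deleted item's ID is retired rather than reused. The class
name IdRegistry and its methods are hypothetical, and the sketch assumes
that a deleted item which is added again should get a new ID.

```python
class IdRegistry:
    """Hands out integer IDs that are never reused after deletion."""

    def __init__(self):
        self._ids = {}      # live item -> ID
        self._next_id = 1   # only ever increases

    def add(self, item):
        # Existing items keep the ID they already have.
        if item not in self._ids:
            self._ids[item] = self._next_id
            self._next_id += 1
        return self._ids[item]

    def remove(self, item):
        # The ID is retired; _next_id is deliberately left untouched.
        del self._ids[item]

    def id_of(self, item):
        return self._ids[item]


reg = IdRegistry()
reg.add("orange")         # gets ID 1
reg.add("pear")           # gets ID 2
reg.remove("orange")      # ID 1 is retired
print(reg.add("apple"))   # 3
```

For the IDs to stay fixed across runs, the counter would have to be stored
together with the items, whatever storage is used.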

Here is why I want an INTEGER ID rather than a hash or UUID: "uuid4" does
not work in my case because I later want the ID to carry information at low
cost in an MCU-style protocol. I mean that the INTEGER ID is also used as
the bit position in my protocol's binary stream. Taking the lfruit data as
an example, a bit stream such as 0111100 can indicate whether each lfruit
item exists or not.
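
A small sketch of that bit-stream idea, assuming the leftmost bit stands
for ID 1 and the stream is handled as a string of "0"/"1" characters. The
function names are hypothetical, and the bit order is an assumption.

```python
def encode_presence(present_ids, max_id):
    """Build a bit string in which position i (1-based, from the left) is ID i."""
    return "".join("1" if i in present_ids else "0"
                   for i in range(1, max_id + 1))


def decode_presence(bits):
    """Return the set of IDs whose bit is set."""
    return {i for i, bit in enumerate(bits, start=1) if bit == "1"}


# With the 7 lfruit IDs: pear(2), apple(3), cherry(4), lemon(5) are present.
print(encode_presence({2, 3, 4, 5}, 7))    # 0111100
print(sorted(decode_presence("0111100")))  # [2, 3, 4, 5]
```

This is why the IDs have to be small, dense integers that never change:
each ID is a position in the stream.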


Also, a combination of the 2 lists may be needed later to generate a new
list, or matrix. As above, a unique ID is needed here too:
  lcombination =  [lfruit] * [lcountry]
  ============
1    japan  orange                #(1,1)
2    japan  pear                  #(1,2)
3    japan  apple                 #(1,3)
4    japan  cherry                #(1,4)
5    japan  lemon                 ...
6    japan  strawberry            ...
7    japan  banana                ...
8    china  orange                #(2,1)
9    china  pear                  #(2,2)
……
Q2: because lcombination is derived from the extendable items in the two
lists, how can I make sure its unique IDs also stay fixed and unique?
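
To illustrate the problem in Q2: the numbering in the table above works out
to (country_id - 1) * number_of_fruits + fruit_id, so every ID after the
first row shifts as soon as a fruit is added. One option that does not
depend on the list sizes is a pairing function; the sketch below uses
Cantor pairing on the 1-based IDs. This is a swapped-in technique shown
only as a possibility, and both function names are hypothetical.

```python
def row_major_id(country_id, fruit_id, n_fruit):
    """Numbering used in the table above; depends on the current fruit count."""
    return (country_id - 1) * n_fruit + fruit_id


def pair_id(country_id, fruit_id):
    """Cantor pairing on 1-based IDs; independent of how long the lists are."""
    a, b = country_id - 1, fruit_id - 1
    return (a + b) * (a + b + 1) // 2 + b + 1


print(row_major_id(2, 1, n_fruit=7))  # 8, as in the table
print(row_major_id(2, 1, n_fruit=8))  # 9, the ID moved after adding a fruit
print(pair_id(2, 1))                  # 2, whatever the list sizes are
```

The trade-off is that pair_id numbers the pairs diagonal by diagonal, so
the IDs of a rectangular table are not contiguous, and a bit stream indexed
by them would be longer than with the row-by-row numbering.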

BTW: my original plan was to use a dict or list as the runtime data
container and sqlite as the storage and also the assigner of the unique
IDs. However, based on the answer at

http://old.nabble.com/(python)-how-to-define-unchangeable-global-ID-in-a-table--td29000959.html
relying on sqlite alone may not be enough to guarantee the unique ID
assignment, so I am asking for help here. Any answer or comment will be
highly appreciated!

Thanks,
KC