[DB-SIG] DB scheme question

Sat Dec 11 19:18:33 CET 2010

Hello,

Before I can ask my question, I have to present the data I work with.
My goal is to create a (very) simple version control for a unique type of
(text) file.
The file has three parts:
1. atoms - set of attributes (e.g. name, id, etc). Each version of the file
contains up to thousands of atoms
2. molecules - has some attributes (e.g. name, id, etc) and a group of atoms
(list of atoms ids). Each molecule can contain zero to hundreds of atoms.
3. matter - a group of molecules (a matter to molecule is what molecule is
to atom - and no, it's not homework of some kind....)

---------------------- EXAMPLE FILE: --------------------
atoms:
# atr1 atr2 ... atrN
1  A     B       C
2  D     B       E
3  D     F       G
...
molucules:
# atr1 atr2 ... atrN atom_list
1  A     B       C    1 5 6 16 53 102
2  D     E       F    2 7 13 53 104 205 206 207
3  G     H       I    4
...
matter:
# atr1 atr2 ... atrN molecules_list
1  A     B       C    1 5 6 16 53 102
2  D     E       F    2 7 13 53 104 205 206 207
3  G     H       I    4
...
------------------- END OF EXAMPLE ---------------------

Each file version is only slightly different from the previous version
(changes in only few atoms, molecules and\or matter).

I thought on the following scheme (first 6 tables are trivial):
    - version table - each raw represent a file that was added
    - atom table - each raw represent one atom from a file
    - molecules table - each raw represent one molecule from a file (later
I'll connect atoms to molecules)
    - matter table - each raw represent one matter from a file (later I'll
connect molecules to matter)
    - atom to molecule table - foreign keys from both tables represent the
relation: atom BELONG to molecule
    - molecule to matter table - foreign keys from both tables represent the
relation: molecule BELONG to matter
    - data to version table - In order not to make 3 different tables
connecting atoms, molecules and matter to version, I will create one table
that contains 2 foreign keys (data_id and version_id) together with another
attribute stating the source table for the data. Now, because versions are
only slightly different, does it make sense to implement the relation: data
DOES NOT BELONG to version ?

My second question is about inserting new versions to the db. After parsing
the text file I have an object with all the information from the file. Is
the next step to go attribute by attribute and update the db scheme or
should I create some sort of temporary scheme (containing only data from the
object) and do the inserting by merging tables? further more, is there a
technique to keep db integrity while updating (especially in case of error
in the insert process)?

As always - thanks for your time and patience

-- 
View this message in context: http://old.nabble.com/DB-scheme-question-tp30434629p30434629.html
Sent from the Python - db-sig mailing list archive at Nabble.com.