[DB-SIG] DB scheme question

Carl Karsten carl at personnelware.com
Sun Dec 12 06:58:07 CET 2010


On Sat, Dec 11, 2010 at 12:18 PM, John Q. Public <sqlite3.user at gmail.com> wrote:
>
> Hello,
>
> Before I can ask my question, I have to present the data I work with.
> My goal is to create a (very) simple version control for a unique type of
> (text) file.
> The file has three parts:
> 1. atoms - set of attributes (e.g. name, id, etc). Each version of the file
> contains up to thousands of atoms
> 2. molecules - has some attributes (e.g. name, id, etc) and a group of atoms
> (list of atoms ids). Each molecule can contain zero to hundreds of atoms.
> 3. matter - a group of molecules (a matter to molecule is what molecule is
> to atom - and no, it's not homework of some kind....)
>
> ---------------------- EXAMPLE FILE: --------------------
> atoms:
> # atr1 atr2 ... atrN
> 1  A     B       C
> 2  D     B       E
> 3  D     F       G
> ...
> molucules:
> # atr1 atr2 ... atrN atom_list
> 1  A     B       C    1 5 6 16 53 102
> 2  D     E       F    2 7 13 53 104 205 206 207
> 3  G     H       I    4
> ...
> matter:
> # atr1 atr2 ... atrN molecules_list
> 1  A     B       C    1 5 6 16 53 102
> 2  D     E       F    2 7 13 53 104 205 206 207
> 3  G     H       I    4
> ...
> ------------------- END OF EXAMPLE ---------------------
>
> Each file version is only slightly different from the previous version
> (changes in only few atoms, molecules and\or matter).
>
> I thought on the following scheme (first 6 tables are trivial):
>    - version table - each raw represent a file that was added
>    - atom table - each raw represent one atom from a file
>    - molecules table - each raw represent one molecule from a file (later
> I'll connect atoms to molecules)
>    - matter table - each raw represent one matter from a file (later I'll
> connect molecules to matter)
>    - atom to molecule table - foreign keys from both tables represent the
> relation: atom BELONG to molecule
>    - molecule to matter table - foreign keys from both tables represent the
> relation: molecule BELONG to matter
>    - data to version table - In order not to make 3 different tables
> connecting atoms, molecules and matter to version, I will create one table
> that contains 2 foreign keys (data_id and version_id) together with another
> attribute stating the source table for the data. Now, because versions are
> only slightly different, does it make sense to implement the relation: data
> DOES NOT BELONG to version ?

I think I would combine atom, molecule. matter into one table and add
a 3 record type table.  Now there is only 1 table to worry about
versioning.  I think it is better for some reason, but it will make
the code harder to read. I'll confess I don't really know why I would
want to do this, and in my next section I am ignoring this idea.

I would add a column "like" effective_date datetime - when the version
became real.   The reason I say "like" is because effect dates are a
previously solved problem  so is versioning, but I have a feeling you
will find more googling for that term.

Here is what I found:

select tax_rate from t
where effective_date =
(select max(effective_date) from t
  where effective_date <= '2002-04

http://forums.devshed.com/postgresql-help-21/how-do-i-return-a-value-from-one-column-based-144349.html

atoms:
pk id edt  atr1 atr2 ... atrN
1   1  1/1/10     A     B       C
2   2  1/1/10     D     B       E
3   3  1/1/10     D     F       G
...
molucules:
# atr1 atr2 ... atrN atom_list
1  A     B       C    1 5 6 16 53 102


So molecule #1 contains atom 1 (and more...)

Atom 1 changes from ABC to ABD - add this record:

pk id edt  atr1 atr2 ... atrN
4   1  1/2/10     A     B       D

Notice it gets a new primary key, but keeps the ID.

The 1 in atom_list refers to the atom ID.  So to find the atom as it
was on a certain date, use the above query.

follow?

>
> My second question is about inserting new versions to the db. After parsing
> the text file I have an object with all the information from the file. Is
> the next step to go attribute by attribute and update the db scheme or
> should I create some sort of temporary scheme (containing only data from the
> object) and do the inserting by merging tables? further more, is there a
> technique to keep db integrity while updating (especially in case of error
> in the insert process)?
>
> As always - thanks for your time and patience
>

transactions - you don't have to understand the impemtation, but it is
something like:  buffers all the changes until you commit, then locks,
appends, moves some pointers, deletes, unlocks.   It has been a while,
so this may not be right. but most pleople don't worry about how it
works anyway, just that it works.


-- 
Carl K


More information about the DB-SIG mailing list