Help with database planning

Ken Seehart ken at seehart.com
Sat Nov 14 13:15:06 CET 2009


Good idea to use Django.  I've just started using it and I really like 
it.  However, I should give you a heads-up: You will probably want to 
use a Django migration tool (I'm using South) because the alternative is 
basically to rebuild your database each time your model changes.  
Unfortunately, South can sometimes get confused when using anything less 
sophisticated than PostgreSQL (I switched from MySQL to PostgreSQL for 
this reason).  I don't know if South or the other various Django 
migration tools work with MySQL.

Applying the DRY (don't repeat yourself), you might even consider 
running the same code as a local web server instead of implementing a 
separate desktop version.  But it is just a suggestion; there are 
various reasons why you might not want to do that.

On to the DB design question...

One approach would be to make a Generic class that can represent a 
concept, slot, or filler, which would have a type field to identify 
which of these to use.

class Concept(models.Model):
    ...

class Slot(models.Model):
    ...

class Filler(models.Model):
    ...

class Facet(models.Model):
    ...

class Generic(models.Model):
    TYPE_CHOICES = (
        (u'c', u'concept'),
        (u's', u'slot'),
        (u'f', u'filler'),
    }

    # Only one of the following is used.  The other two are blank.
    concept = models.ForeignKey(Concept)
    slot = models.ForeignKey(Slot)
    filler = models.ForeignKey(Filler)

class ConceptDef(models.Model):
    concept = models.ForeignKey(Concept)
    slot = models.ForeignKey(Generic)
    facet = models.ForeignKey(Facet)
    filler = models.ForeignKey(Generic)    

 
Juliano wrote:
> Hello, everybody.
>
> I'm a linguist with practical skills on computers/programming.
>
> We've been working with an ontology at my department, and now I need
> to create a GUI viewer for the flat file we have.
> I tried to write an Ontology class which manages the data read and
> parsed from the flat file, but it takes a relatively long time.
> Besides, we have plans to set up a website for online display of said
> ontology. So, I have been being pushed towards changing the basic plan
> and build a DB so that data access will be faster and easier for both
> the desktop GUI and the web app. Right now, I'm trying to work with
> sqlite, since it can be used as a separate file for the GUI and as a
> DB for Django (which may be the choice for the web interface).
>
> I have been redaing some books on DBs but I kind of get stuck when it
> comes to the normalization and the planning of the tables. The problem
> is that we have basically four fields that can be arranged in a tree-
> like structure. Eg:
>
> Concept
>   |----- Slot
>   |        `------ Facet
>   |                  `------ Filler
>   `----- Slot
>            `------ Facet
>                      `------ Filler
>                      `------ Filler
>     ...
>
> So, for ONE *concept*, we have, usually, MANY *slots*, each *slot* has
> ONE *facet*, and each *facet* can have MORE THAN ONE *filler*.
> Besides, some *slots* and *fillers* are themselves *concepts*,
> creating a sort of recursive reference.
>
> <begin table>
> line_no	concepts	slots	facets	fillers
> ------------------------------------------------------------------------------
> 00000	ABANDON	DEFINITION	VALUE	"to leave or desert something or
> someone"
> 00001	ABANDON	IS-A	VALUE	EXIT
> 00002	ABANDON	LEXE	MAP-LEX	"leave behind-V1"
> 00003	ABANDON	LEXE	MAP-LEX	abandon-V1
> (...)
> 97420	ZULU	DEFINITION	VALUE	"a language or dialect spoken in south
> africa and others"
> 97421	ZULU	INSTANCE-OF	VALUE	OTHER-NIGER-KORDOFANIAN-LANGUAGE
> 97422	ZULU	LANGUAGE-OF	INV	LESOTHO
> 97423	ZULU	LANGUAGE-OF	INV	SOUTH-AFRICA
> <end table>
>
> I tried to create index tables for concepts, slots, facets and
> fillers, which gave me the following table:
>
> <begin table>
> line_no	concepts	slots	facets	fillers
> ------------------------------------------------------------------------------
> 00000	cn_00000	sl_00048	fc_00007	fl_07349
> 00001	cn_00000	cn_02605	fc_00007	cn_01768
> 00002	cn_00000	sl_00121	fc_00002	fl_04329
> 00003	cn_00000	sl_00121	fc_00002	fl_15009
> (...)
> 97420	cn_05429	sl_00048	fc_00007	fl_01340
> 97421	cn_05429	cn_02493	fc_00007	cn_03526
> 97422	cn_05429	cn_02750	fc_00001	cn_02816
> 97423	cn_05429	cn_02750	fc_00001	cn_04580
> <end table>
>
> (cn_XXXXX from concept index, sl_XXXXX from slot index,
> fc_XXXXX from facet index, fl_XXXXX from filler index.)
>
> As you can see, only concepts and facets are populated by their own
> type of data.
> Whereas slots and fillers can be populated by their own types or by
> concepts.
>
> What would be a good way to create tables for this situation?
> In fact, this is the first time I've ever tried to create a DB, so I'm
> completely lost.
>
> I'm looking forward to a reply...
>
> Thank you very much,
> Juliano
>   




More information about the Python-list mailing list