[Persistence-sig] Re: [PyPerSyst-Devel] terminology for entity-type

Sat Sep 13 15:45:32 EDT 2003

Donnal Walter <donnalcwalter at yahoo.com> writes:

> If one has a collection (set) of data-entities, each of which is of
> the same class, what do you call that class relative to the data
> collection?

Most OO people don't call it anything relative to the collection; they
simply call the class an Entity.  The set of all instances of a
particular Entity class (such as your MedOrder entity class) is
usually called an extent.  Unfortunately, extents get
short-changed in the academic literature, imo.  If you look at
PyPerSyst you'll see that I have an actual Extent class that does most
of the hard work.

Here is a little table of terms that might be useful:

OO            Relational Theory  Typical DBMS
------------  -----------------  ------------------
extent        relation           table
entity class  ???                SQL DDL
instance      tuple              row
attribute     attribute          column or field

Most people make the mistake of thinking that the entity class and the
relational table are the common elements.  They are not.  It is the
extent that is most similar to the relational table.  The entity class
is most like the SQL DDL, i.e., the CREATE TABLE statements.  Overlook
this at your own peril.  ;-)
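
To make the relational column of that table concrete, here is a
minimal sketch using Python's standard sqlite3 module.  The med_order
table and its columns are made up for illustration; nothing here comes
from PyPerSyst.  The CREATE TABLE statement plays the role of the
entity class, each inserted row is an instance, and the table's
contents as a whole correspond to the extent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: describes the shape of every row, much as an entity class
# describes the shape of every instance.
conn.execute("CREATE TABLE med_order (med TEXT, date TEXT)")

# Rows: the analogue of individual instances.
conn.execute("INSERT INTO med_order VALUES (?, ?)", ("Foo", "09-13-2003"))
conn.execute("INSERT INTO med_order VALUES (?, ?)", ("Bar", "09-14-2003"))

# The table's full contents: the analogue of the extent.
rows = conn.execute(
    "SELECT med, date FROM med_order ORDER BY med"
).fetchall()
```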

[I no longer have any Codd & Date books, so I'm a little fuzzy on
exactly how they define "relation" and "tuple", and whether they have
a term for the concept embodied by an OO class.  But I'm willing to
bet I'm pretty close.]

> For example, let's say that we have a set of medication orders
> (named, say, MedList, even though it is not a true list), with every
> item of the set being an instance of the class MedOrder.  I'd like
> to say that the class MedOrder is the _______ of the set MedList.

I don't know what you'd call it, because you're phrasing it in the
reverse order of how an entity and its extent are typically
described.  For example, you usually have one Entity class for each
major "thing" in your application domain, such as: Person,
Organization, Order, OrderDetail, Product, Employee, etc.  Then you
would say something like "The Person extent is the set of all
instances of the Person class."  And extents usually go without a
name.

In PyPerSyst, the extent instance for a particular class is just an
instance of Extent, and the instance is a class level attribute of the
particular Entity subclass.  In your case I would create something
like this:

class MedOrder(Entity):
   ...

And then the extent (which gets hooked up and maintained behind the
scenes by PyPerSyst) can be referenced in two ways.  The first way is
as an attribute of the MedOrder class (via one of its instances):

# Use the Create transaction.
t = tx.Create('MedOrder', med='Foo', date='09-13-2003')
# Execute the transaction, which returns a MedOrder instance.
mo = db.execute(t)
# Reference the extent.
mo.extent.someMethodProvidedByExtent
...

The second way, which is actually the more common case, is to access
the extent from the database root (all branches point to an extent
instance):

# Get the MedOrder extent.
extent = db.root['MedOrder']
# Do something with it, like find all orders for the 'Foo' med.
orderList = extent.find(med='Foo')
for order in orderList:
    print(order.date)
...
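
If it helps to see the idea without PyPerSyst in the picture, here is
a toy sketch in plain Python of an extent kept as a class-level
attribute and also reachable from a root mapping.  This is NOT how
PyPerSyst implements Entity or Extent; the class bodies, the add()
method, and the plain dict standing in for the database root are all
assumptions made for illustration (no transactions, no persistence).

```python
class Extent:
    """Toy extent: holds every instance of one entity class."""

    def __init__(self, name):
        self.name = name
        self._instances = []

    def add(self, instance):
        self._instances.append(instance)

    def find(self, **criteria):
        # Return the instances whose attributes match all criteria.
        return [
            obj for obj in self._instances
            if all(hasattr(obj, k) and getattr(obj, k) == v
                   for k, v in criteria.items())
        ]

    def __len__(self):
        return len(self._instances)


class Entity:
    """Toy base class: each subclass gets its own Extent."""

    extent = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Hook up one extent per entity class, behind the scenes.
        cls.extent = Extent(cls.__name__)

    def __init__(self, **attrs):
        for key, value in attrs.items():
            setattr(self, key, value)
        # Every new instance registers itself with its class's extent.
        type(self).extent.add(self)


class MedOrder(Entity):
    pass


# A plain dict standing in for the database root (hypothetical).
root = {MedOrder.__name__: MedOrder.extent}

a = MedOrder(med='Foo', date='09-13-2003')
b = MedOrder(med='Bar', date='09-14-2003')

# Both routes reach the same extent object.
foo_orders = root['MedOrder'].find(med='Foo')
```

The point of the sketch is only that the extent is one object per
entity class, shared by all instances, and that a.extent and
root['MedOrder'] are two names for the same thing.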

> I have used the word *domain* ("MedOrder is the domain of MedList"),
> but in the little reading that I have done, "domain" seems to be
> related to a column rather than to a row. Likewise for "extent",
> although I'm still not sure I understand extents.

Domain is most often used in two ways.  It is used in a casual way to
refer to the scope of your application, which in your case might be
the medical management domain, or physician treatment domain.

It is also used more formally, particularly in relational theory, to
refer to the set of all possible valid values for a field/column.
This is to distinguish the concept of field domain from field
datatype.  Two fields may have the same datatype (say, string with a
length of 20 characters), and the same purpose (perhaps a Name field),
but that doesn't mean they represent the same domain.

For example, let's say we had a Person table and a Building table,
both of which had Name fields specified in SQL as CHAR(20).  The
domain (set of all possible valid values) for a Person Name is not the
same as the domain for a Building Name.  What that means is that
comparisons between the two fields are not necessarily meaningful.
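
One way to picture the difference between datatype and domain is to
wrap the values in distinct classes.  This is only a sketch: the names
DomainValue, PersonName, and BuildingName are made up for illustration
and are not part of PyPerSyst or any relational product.  Both classes
share the same "datatype" (a string of at most 20 characters), but a
comparison across the two domains is refused.

```python
class DomainValue:
    """A string tagged with the domain it belongs to."""

    def __init__(self, value):
        # Same datatype constraint for every domain, like CHAR(20).
        if len(value) > 20:
            raise ValueError("value exceeds 20 characters")
        self.value = value

    def __eq__(self, other):
        # Refuse comparisons between values from different domains.
        if type(other) is not type(self):
            raise TypeError(
                "cannot compare %s with %s"
                % (type(self).__name__, type(other).__name__)
            )
        return self.value == other.value

    def __hash__(self):
        return hash((type(self).__name__, self.value))


class PersonName(DomainValue):
    pass


class BuildingName(DomainValue):
    pass


person = PersonName("Washington")
building = BuildingName("Washington")

# Same domain: the comparison is meaningful.
same_domain = person == PersonName("Washington")

# Different domains: the comparison is rejected, even though the
# underlying strings are identical.
try:
    person == building
    cross_domain_rejected = False
except TypeError:
    cross_domain_rejected = True
```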

Did that help?

-- 
Patrick K. O'Brien
Orbtech      http://www.orbtech.com/web/pobrien
-----------------------------------------------
"Your source for Python programming expertise."
-----------------------------------------------