[Types-sig] Re: [ZODB-Dev] Pre-announce: Oscar 0.1

Greg Ward gward@mems-exchange.org
Tue, 21 Aug 2001 19:24:19 -0400


On Mon, 20 Aug 2001, I wrote:
> several months ago, I cooked up a tool, Oscar, for rigorously
> type-checking a Python object graph: you define an object schema
> (currently through specially-formatted class docstrings), and Oscar

First of all, let me modify the announcement a bit.  There are too many
software packages out there already called "Oscar" -- so Oscar is dead,
long live the Grouch!  (Does anyone who grew up outside of North America
get it?  Oh well...)

On 21 August 2001, Christian Robottom Reis said:
> Currently meaning this will change predictably? Let me just say this is
> quite nice, and I'll have to implement something like this in our Domain
> classes _soon_. My question is:

As far as I am concerned, Grouch will *always* support extracting an
object schema from docstrings.  The MEMS Exchange has ~100 classes in
20,000 lines of code that use Grouch's docstring format for database
type-checking, and another 100 classes that aren't in the object schema
but use the same docstring format for clarity and consistency.

When/if a new schema language becomes part of Grouch, it will be offered
as a complement to schema extraction from docstrings.

> Does the typesystem offer any introspection? I.E., can I in runtime
> discover the attributes registered for my class, and what types they are?
> I need this for type-checking when sorting columns in my UI framework, so
> that would come in handy.

Yes, although it's a tad clunky right now.  E.g., say I have a schema in
schema.pkl, as created by the gen_schema script:
  >>> from cPickle import load
  >>> schema = load(open("schema.pkl", "rb"))
  >>> cdef = schema.get_class_definition("mems.access.user.User")
  >>> cdef
  <ClassDefinition at 81a12d4: mems.access.user.User>

OK, you want to know what attributes this class has?
  >>> cdef.attrs
  ['user_id', 'password_hash', 'prefix', 'first_name', 'last_name',
  'suffix', 'address', 'email', 'phone', 'fax', 'timezone',
  'allow_mailing', 'group_list', 'history']

You want to know what type various attributes are?
  >>> cdef.get_attribute_type('password_hash')
  <AtomicType at 81475e4: string>
  >>> cdef.get_attribute_type('address')
  <AliasType at 81a176c: Address>

Hmmm, the 'address' attribute is an alias type -- let's expand the
alias to see what it really is:
  >>> schema.get_alias('Address')
  <InstanceType at 8149694: mems.lib.address.Address>

Digging up the class definition for this is more awkward than it needs
to be:
  >>> name = schema.get_alias('Address').get_class_name()
  >>> name
  'mems.lib.address.Address'
  >>> cdef2 = schema.get_class_definition(name)
  >>> cdef2
  <ClassDefinition at 81489ec: mems.lib.address.Address>

And now we can get the list of attributes in *this* class:
  >>> cdef2.attrs
  ['street1', 'street2', 'street3', 'city', 'state', 'zip',
  'country_code']

...and around we go.  You get the picture.  The documentation for this
API is all in the code.
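
The manual walk above can be wrapped in a small recursive helper.  What
follows is only a sketch: the stand-in classes imitate just the calls
shown in the session (get_class_definition(), attrs,
get_attribute_type(), get_alias(), get_class_name()) and are not
Grouch's real classes.  The alias_name and name attributes, and the
resolve() and describe() helpers, are assumptions made up for this
illustration.

```python
# Stand-in classes that imitate only the calls used in the session above.
# They are NOT Grouch's real implementation.
class AtomicType(object):
    def __init__(self, name):
        self.name = name  # assumed attribute name


class InstanceType(object):
    def __init__(self, class_name):
        self.class_name = class_name

    def get_class_name(self):
        return self.class_name


class AliasType(object):
    def __init__(self, alias_name):
        self.alias_name = alias_name  # assumed attribute name


class ClassDefinition(object):
    def __init__(self, name, attr_types):
        self.name = name
        self.attrs = [attr for attr, _ in attr_types]
        self._types = dict(attr_types)

    def get_attribute_type(self, attr):
        return self._types[attr]


class Schema(object):
    def __init__(self, class_defs, aliases):
        self._classes = dict((c.name, c) for c in class_defs)
        self._aliases = aliases

    def get_class_definition(self, name):
        return self._classes[name]

    def get_alias(self, name):
        return self._aliases[name]


def resolve(schema, type_obj):
    """Expand alias types until a non-alias type is reached."""
    while isinstance(type_obj, AliasType):
        type_obj = schema.get_alias(type_obj.alias_name)
    return type_obj


def describe(schema, class_name, depth=0, seen=None):
    """Return indented 'attr: type' lines, recursing into instance types."""
    seen = set() if seen is None else seen
    seen.add(class_name)
    cdef = schema.get_class_definition(class_name)
    lines = []
    for attr in cdef.attrs:
        t = resolve(schema, cdef.get_attribute_type(attr))
        if isinstance(t, InstanceType):
            target = t.get_class_name()
            lines.append("%s%s: %s" % ("  " * depth, attr, target))
            if target not in seen:  # expand each class once; avoids cycles
                lines.extend(describe(schema, target, depth + 1, seen))
        else:
            lines.append("%s%s: %s" % ("  " * depth, attr, t.name))
    return lines


# A two-class toy schema, cut down from the example session.
address = ClassDefinition("mems.lib.address.Address",
                          [("street1", AtomicType("string")),
                           ("city", AtomicType("string"))])
user = ClassDefinition("mems.access.user.User",
                       [("password_hash", AtomicType("string")),
                        ("address", AliasType("Address"))])
schema = Schema([address, user],
                {"Address": InstanceType("mems.lib.address.Address")})
report = describe(schema, "mems.access.user.User")
```

For a UI that sorts columns by type, calling resolve() on each
attribute's type is probably all that's needed; describe() just shows
the full walk.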

> Oh, this _is_ a runtime typecheck? :)

Not currently.  Right now, we do a type-checking pass nightly on our
database.  So far, it mostly finds documentation errors -- i.e., it's
mainly peace-of-mind, rather than something that regularly finds bugs.

The main reason Grouch doesn't step in at run-time is that I'm afraid
of the performance hit.  The implementation right now concentrates on
correctness and completeness, with efficiency to come later.  I don't
even have performance figures at hand, although the MEMS Exchange
database (140,000 objects in a 45 MB ZODB FileStorage) is a pretty good
test case.

Ideas for run-time:
  * invoke Grouch in __getstate__(), so your objects are checked before
    they're actually written
  * invoke Grouch in __setattr__(), so every attribute is checked at
    assignment time (this is the one that really scares me, performance-
    wise)

The __getstate__() hook ought to be doable if you have a common base
class for all your database classes.  The __setattr__() hook would be
scary for ZODB apps right now; probably best to wait until Python 2.2 is
out and Persistent has been rewritten as a metaclass.  (Or whatever is
going to happen to Persistent.)
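
To make the two hooks concrete, here is a rough sketch in plain Python.
It does not use Grouch or ZODB at all: the _schema dictionary (attribute
name -> Python type), check_state(), SchemaError, and the two base
classes are all made up for illustration, and a real Persistent
subclass would have to cooperate with Persistent's own __getstate__()
and __setattr__().  The only thing taken as given is that pickle calls
__getstate__() when an object is serialized.

```python
class SchemaError(TypeError):
    """Raised when an object does not match its (hypothetical) schema."""


def check_state(owner, state, schema):
    """Raise SchemaError unless 'state' matches 'schema' exactly."""
    for attr, expected in schema.items():
        if attr not in state:
            raise SchemaError("%s: missing attribute %r" % (owner, attr))
        if not isinstance(state[attr], expected):
            raise SchemaError("%s.%s: expected %s, got %s"
                              % (owner, attr, expected.__name__,
                                 type(state[attr]).__name__))
    for attr in state:
        if attr not in schema:
            raise SchemaError("%s: unexpected attribute %r" % (owner, attr))


class CheckedOnSave(object):
    """Common base class: check the whole object just before pickling."""
    _schema = {}  # stand-in for a real schema lookup

    def __getstate__(self):
        state = self.__dict__.copy()
        check_state(type(self).__name__, state, self._schema)
        return state


class CheckedOnAssign(object):
    """Common base class: check every attribute at assignment time."""
    _schema = {}

    def __setattr__(self, attr, value):
        expected = self._schema.get(attr)
        if expected is None:
            raise SchemaError("%s: unexpected attribute %r"
                              % (type(self).__name__, attr))
        if not isinstance(value, expected):
            raise SchemaError("%s.%s: expected %s, got %s"
                              % (type(self).__name__, attr,
                                 expected.__name__, type(value).__name__))
        object.__setattr__(self, attr, value)


class Address(CheckedOnSave):
    _schema = {"city": str, "zip": str}

    def __init__(self, city, zip):
        self.city = city
        self.zip = zip


class Phone(CheckedOnAssign):
    _schema = {"number": str}

    def __init__(self, number):
        self.number = number
```

With the first base class, a bad value can sit in the object until
something tries to write it; with the second, the error shows up at the
offending assignment, but every assignment pays for a dictionary lookup
and an isinstance() call.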

> > In the past few weeks, I finally got around to writing the scripts and
> > documentation necessary to release Oscar publicly.  Now I'm ready to do
> > so, pending approval by the CNRI brass (sigh).  There's nothing
> 
> How's it going?

Pretty good, actually.  The actual idea of releasing the code went down
pretty well; nailing down a license took a bit longer.  It's basically
the same as the Quixote 0.3 license; the main problem is that --
according to the FSF -- it's not GPL-compatible.  To be honest, I'm not
entirely certain what this means, but I don't think it matters as much
for Python applications/libraries as it does for Python itself.

> Ok, assuming this will be released, can I go ahead and docstring my
> classes and assume I'm going to be able to do the runtime checking when
> it's available? I had devised a complete typesystem based on dictionaries,
> but if yours is ready and tested, I'll chuck mine.

Give it a whirl -- pre-release tarball is at
http://starship.python.net/~gward/Grouch-0.1.tar.gz.  Shh!  Don't tell
anyone I mentioned this.  It's *not* the final release.  There will be
another Grouch-0.1.tar.gz with possible differences in a few days.

        Greg
-- 
Greg Ward - software developer                gward@mems-exchange.org
MEMS Exchange                            http://www.mems-exchange.org