[XML-SIG] Inverse relationship design in XML, Python, & Zope
Douglas Bates
bates@stat.wisc.edu
15 Jan 2002 11:43:08 -0600
This is a design question but it does relate to XML, Python, and Zope
so I hope it is not too far off-topic for these lists.
I apologize for the length of the question. It seems that design
questions often require a lot of explanation and background.
I begin with XML representing, say, articles within issues within
a volume of a journal.
<jvol>
<jname>J. of the Amer. Statist. Assoc.</jname>
<volume>98</volume>
<year>2001</year>
<jissue>
<inum>455</inum>
<month>September</month>
<jart>
<ppf>785</ppf>
<ppl>793</ppl>
<atl>Estimation of the ... Microscopy</atl>
<aug>
<au>Higdon, Dave</au>
<au>Yamamoto, Steve</au>
</aug>
<kwdg>
<kwd>Blind deconvolution</kwd>
<kwd>Magnetic imaging</kwd>
</kwdg>
</jart>
<jart>
...
</jart>
</jissue>
<jissue>
<inum>456</inum>
...
</jissue>
</jvol>
This information is decomposed and stored, along with other types of
bibliographic information, in a PostgreSQL database. We use Zope to
provide the web interface for queries on this database. We would
like to have the result of a query returned as an XML object that
contains elements, each of which has complete information on a
title that matches the query, where a title can be an article or a
book or another type of reference. For example, a query could map to
SQL like
SELECT xmlrep FROM titles WHERE idTitle IN
(SELECT idTitle FROM authors WHERE aname ~ 'Higdon, *'
INTERSECT
SELECT idTitle FROM authors WHERE aname ~ 'Yamamoto, *');
The way I envision it, the xmlrep corresponding to that jart element
above would be, say, a pjart element (presentation of a journal
article). It must include all the information on the article and the
issue and the volume and the journal if we are to be able to format
it. The entire result from the query might be
<qresult>
<pjart>
<jname>J. of the Amer. Statist. Assoc.</jname>
<volume>98</volume>
<year>2001</year>
<inum>455</inum>
<month>September</month>
<ppf>785</ppf>
<ppl>793</ppl>
<atl>Estimation of the ... Microscopy</atl>
<aug>
<au>Higdon, Dave</au>
</aug>
<kwdg>
<kwd>Blind deconvolution</kwd>
<kwd>Magnetic imaging</kwd>
</kwdg>
</pjart>
<pbook>
<btl>Title of a book</btl>
<bpub>Book Publisher, Inc.</bpub>
<aug>
<au>Yamamoto, Steve</au>
<au>Higdon, Dave</au>
</aug>
...
</pbook>
</qresult>
This is what I meant by an inverse relationship design. I must get
the information on the journal, volume, and issue when I select the
article.
The simple way to do this is to store the entire xmlrep for each
title in the titles table. It works but it is not exactly elegant
and it requires that we store a lot of redundant text in that xmlrep
column.
The more elegant solution is to construct the information from XML
fragments or text fragments in several different tables in the
database. Our tables are normalized so the journal table, with
primary key idJournal, contains information on the journal, the issn,
etc. and our jour_vol_issue table, with primary key idE, contains an
idJournal, and a volume number, year, and issue number. SQL is
designed to reconstruct this kind of information but it does so by
returning rows in a query result and it would not be easy to make a
single query return, say, information on articles and information on
books.
If we are going to store fragments that will be pasted together to
create the XML for presenting a journal article or a book or a
proceedings article or ..., where should we construct the XML and
what are good tools to use to do this?
We could:
0) Store all the information on each title in the xmlrep column and
not do any processing.
1) Write a database function, say in PL/PGSQL, that constructs this
XML on the fly from the different tables in the database.
2) Write DTML methods, Python methods or external Python methods
for Zope.
3) Use XSLT or Xpath to reconstruct.
I think know how to do 0, 1, or 2. I'm not sure about 3 because I
don't yet know enough about XSL.
Your suggestions will be appreciated.
--
Douglas Bates bates@stat.wisc.edu
Statistics Department 608/262-2598
University of Wisconsin - Madison http://www.stat.wisc.edu/~bates/