sednaobject - Pythonic interface to Sedna XML Database
Jim Washington
jwashin at vt.edu
Wed Apr 9 14:58:18 CEST 2008
I've released a new alpha (0.10alpha2) of zif.sedna, a Python adapter for
Sedna, a multi-user XML database. It is available at the Python Cheese Shop.
The new alpha makes a start at objectifying XML from the Sedna database,
much as SQLObject does for SQL. The aim is easy, Pythonic CRUD (create,
read, update, delete) operations on the XML store. Push data in, get
data back out.
sednaobject, now included in zif.sedna, provides three classes.
First is SednaXQuery, which gives you list-like semantics for the
results of any arbitrary XQuery or XPath expression. You init it with a
Sedna cursor, a query statement, and an optional parser method. Then,
you can iterate, or obtain items by index or slice. If you do not
provide a parser, you get items as unicode strings.
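
To illustrate what "list-like semantics" with an optional parser means,
here is a minimal stand-alone sketch. It is not the zif.sedna
implementation: the class QueryResultsSketch and the function
fake_run_query are hypothetical names invented for this example, and the
"database" is just an in-memory list of strings.

```python
from collections.abc import Sequence
import xml.etree.ElementTree as ET


class QueryResultsSketch(Sequence):
    """Hypothetical read-only, list-like view over query results."""

    def __init__(self, run_query, query, parser=None):
        self._run_query = run_query  # callable: query string -> list of str
        self._query = query
        self._parser = parser        # optional callable: str -> parsed item

    def _items(self):
        # Re-run the query on every access so results reflect current data.
        return self._run_query(self._query)

    def __len__(self):
        return len(self._items())

    def __getitem__(self, index):
        selected = self._items()[index]
        if self._parser is None:
            return selected          # plain strings when no parser is given
        if isinstance(index, slice):
            return [self._parser(s) for s in selected]
        return self._parser(selected)


def fake_run_query(query):
    # Stand-in for a database round trip; ignores the query text.
    return ["<item>a</item>", "<item>b</item>", "<item>c</item>"]


raw = QueryResultsSketch(fake_run_query, "doc('d')//item")
parsed = QueryResultsSketch(fake_run_query, "doc('d')//item",
                            parser=ET.fromstring)

first_string = raw[0]                    # index access, no parser
tail_strings = raw[1:]                   # slice access, no parser
texts = [elem.text for elem in parsed]   # iteration with a parser
```

Iteration comes for free from Sequence, which builds it on top of
index access.
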
Second is SednaContainer, which behaves like a SednaXQuery, except it is
read-write. Like SednaXQuery, you init it with a cursor, a query
statement, and an optional parser method. The query statement must
refer to exactly one element in the database. This is the container,
and you can obtain and replace items in the container by index. Slicing
works for retrieval, and append, remove, insert, and del work as in
the ElementTree API.
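
For readers who have not used it, these are the ElementTree container
operations being referred to, shown here on an ordinary in-memory element
from the standard library. This only illustrates the semantics that
SednaContainer mirrors; it does not use zif.sedna or a database, and the
sample document is made up.

```python
import xml.etree.ElementTree as ET

container = ET.fromstring(
    "<items><item>a</item><item>b</item><item>c</item></items>"
)

first = container[0]       # obtain an item by index
pair = container[0:2]      # slicing returns a list of elements

new_item = ET.Element("item")
new_item.text = "d"
container.append(new_item)          # a, b, c, d

head = ET.Element("item")
head.text = "z"
container.insert(0, head)           # z, a, b, c, d

container.remove(first)             # z, b, c, d
del container[-1]                   # z, b, c

# Replace an item by index.
container[1] = ET.fromstring("<item>B</item>")   # z, B, c

texts = [child.text for child in container]
```
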
Third is SednaObjectifiedElement, which also operates on a single
element in the database. SednaObjectifiedElement is a thin wrapper
around lxml.objectify. Alter the item with the objectify API, then call
save(). Thanks, lxml team, for making this really easy!
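
As a rough idea of what "alter the item with the objectify API" looks
like, here is plain lxml.objectify working on an in-memory element. The
sample XML is invented, and the zif.sedna wrapper and its save() step are
deliberately left out, since this sketch does not talk to a database.

```python
from lxml import etree, objectify

xml = b"<book><title>Old title</title><price>10</price></book>"
book = objectify.fromstring(xml)

# Child elements are reachable as attributes; assignment replaces content.
book.title = "New title"
book.price = 12

# Assigning to a name that does not exist yet creates a new child element.
book.author = "Anonymous"

# Strip the type annotations objectify adds on assignment before serializing.
objectify.deannotate(book, cleanup_namespaces=True)
serialized = etree.tostring(book)
```
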
Since, in XML, an element is an element is an element, you can use the
second and third sednaobject classes on any element in the database.
Which one you use depends on which aspect of the element you are
interested in at the moment.
I see Sedna as an attractive middle ground between SQL databases and
object databases like ZODB. Data size is practically unlimited. You
can alter a small portion of a data set transactionally, in a multi-user
environment, without a full rewrite of the data. Like SQL databases, it
uses a query language to obtain and format just the data you want, from
anywhere in the database. XQuery has nice built-in functionality for
counting, filtering, reordering, doing math, etc., on items. Like ZODB,
you can store and retrieve items of arbitrary complexity without too
much fuss. A Sedna database can have multiple XML documents and
multiple 'collections' of (similar) documents that can be queried
together or separately.
The Sedna team just released version 3.0 of the Sedna server, which has
improved speed and reliability. Version 3.0 now runs on Mac OS X, in
addition to x86 Linux and Win2K/XP.
zif.sedna with sednaobject version 0.10 is alpha, so interfaces can and
probably will change. The included doctests all pass using a Sedna 2.x
server. I have not included the new features of 3.0 (e.g., faster,
read-only queries) yet. Testing with a 3.0 server results in a single
harmless failure. Speed? I'm getting 60-70 single-query transactions
per second through Pylons on a 2 GHz Opteron. Transaction speed of
course depends on how many queries are in the transaction and what the
queries do.
zif.sedna: http://pypi.python.org/pypi/zif.sedna/
Sedna: http://modis.ispras.ru/sedna/
I am currently the sole developer for zif.sedna. Feedback is welcome.
- Jim Washington