[DB-SIG] Two-phase commit API proposal (was Re: Any standard for two phase commit APIs?)

Wed Jan 23 02:18:48 CET 2008

On 22/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> On 2008-01-22 12:33, James Henstridge wrote:
> > On 22/01/2008, M.-A. Lemburg <mal at egenix.com> wrote:
> >> Thanks. I like it a lot, except for making the XID an object - this
> >> always appears to be a string in all the backends you've checked and
> >> also in the XA standard, so I'd go for a simple string instead of
> >> an object (those are always lots of work to do at C level).
> >
> > In at least MySQL and Oracle, the transaction ID appears to be more
> > than just a string: it is structured into three parts:
> >  * a format ID
> >  * a global transaction ID
> >  * a branch qualifier
> >
> > Stuart has made the argument that the format ID is not important for
> > Python, and I tend to agree (or at least I don't know what situations
> > you'd use it).
>
> The format id is only used to specify the format of the data
> structure in the XA xid_struct_t:
>
> From http://www.opengroup.org/onlinepubs/009680699/toc.pdf:
>
> """
> Although "xa.h" constrains the length and byte alignment of the data element within an
> XID, it does not specify the data's contents. The only requirement is that both gtrid and
> bqual, taken together, must be globally unique. The recommended way of achieving
> global uniqueness is to use the naming rules specified for OSI CCR atomic action
> identifiers (see the referenced OSI CCR specification). If OSI CCR naming is used, then
> the XID's formatID element should be set to 0; if some other format is used, then the
> formatID element should be greater than 0. A value of -1 in formatID means that the
> XID is null.
> The RM must be able to map the XID to the recoverable work it did for the
> corresponding branch. RMs may perform bitwise comparisons on the data
> components of an XID for the lengths specified in the XID structure. Most XA routines
> pass a pointer to the XID. These pointers are valid only for the duration of the call. If
> the RM needs to refer to the XID after it returns from the call, it must make a local copy
> before returning.
> /*
> * Transaction branch identification: XID and NULLXID:
> */
> #define XIDDATASIZE 128 /* size in bytes */
> #define MAXGTRIDSIZE 64 /* maximum size in bytes of gtrid */
> #define MAXBQUALSIZE 64 /* maximum size in bytes of bqual */
> struct xid_t {
> long formatID; /* format identifier */
> long gtrid_length; /* value 1-64 */
> long bqual_length; /* value 1-64 */
> char data[XIDDATASIZE];
> };
> typedef struct xid_t XID;
> """
>
> So, essentially, only the global transaction id and the branch id
> are relevant and both are represented in the data string.

One interesting part of that is the "If OSI CCR naming is used, then
the XID's formatID element should be set to 0; if some other format is
used, then the formatID element should be greater than 0."
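
To make the quoted layout concrete, here is a minimal Python sketch of the same fields. The class name `XaXid` and its helpers are hypothetical and exist only for illustration. The size constants and the formatID rules come from the quoted text; zero-padding the data area is my own assumption, since the spec does not say what fills the unused bytes.

```python
from dataclasses import dataclass

XIDDATASIZE = 128   # size in bytes of the data area
MAXGTRIDSIZE = 64   # maximum size in bytes of gtrid
MAXBQUALSIZE = 64   # maximum size in bytes of bqual


@dataclass(frozen=True)
class XaXid:
    """Hypothetical Python mirror of the xid_t struct quoted above."""
    format_id: int
    gtrid: bytes
    bqual: bytes

    def __post_init__(self):
        # The struct comments give "value 1-64" for both lengths.
        if not 1 <= len(self.gtrid) <= MAXGTRIDSIZE:
            raise ValueError("gtrid must be 1-64 bytes")
        if not 1 <= len(self.bqual) <= MAXBQUALSIZE:
            raise ValueError("bqual must be 1-64 bytes")

    @property
    def data(self) -> bytes:
        # gtrid and bqual stored back to back; zero padding is an assumption.
        return (self.gtrid + self.bqual).ljust(XIDDATASIZE, b"\0")

    def describe_format(self) -> str:
        # Mirrors the formatID rules in the quoted paragraph.
        if self.format_id == -1:
            return "null XID"
        if self.format_id == 0:
            return "OSI CCR naming"
        if self.format_id > 0:
            return "other naming"
        return "not covered by the quoted text"


xid = XaXid(0x4765526f, b"global-1", b"branch-a")
```

Note that nothing in the data area itself marks where gtrid ends and bqual begins; the two length fields carry that information.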

I took a quick look at a few J2EE servers (which use XA) to see what
they do for transaction managers.  Neither JBoss nor Geronimo seems to
use formatID=0; both instead use magic numbers that I presume are
intended to let them determine whether they created the transaction
ID.

That said, the selection of format identifiers seems a bit ad-hoc:
Geronimo uses 0x4765526f, which has a byte representation of "GeRo".

It seems that you could do pretty much the same thing by getting TMs
to check the global ID itself ...
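
As a rough sketch of both approaches: the first helper decodes a format ID into its big-endian bytes, and the second shows a TM recognising its own transactions by a marker in the global ID instead. The `TM_PREFIX` value and both function names are hypothetical; only the Geronimo constant comes from what I found.

```python
GERONIMO_FORMAT_ID = 0x4765526f  # value mentioned above


def format_id_as_text(format_id: int) -> str:
    """Render a 32-bit format ID as its four big-endian ASCII bytes."""
    return format_id.to_bytes(4, "big").decode("ascii")


# Hypothetical marker a transaction manager might put at the start of
# every global transaction ID it generates.
TM_PREFIX = b"mytm:"


def created_by_us(gtrid: bytes) -> bool:
    """Check ownership by inspecting the global ID, not the format ID."""
    return gtrid.startswith(TM_PREFIX)


print(format_id_as_text(GERONIMO_FORMAT_ID))
```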

> BTW, there's a nice extension module that lets you hook Python
> between the TM and RM using XA:
>
>     http://www.hare.demon.co.uk/pyxasw/

>
> > I do see a use for the branch qualifier though.  In a distributed
> > transaction, each resource should have a different transaction ID that
> > share a common global transaction ID but separate branch qualifiers.
> >
> > As transaction IDs are global within database clusters for some
> > backends (PostgreSQL, MySQL and probably others), the branch qualifier
> > is necessary if two databases from the cluster are used in the global
> > transaction.
> >
> > I think it is worth making the API such that it is easy to program to
> > best practices.
>
> The DB-API has always tried to not get in the way of how
> a particular backend needs its configuration data, so
> I think we can still have a single string using a database
> backend specific format. This could then include one or more
> of the above id parts.
>
> The implementation can then decode the string representation
> of the transaction id components into whatever format is
> needed by the backend.

The two reasons I see for representing a transaction ID as an object
with a global part and a branch part are:

1. Round-tripping a transaction ID from xa_recover() to
xa_commit()/xa_rollback().
2. Reduced restrictions on the contents of the transaction ID.

For (1), using a database adapter defined object means that it can
represent transactions that originated elsewhere, or expose more
information about those transactions.

For (2), if a database is using specially formatted transaction IDs at
the Python level that get decoded into the various components, does
that mean that the application or transaction manager glue needs to
know how to format the IDs?

In contrast, it is pretty easy for e.g. a Postgres adapter to
serialise/deserialise a multi-part ID (and this is what the JDBC
driver does).
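
A minimal sketch of what such serialisation could look like inside an adapter, for a backend that only accepts a single string as the transaction ID. The encoding (decimal format ID plus base64 parts joined with underscores) is an assumption chosen for illustration, not a copy of the JDBC driver's format, and the function names are hypothetical.

```python
import base64


def serialise_xid(format_id: int, gtrid: bytes, bqual: bytes) -> str:
    """Pack a three-part ID into one string the backend can store."""
    # The standard base64 alphabet has no "_", so it is a safe separator.
    return "_".join([
        str(format_id),
        base64.b64encode(gtrid).decode("ascii"),
        base64.b64encode(bqual).decode("ascii"),
    ])


def deserialise_xid(text: str):
    """Inverse of serialise_xid; raises ValueError on foreign formats."""
    fmt, gtrid, bqual = text.split("_")
    return int(fmt), base64.b64decode(gtrid), base64.b64decode(bqual)


packed = serialise_xid(1, b"order-42", b"db1")
unpacked = deserialise_xid(packed)
```

With this done in the adapter, the application and the transaction manager glue only ever see the separate parts and never need to know the string format.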

> >> Regarding the "xa_" prefix, I'm not much attached to it, but since
> >> the interface does indeed look a lot like the XA interface, why not
> >> make that reference ?
> >
> > Stuart's argument is that if the API differs from XA then using the
> > xa_* prefix could be problematic for adapters that want to expose the
> > XA API.
> >
> > As I don't have any experience with using XA, I can't comment one way
> > or the other about this.
>
> Fair enough. The API does resemble XA a lot, but you're right:
> if there are differences, it's better not to make that link.
>
> >> It also makes it clear that the interface
> >> sits on top of the standard DB-API connection API and that those
> >> methods form a unit.
> >
> > Having a common prefix seems sensible.  If we don't use xa_*,
> > Federico's suggestion of tpc_* might make sense.
>
> Fine, let's use "tpc_".
>
> >> Plus they are currently not in use by any DB-API module, so don't
> >> interfere with existing APIs.
> >
> > So I guess it comes down to the following questions:
> > 1. Are database adapters likely to want to expose more than what is
> > covered by this proposal?
> > 2. Would this proposed API conflict with those extensions?
> >
> > It isn't clear to me that people want to provide a larger API, since
> > the few adapters that have added 2PC support have done so with APIs
> > that are effectively a subset/simplification of this one.
>
> If there's more to expose than what's in the API spec, then
> module authors are free to do so.
>
> In general, the DB-API only
> defines a fully functional common subset of what has to be
> there to use a database backend. Extensions are possible and
> welcome.

I agree with this, and think it is worth keeping extensibility in mind
when designing the API.  My suggestion of using an object to represent
a transaction ID was to make it easier for an adapter to expose more
complex IDs in a fairly localised fashion.
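
To illustrate, here is a toy sketch of how an object-based ID could round-trip through recovery, with two branches sharing one global ID. Everything here is hypothetical: `ToyConnection` is an in-memory stand-in and not a real adapter, and the tpc_* method names simply apply the prefix discussed above to the XA-style operations (`tpc_begin` and `tpc_prepare` in particular are my own guesses).

```python
from collections import namedtuple

# Hypothetical adapter-defined transaction ID object.
Xid = namedtuple("Xid", ["format_id", "gtrid", "bqual"])


class ToyConnection:
    """In-memory stand-in for a connection with 2PC methods."""

    def __init__(self, name):
        self.name = name
        self._current = None
        self._prepared = set()
        self.committed = []

    def tpc_begin(self, xid):
        self._current = xid

    def tpc_prepare(self):
        # Phase one: the branch is now pending and recoverable.
        self._prepared.add(self._current)
        self._current = None

    def tpc_recover(self):
        # Returns Xid objects, so callers never parse strings.
        return sorted(self._prepared)

    def tpc_commit(self, xid):
        self._prepared.remove(xid)
        self.committed.append(xid)

    def tpc_rollback(self, xid):
        self._prepared.remove(xid)


# One global transaction, one branch qualifier per resource.
conns = [ToyConnection("db1"), ToyConnection("db2")]
for i, conn in enumerate(conns):
    conn.tpc_begin(Xid(1, "order-42", "branch-%d" % i))
    conn.tpc_prepare()

# Recovery: whatever tpc_recover() returns is passed straight back.
for conn in conns:
    for xid in conn.tpc_recover():
        conn.tpc_commit(xid)
```

An adapter that needs richer IDs could add attributes to its own Xid type without changing the method signatures.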

> Every now and then, we consider adding those extensions as
> "standard extensions" to the DB-API. This has proven to work
> well in the past.
>
> The two-phase commit methods would be another set of those
> extensions.

Okay.

James.