[XML-SIG] Anything else to go in?

Ken MacLeod ken@bitsko.slc.ut.us
14 Oct 1998 11:11:31 -0500


"Andrew M. Kuchling" <akuchlin@cnri.reston.va.us> writes:

> Ken MacLeod writes:
> >The LDOBinary module was implemented as much by empirical testing
> >as by reading pickle.py and others, so I'm not sure how much I can
> >help with a truly lossless implementation :-).  I have a good feel
> >for the pattern I described above so I can help make sure what we
> >come up with is consistent with that.  Let me know what you'd like
> >me to do.
> 
> 	Well, the simple types such as integers and floats are easy.
> They're fairly similar across languages; one might worry about
> different integer sizes and FP precisions, but the unmarshalling
> code might simply implement a "best effort".  For example, on a
> 64-bit platform you might generate <value
> type="integer">18014398509481984</value>, which won't fit in a
> 32-bit integer; a 32-bit platform confronted with this would have to
> deal with it by converting to a long or BigInteger or whatever.
> I'll proceed for now with "integer", "long", "float", "complex",
> "string", and see how that goes, but perhaps the type attribute
> should have values like i32/i64, instead, and be more precise about
> the limits.

Ah, I thought you were talking about a more Python-specific
implementation.  For LDO, you've run into one of our more interesting
design issues, so I welcome more people thinking about it.

LDO is geared towards the scripting, CGI, property list, untyped
container level -- in other words, where strings, numerics, and
booleans are distinguished by usage not by predeclared type.  Even the
more complex objects are typed more by convention than by strict
adherence to an object structure.  In general, most of the languages
we've surveyed handle this well: they either coerce primitive types by
usage or they have explicit interfaces that can be queried to force
coercion.

Python is one of the languages that doesn't fit either category.  I
first recognized this while writing LDOBinary: Python overloads string
operations onto numeric operators.
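
A small illustration of that point (just a sketch, not code from
LDOBinary): `+` and `*` are defined for strings as well as numbers,
and Python never coerces a string to a number on its own, so usage
alone does not settle the type.

```python
# Python defines + and * for strings, so the operator alone does not
# tell you whether a value is meant to be numeric.
concatenated = "12" + "3"      # string concatenation
repeated = "12" * 3            # string repetition
total = int("12") + 3          # numeric addition needs an explicit conversion

try:
    "12" + 3                   # str and int are not coerced to a common type
    mixed_ok = True
except TypeError:
    mixed_ok = False
```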

We discussed this on the Casbah mailing list and IRC and came to the
preliminary conclusion that we would like to slide Python into the
latter category, where Python clients and servers will need wrappers
or IDLs to talk to non-Python peers.

One specific item is important: the question is not whether we
can add type information to the format (we can); the question is
whether storage back-ends and other languages have or carry the type
information that would need to be encoded (several don't).

This is a very deep issue and I don't want to make it sound like it's
a done deal.  Our biggest need is for more information, and I don't mind
at all keeping the issue open.  Since we would prefer to keep LDO on the
typeless end, we are currently continuing down that path.

About type sizes: yes, it is the receiver's responsibility to check for
overflow.  For explicitly typed languages the size is known; for
implicitly typed languages the largest type necessary should be used.
Python's long integer definitely qualifies here.
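
As a rough sketch of what "the receiver checks" could look like, here
is a range test applied to the example value from the quoted message.
The helper names fits_signed and unmarshal_integer are made up for
illustration and are not part of LDO.

```python
def fits_signed(value, bits):
    """Return True if value is representable as a signed integer of `bits` bits."""
    limit = 1 << (bits - 1)
    return -limit <= value < limit


def unmarshal_integer(text, native_bits=32):
    # Hypothetical receiver-side helper.  Python integers are arbitrary
    # precision, so int(text) itself never overflows; the flag tells the
    # caller whether the value would fit a fixed-size native integer.
    value = int(text)
    return value, fits_signed(value, native_bits)


value, fits_32 = unmarshal_integer("18014398509481984")
_, fits_64 = unmarshal_integer("18014398509481984", native_bits=64)
```

Here the value is 2**54, so fits_32 is False and fits_64 is True; a
receiver limited to 32 bits would have to fall back to a larger type.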

> 	Objects are a whole other kettle of fish.  In Python they
> could basically be dictionaries mapping from attribute -> value.
> But how would one specify the class of the object?  In Python
> there's an attribute called __class__, but that shouldn't creep into
> LDO.

On the brighter side :-), when it comes to ``objects''
(a.k.a. property lists, records, entries, etc.) the types or classes
are whatever the creator or owner says they are.  In LDO, it is the
client's responsibility to encode objects with either the server's
desired type/class name or an agreed-upon type/class name.  From the
server side, feel free to copy Python's __class__ to the LDO XML
`type' attribute.
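
A minimal sketch of the server-side idea, assuming an element layout
invented purely for illustration (the "object" and "field" element
names are not the LDO XML format; only the `type' attribute comes from
the discussion above):

```python
import xml.etree.ElementTree as ET


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def object_to_element(obj, type_name=None):
    # Illustrative layout only.  The class name is copied into the
    # `type' attribute unless the caller supplies an agreed-upon name.
    elem = ET.Element("object", {"type": type_name or obj.__class__.__name__})
    for name, value in sorted(vars(obj).items()):
        field = ET.SubElement(elem, "field", {"name": name})
        field.text = str(value)
    return elem


default_xml = ET.tostring(object_to_element(Point(1, 2)), encoding="unicode")
agreed_xml = ET.tostring(object_to_element(Point(1, 2), "geo.Point"),
                         encoding="unicode")
```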

The pattern I've thought of using for the client end is to have a
per-connection type-mapping delegate.  Each client-side LDOConnection
will have a type-mapping delegate that it will provide for the
marshaler (delegates can be shared between connections).  The marshaler
can then call the delegate with the local type and it will return the
server's required type, or vice-versa.  The client-side type doesn't
even necessarily have to be a valid local class, as long as it gets
carried with the object and can be handed back to the server as
necessary.
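
The delegate pattern might look roughly like this.  It is a sketch
under my own assumptions: TypeMappingDelegate, Connection, Customer
and the "crm.Customer" name are all hypothetical, and Connection only
stands in for a client-side LDOConnection.

```python
class TypeMappingDelegate:
    """Hypothetical delegate: translates type names between client and server."""

    def __init__(self, local_to_server):
        self._to_server = dict(local_to_server)
        self._to_local = {s: l for l, s in self._to_server.items()}

    def server_type(self, local_type):
        # Unknown names pass through unchanged so they can be carried along.
        return self._to_server.get(local_type, local_type)

    def local_type(self, server_type):
        # A server type with no local class is kept as an opaque name.
        return self._to_local.get(server_type, server_type)


class Connection:
    """Stand-in for a client-side connection; not the real LDOConnection."""

    def __init__(self, delegate):
        self.delegate = delegate

    def type_name_for(self, obj):
        # A marshaler would ask the delegate for the server's type name.
        return self.delegate.server_type(type(obj).__name__)


class Customer:
    pass


shared = TypeMappingDelegate({"Customer": "crm.Customer"})
conn_a = Connection(shared)
conn_b = Connection(shared)   # one delegate shared by two connections
```

Passing unknown names through unchanged is one way to let a type that
has no local class still travel with the object and go back to the
server intact.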

The delegate may be a possible solution to the type issue above,
handling type-conversion as specified by configuration or IDL.  The
delegate may also be a good place to interface to a Python object's
__getstate__ and __setstate__ methods.
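
For the __getstate__/__setstate__ part, a sketch of how a delegate or
marshaler could obtain and restore an object's state.  The helper
names extract_state and restore_state and the Session class are
assumptions for illustration; only the two special methods are
Python's own.

```python
def extract_state(obj):
    # Prefer the object's own idea of its state, as pickle does.
    if hasattr(obj, "__getstate__"):
        return obj.__getstate__()
    return dict(vars(obj))


def restore_state(cls, state):
    # Create the instance without calling __init__, then hand it the state.
    obj = cls.__new__(cls)
    if hasattr(obj, "__setstate__"):
        obj.__setstate__(state)
    else:
        obj.__dict__.update(state)
    return obj


class Session:
    def __init__(self, user):
        self.user = user
        self.cache = {}          # transient, should not be marshalled

    def __getstate__(self):
        return {"user": self.user}

    def __setstate__(self, state):
        self.user = state["user"]
        self.cache = {}


state = extract_state(Session("ken"))
restored = restore_state(Session, state)
```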

-- 
  Ken MacLeod
  ken@bitsko.slc.ut.us