[XML-SIG] Interface for XSD datatypes
Thomas B. Passin
tpassin@home.com
Thu, 14 Feb 2002 22:37:31 -0500
[Andrew Kuchling]
Andrew, this is very interesting. I've not worked with datatype libraries
so I'd just be guessing at what would be useful. Still, I have a few
suggestions.
> RELAX NG requires support for the primitive datatypes defined in part
> 2 of the XML Schema standard. Support for datatypes would be useful
> outside of the RELAX NG processor, if, for example, someone wanted to
> implement XML Schema in Python or if other standards adopt the XML
> Schema datatypes. Therefore, I'd like to get comments on a proposed
> interface for datatypes.
>
>...
> class DatatypeLibCollection:
>     """
>     Holder of a bunch of datatype libraries.
>     """
>
>     def register (self, library):
>         """register(library : DatatypeLibrary)
>         Add 'library' to the list of libraries available in this
>         collection.
>         """
>
>     def has_uri (self, uri):
#### Suggest naming this "is_registered". I assume that we are going to
#### identify a library by a URI?
>         """has_uri(uri:string) : boolean
>         Returns true if there's a datatype library registered for the
>         given URI.
>         """
>
>     def has_type (self, (uri, type)):
>         """has_uri((uri:string,type:string)) : boolean
>         Returns true if the specified datatype exists.
>         """
#### The has_type docstring says "has_uri", apparently a typo. It should read:
#### has_type((uri:string,type:string)) : boolean
>     def check (self, (uri,type), params, value):
>         """check((uri:string,type:string),
>                  params: {string : string},
>                  value : string) : boolean
>         Returns true if the string 'value' represents a legal
>         value for the datatype selected by the (uri,type) pair,
>         taking the additional parameters 'params' into account.
>         If this method returns false, it might mean either
>         that the uri isn't for any registered library, or
>         that the library is OK and the value is wrong.
>         XXX should this return an explanatory message, or should
>         there be a third function (explain()?)
>         """
#### Wouldn't you sometimes want to know which library (or libraries)
#### said this value was legal?
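#### To make that concrete, here is a rough sketch of a check() that reports
#### which library answered and why a value was rejected. It is only one
#### possible shape. CheckResult, ToyLibrary and is_valid are names I made
#### up for illustration, and it is written for a Python without tuple
#### parameters, so the (uri,type) pair is unpacked inside the method.

```python
import re
from collections import namedtuple

# Hypothetical result record; none of these names come from the proposal.
CheckResult = namedtuple("CheckResult", "ok library_uri reason")

# Namespace URI used here to stand for the XML Schema datatype library.
XSD_URI = "http://www.w3.org/2001/XMLSchema-datatypes"


class ToyLibrary:
    """Stand-in library that only knows a simplified 'integer' type."""

    def __init__(self, uri):
        self.uri = uri

    def has_type(self, type):
        return type == "integer"

    def is_valid(self, type, params, value):
        # Simplified lexical check: optional sign, then ASCII digits.
        return re.fullmatch(r"[+-]?[0-9]+", value.strip()) is not None


class DatatypeLibCollection:
    def __init__(self):
        self._libraries = {}

    def register(self, library):
        self._libraries[library.uri] = library

    def check(self, name, params, value):
        uri, type = name
        library = self._libraries.get(uri)
        if library is None:
            return CheckResult(False, None, "no library registered for %s" % uri)
        if not library.has_type(type):
            return CheckResult(False, uri, "unknown type %r" % type)
        if not library.is_valid(type, params, value):
            return CheckResult(False, uri, "%r is not a legal %s" % (value, type))
        return CheckResult(True, uri, None)
```

#### A caller that only wants the boolean can test result.ok, while
#### result.library_uri and result.reason separate the "unknown URI" case
#### from the "bad value" case.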
> ...
>     def evaluate (self, (uri,type), params, value):
>         """evaluate((uri:string,type:string),
>                     params: {string : string},
>                     value : string) : any
>
>         Evaluate the string 'value' as a value of the datatype
>         selected by the (uri,type) pair, taking the additional
>         parameters 'params' into account. Raises DatatypeValueError
>         if the value is illegal; raises DatatypeURIError if there's
>         no library registered for that URI.
>         """
#### What would "evaluate" mean here? What would its semantics be?
#### What gets returned?
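#### My guess at one possible reading: evaluate() converts the lexical
#### string into a native Python object, for example "42" into the
#### integer 42. The sketch below assumes that reading. The converter
#### table and helper names are made up, only two simplified types are
#### covered, and 'params' is ignored.

```python
import re


class DatatypeValueError(ValueError):
    """Raised when 'value' is not legal for the requested type."""


def _to_integer(text):
    # Check the lexical form first: int() alone accepts more than this.
    if not re.fullmatch(r"[+-]?[0-9]+", text):
        raise DatatypeValueError("%r is not a legal integer" % text)
    return int(text)


def _to_boolean(text):
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise DatatypeValueError("%r is not a legal boolean" % text)


# Hypothetical converter table covering just two types.
_CONVERTERS = {"integer": _to_integer, "boolean": _to_boolean}


def evaluate(type, params, value):
    """Return a Python object for 'value'; 'params' is ignored in this sketch."""
    if type not in _CONVERTERS:
        raise DatatypeValueError("unknown type %r" % type)
    # Drop surrounding whitespace before looking at the lexical form.
    return _CONVERTERS[type](value.strip(" \t\r\n"))
```

#### Under this reading, evaluate("integer", {}, " 42 ") returns the
#### integer 42 and evaluate("integer", {}, "4x") raises
#### DatatypeValueError. If something else is intended, the docstring
#### should probably say so.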
#### I'd like a way to enumerate the registered libraries:
    def list_libs(self):
        '''Return a list of the registered libraries (or maybe their URI
        values). Maybe a dictionary would be better.'''
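#### A minimal sketch of the dictionary flavor, assuming the collection
#### keeps its libraries keyed by URI internally. The _libraries
#### attribute and the stub DatatypeLibrary constructor are my
#### assumptions, not part of the proposal.

```python
class DatatypeLibrary:
    """Stub with only the 'uri' attribute from the proposal."""

    def __init__(self, uri):
        self.uri = uri


class DatatypeLibCollection:
    def __init__(self):
        # Assumed internal storage: URI -> library object.
        self._libraries = {}

    def register(self, library):
        self._libraries[library.uri] = library

    def is_registered(self, uri):
        return uri in self._libraries

    def list_libs(self):
        # Return a copy so callers cannot alter the registry by accident.
        return dict(self._libraries)
```

#### Returning a dictionary gives callers both the URIs (the keys) and the
#### library objects (the values), so a separate list of URIs is not
#### needed.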
>
> class DatatypeLibrary:
>     """
>     Instance attributes:
>       uri : string
>         Namespace URI for this collection of data types
>     """
>
>     def evaluate(self, type, params, value):
>         """evaluate(type:string,
>                     params: {string : string},
>                     value : string) : any
>
>         Evaluate the string 'value' as a value of the datatype
>         selected by the name 'type', taking the additional parameters
>         'params' into account. Raises DatatypeValueError with an
>         explanatory message if the value is illegal.
>         """
#### Same question as for the collection.
>     def has_type (self, type):
>         """has_type(type:string) : boolean
>         Returns true if the library supports a type with the given name.
>         """
#### Might not some libraries want to use a uri:string pair or
#### some other structured convention? Maybe there is a way
#### to allow more flexibility in the definition of "type". Maybe "type"
#### could be an abstract type that, in some subclasses of DatatypeLibrary,
#### would be a string.
>     def is_type_legal (self, type, params):
>         """is_type_legal(type:string, params:{string:string}) : boolean
>         Returns true if the type and corresponding parameters are legal.
>         """
#### Of course, the comment above would ripple through other methods.
>
> def get_xsd_library ():
>     """get_xsd_library(): DatatypeLibrary
>     Return the library for the XML Schema standard's primitive
>     datatypes.
>     """
#### Wouldn't this be a method on the collection?
#### I'd like to see a way to list the types in a given library, some
#### way to enumerate through them:
    def list_datatypes(self):
        '''Returns a list of datatypes in the library'''
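#### Roughly what I have in mind, assuming a library keeps a table from
#### type name to a checking function. The class name, the _checkers
#### table and the two simplified types are invented for the example.

```python
import re


class ToyDatatypeLibrary:
    """Hypothetical library; the checker table is illustrative only."""

    def __init__(self, uri):
        self.uri = uri
        # Map each type name to a function that tests a string.
        self._checkers = {
            "integer": lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
            "boolean": lambda s: s in ("true", "false", "1", "0"),
        }

    def has_type(self, type):
        return type in self._checkers

    def list_datatypes(self):
        # Sorted so that callers see a stable order.
        return sorted(self._checkers)
```

#### With both methods driven by the same table, has_type() and
#### list_datatypes() cannot disagree about what the library supports.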
> I envision the RELAX NG interface ultimately looking like this:
>     schema = relaxng.parse(...)  # Get a relaxng.Schema object with a
>                                  # .datatypecoll attribute.
>                                  # It will initially have just the
>                                  # XML Schema library registered.
>
>     # Add a custom type library
>     schema.datatypecoll.register(myTypeLibrary)
>     # Check a document against the schema
>     print schema.is_valid(...)
>
> The code will live in sandbox/datatype for now. Should it go in the
> XML tree eventually, or be a separate library? If the former, where
> should it go? xml.schema.datatype, xml.datatype, ... ?
>
If we may end up with support for more than one type of schema, perhaps
xml.schema.datatypes would be good.
Cheers,
Tom P