[XML-SIG] Interface for XSD datatypes

Thomas B. Passin tpassin@home.com
Thu, 14 Feb 2002 22:37:31 -0500


[Andrew Kuchling]

Andrew, this is very interesting.  I've not worked with datatype libraries
so I'd just be guessing at what would be useful.  Still, I have a few
suggestions.

> RELAX NG requires support for the primitive datatypes defined in part
> 2 of the XML Schema standard.  Support for datatypes would be useful
> outside of the RELAX NG processor, if, for example, someone wanted to
> implement XML Schema in Python or if other standards adopt the XML
> Schema datatypes.  Therefore, I'd like to get comments on a proposed
> interface for datatypes.
>
>...
> class DatatypeLibCollection:
>     """
>     Holder of a bunch of datatype libraries.
>     """
>
>     def register (self, library):
>         """register(library : DatatypeLibrary)
>         Add 'library' to the list of libraries available in this
>         collection.
>         """
>
>     def has_uri (self, uri):
#### I suggest naming this "is_registered".  I assume that we are going to
#### identify a library by a uri?
>         """has_uri(uri:string) : boolean
>         Returns true if there's a datatype library registered for the
>         given URI.
>         """
>
>     def has_type (self, (uri, type)):
>         """has_uri((uri:string,type:string)) : boolean
>         Returns true if the specified datatype exists.
>         """
#### An apparent typo, the intended form being
####    has_type((uri:string,type:string)) : boolean


>     def check (self, (uri,type), params, value):
>         """check((uri:string,type:string),
>                  params: {string : string},
>                  value : string) : boolean
>         Returns true if the string 'value' represents a legal
>         value for the datatype selected by the (uri,type) pair,
>         taking the additional parameters 'params' into account.
>         If this method returns false, it might mean either
>         that the uri isn't for any registered library, or
>         that the library is OK and the value is wrong.
>         XXX should this return an explanatory message, or should
>         there be a third function (explain()?)
>         """
#### Wouldn't you sometimes want to know which library (or libraries)
#### said this value was legal?
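#### A minimal sketch of one way to get at this, assuming a hypothetical
#### explain() method that returns an (ok, message) pair so the caller can
#### tell an unregistered uri from a bad value.  The names explain,
#### TinyLibrary, is_legal and the example uri are made up for illustration
#### and are not part of the proposal; the (uri, type) pair is unpacked in
#### the method body so that the sketch also runs on newer Pythons.

```python
class TinyLibrary:
    """Hypothetical stand-in for a DatatypeLibrary with a single type."""
    uri = "http://example.org/tiny"

    def has_type(self, type):
        return type == "integer"

    def is_legal(self, type, params, value):
        # Simplified check: optional sign followed by ASCII digits.
        text = value.strip()
        if text[:1] in ("+", "-"):
            text = text[1:]
        return text.isascii() and text.isdigit()


class Collection:
    """Hypothetical collection that reports why a check failed."""

    def __init__(self):
        self._libs = {}

    def register(self, library):
        self._libs[library.uri] = library

    def explain(self, name, params, value):
        """Return (ok, message); the message names the library or the failure."""
        uri, type = name
        library = self._libs.get(uri)
        if library is None:
            return (False, "no library registered for %r" % uri)
        if not library.has_type(type):
            return (False, "library %r has no type %r" % (uri, type))
        if not library.is_legal(type, params, value):
            return (False, "%r is not a legal %s" % (value, type))
        return (True, "accepted by library %r" % uri)
```

#### check() could then stay a plain boolean built on top of explain().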
> ...
>     def evaluate (self, (uri,type), params, value):
>         """evaluate((uri:string,type:string),
>                     params: {string : string},
>                     value : string) : any
>
>         Evaluate the string 'value' as a value of the datatype
>         selected by the (uri,type) pair, taking the additional
>         parameters 'params' into account.  Raises DatatypeValueError
>         if the value is illegal; raises DatatypeURIError if there's
>         no library registered for that URI.
>         """
#### What would "evaluate" mean here?  What would its semantics be?
#### What gets returned?
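#### My guess, as a sketch only: evaluate() would turn the lexical string
#### into a native Python value (an int for "integer", True/False for
#### "boolean", and so on).  DatatypeValueError is the name from the
#### proposal; the converter table and _to_boolean are my own assumptions,
#### and int() is more lenient than the XML Schema lexical rules.

```python
class DatatypeValueError(ValueError):
    """Raised when a string is not a legal value for the datatype."""


def _to_boolean(text):
    # XML Schema's boolean lexical forms are 'true', 'false', '1' and '0'.
    mapping = {"true": True, "1": True, "false": False, "0": False}
    return mapping[text]


# Hypothetical table: datatype name -> function producing a Python value.
_CONVERTERS = {
    "integer": int,
    "boolean": _to_boolean,
    "string": str,
}


def evaluate(type, params, value):
    """Return a native Python value for 'value'.

    'params' is accepted but ignored in this sketch.  An unknown type
    name raises KeyError; an illegal value raises DatatypeValueError.
    """
    convert = _CONVERTERS[type]
    # Strings keep their whitespace; other types are trimmed first.
    text = value if type == "string" else value.strip()
    try:
        return convert(text)
    except (KeyError, ValueError):
        raise DatatypeValueError("%r is not a legal %s" % (value, type))
```

#### If that is the intent, check() is just "did evaluate() raise or not".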

#### I'd like a way to enumerate the registered libraries:
    def list_libs(self):
        '''Return a list of the registered libraries (or maybe their uri
        values); maybe a dictionary keyed by uri would be better.'''
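
#### Roughly what I have in mind, assuming the collection keeps its
#### libraries in a dictionary keyed by uri.  The dictionary-based storage,
#### the _libraries attribute and ExampleLibrary are assumptions of mine;
#### register() is from the proposal and is_registered() is the renamed
#### has_uri().

```python
class DatatypeLibCollection:
    """Sketch of the collection, keyed by each library's uri attribute."""

    def __init__(self):
        self._libraries = {}

    def register(self, library):
        self._libraries[library.uri] = library

    def is_registered(self, uri):
        return uri in self._libraries

    def list_libs(self):
        # Return a copy so callers cannot alter the registry by accident.
        return dict(self._libraries)


class ExampleLibrary:
    """Hypothetical library; only the uri attribute matters here."""
    uri = "http://example.org/datatypes"


coll = DatatypeLibCollection()
coll.register(ExampleLibrary())
registered_uris = sorted(coll.list_libs())
```

#### Returning a dictionary gives both the uri values and the library
#### objects in one call.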

>
> class DatatypeLibrary:
>     """
>     Instance attributes:
>       uri : string
>         Namespace URI for this collection of data types
>     """
>
>     def evaluate(self, type, params, value):
>         """evaluate(type:string,
>                     params: {string : string},
>                     value : string) : any
>
>         Evaluate the string 'value' as a value of the datatype
>         selected by the name 'type', taking the additional parameters
>         'params' into account.  Raises DatatypeValueError with an
>         explanatory message if the value is illegal.
>         """
#### Same question as for the collection.

>     def has_type (self, type):
>         """has_type(type:string) : boolean
>         Returns true if the library supports a type with the given name.
>         """
#### Might not some libraries want to use a uri:string pair or
#### some other structured convention?  Maybe there is a way
#### to allow more flexibility in the definition of "type".  Maybe "type"
#### could be an abstract type that, in some subclasses of DatatypeLibrary,
#### would be a string.

>     def is_type_legal (self, type, params):
>         """is_type_legal(type:string, params:{string:string}) : boolean
>         Returns true if the type and corresponding parameters are legal.
>         """
#### Of course, the comment above would ripple through other methods.
>
> def get_xsd_library ():
>     """get_xsd_library(): DatatypeLibrary
>     Return the library for the XML Schema standard's primitive
>     datatypes.
>     """
#### Wouldn't this be a method on the collection?

#### I'd like to see a way to list the types in a given library, some
#### way to enumerate through them:

    def list_datatypes(self):
        '''Return a list of the datatypes in the library'''
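
#### A sketch of how a custom library might support this, assuming it
#### keeps a table of its type names.  SimpleLibrary, its uri, and the
#### "color" and "yesno" types are invented for the example; has_type()
#### is from the proposal.

```python
class SimpleLibrary:
    """Hypothetical custom library with two made-up datatypes."""
    uri = "http://example.org/simple"

    def __init__(self):
        # Each type name maps to the set of strings it accepts.
        self._types = {
            "color": {"red", "green", "blue"},
            "yesno": {"yes", "no"},
        }

    def has_type(self, type):
        return type in self._types

    def list_datatypes(self):
        # Sorted so that the order is predictable for callers.
        return sorted(self._types)


library = SimpleLibrary()
type_names = library.list_datatypes()
```

#### With this in place has_type() is just a membership test on the same
#### table.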

> I envision the RELAX NG interface ultimately looking like this:
> schema = relaxng.parse(...) # Get a relaxng.Schema object with a
>                             # .datatypecoll attribute.
>                             # It will initially have just the
>                             # XML Schema library registered.
>
> # Add a custom type library
> schema.datatypecoll.register(myTypeLibrary)
> # Check a document against the schema
> print schema.is_valid(...)
>
> The code will live in sandbox/datatype for now.  Should it go in the
> XML tree eventually, or be a separate library?  If the former, where
> should it go?  xml.schema.datatype, xml.datatype, ... ?
>

Since we may end up with support for more than one type of schema, perhaps
xml.schema.datatypes would be good.

Cheers,

Tom P