[XML-SIG] DOM: Whitespace, and subclassing

Fred L. Drake, Jr. <fdrake@acm.org>
Tue, 20 Oct 1998 19:59:55 -0400 (EDT)

Andrew M. Kuchling writes:
 > 	Question 1: Does such a lose-the-whitespace method seem
 > generally useful enough to be added to the package?
 > 	2) If so, should it be added to the core, or put in a
 > xml.dom.util module?

  Since this is, as you point out, dependent on an application's
knowledge of how the data is used, it should be separate.  How useful
such processing is also depends on whether a validating parser is
available.  A validating parser would allow more general knowledge to
be applied, so less application-specific interpretation would be
needed.  In either case (validating or non-validating parser), the
notion of data normalization (as opposed to DOM's node-based
normalization) should be distinct from the DOM implementation.
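
  To illustrate the distinction, here is a small sketch.  It assumes
the standard library's xml.dom.minidom as a stand-in for the DOM
package under discussion.  Node-based normalization only merges
adjacent Text nodes; it leaves the whitespace in the data alone.

```python
from xml.dom import minidom

doc = minidom.parseString("<a/>")
root = doc.documentElement

# Two adjacent Text nodes, each carrying extra whitespace.
root.appendChild(doc.createTextNode("  one "))
root.appendChild(doc.createTextNode(" two  "))
nodes_before = len(root.childNodes)

# DOM normalize() merges the adjacent Text nodes into one ...
root.normalize()
nodes_after = len(root.childNodes)

# ... but the character data, whitespace included, is unchanged.
merged_data = root.firstChild.data
```

  Trimming or collapsing that whitespace is a decision about the
data, which is why it belongs outside the DOM implementation.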

 > 	3) Other whitespace normalizations are possible, such as
 > dropping leading and/or trailing whitespace on Text nodes, or
 > shortening runs of whitespace characters down to a single character.
 > Should these be made available?  Anyone care to suggest an interface?

  Yes, they should be available, as long as they're clearly documented 
and separate from the actual DOM implementation.  They should be
destructive on the DOM tree; if you want to work on a copy, make a
copy.  Otherwise, I don't know what the interface should be.  I think
a simple function that accepts two required parameters would do: the
Document object and a node to start from (to allow operating on a
subtree).  Additional parameters would be specific to the
transformation.
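
  A minimal sketch of what such functions might look like, assuming
xml.dom.minidom from the standard library rather than the package
discussed here.  The function names and the strip_ends parameter are
hypothetical; only the (document, node) calling convention comes from
the suggestion above.  Both functions modify the tree in place.

```python
import re
from xml.dom import minidom, Node


def _check_owner(document, node):
    # The start node must be the document itself or belong to it.
    if node is not document and node.ownerDocument is not document:
        raise ValueError("node does not belong to document")


def drop_whitespace_nodes(document, node):
    """Remove whitespace-only Text nodes below `node` (destructive)."""
    _check_owner(document, node)
    for child in list(node.childNodes):  # copy: we mutate while iterating
        if child.nodeType == Node.TEXT_NODE:
            if not child.data.strip():
                node.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            drop_whitespace_nodes(document, child)


def collapse_whitespace(document, node, strip_ends=False):
    """Shorten runs of whitespace in Text nodes to one space (destructive).

    strip_ends is a transformation-specific extra parameter: when true,
    leading and trailing whitespace is dropped as well.
    """
    _check_owner(document, node)
    for child in node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            data = re.sub(r"\s+", " ", child.data)
            child.data = data.strip() if strip_ends else data
        elif child.nodeType == Node.ELEMENT_NODE:
            collapse_whitespace(document, child, strip_ends)


doc = minidom.parseString("<a>\n  <b>  hello   world </b>\n</a>")
drop_whitespace_nodes(doc, doc.documentElement)
after_drop = doc.documentElement.toxml()
collapse_whitespace(doc, doc.documentElement, strip_ends=True)
after_collapse = doc.documentElement.toxml()
```

  A caller who wants the original preserved would copy the subtree
first (for example with cloneNode(True)) and run the function on the
copy.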

 > 	Question: Does subclassing basic DOM classes seem useful for
 > some purpose?  If so, how could it be made possible?  
 > 	Perhaps .create*() could take an optional klass = <class
 > object> argument, and verify that klass is a subclass of the original
 > class.  However, inside core.py, nodes are often created directly,
 > without calling the Document object's .create*() method, in order to
 > save a method call, and that code would have to be changed to use the
 > Document object's factory functions.

  This is a real trade-off problem; performance is generally an issue
with some of the XML stuff I've been doing (because the user is
waiting for a dialog).  Whether the use of subclassed DOM objects
requires the use of factories depends on how the framework operates.
If you want to be able to "install" a subclass as the default for a
particular purpose, maybe.  (Not strictly, since this is Python, but
it would help maintainability.)  If you only want to be able to create
the subclassed variants on demand, the problem pretty much gets shoved
off to the builder.
  Perhaps some explanation of the motivation would help make the value 
of this clear.
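
  For concreteness, a rough sketch of the two approaches mentioned
above: an optional klass argument checked by the factory method, and
a subclass "installed" as the default.  All class and attribute names
here are hypothetical stand-ins and are not the ones in core.py.

```python
class Element:
    """Stand-in for a basic DOM element class."""

    def __init__(self, document, tag_name):
        self.ownerDocument = document
        self.tagName = tag_name


class Paragraph(Element):
    """An application-specific subclass with an extra convenience method."""

    def is_paragraph(self):
        return True


class Document:
    # The "installed" default; a subclass or instance may override it.
    element_class = Element

    def createElement(self, tag_name, klass=None):
        if klass is None:
            klass = self.element_class
        # Verify that klass really is a subclass of the basic class.
        if not (isinstance(klass, type) and issubclass(klass, Element)):
            raise TypeError("klass must be a subclass of Element")
        return klass(self, tag_name)


class ParagraphDocument(Document):
    # Installing a subclass as the default for a particular purpose.
    element_class = Paragraph


doc = Document()
plain = doc.createElement("section")
on_demand = doc.createElement("para", klass=Paragraph)
installed = ParagraphDocument().createElement("para")
```

  Either approach only works if every node is created through the
factory method, which is where the cost of the extra method call
comes in.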


Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives
1895 Preston White Dr.	    Reston, VA  20191