[XML-SIG] More entity stuff

Lars Marius Garshol larsga@ifi.uio.no
24 Jun 1999 00:04:04 +0200


* Dan Libby
| 
| Okay, so I have a DTD with a bunch of entities copied from the html 3.2
| dtd.  They look like this:
| 
| <!ENTITY cent "&#162;">
| 
| When this is run through xmlproc (xmlval), the entities are ignored.
| sort of. 

This is a bug. I've seen it before, but thought I'd fixed it. The
trouble is that the entity's replacement text is only one character
long (after the character reference is resolved), and xmlproc
mishandles that case.

If you insert a space in the declaration (before or after the
character reference), the problem goes away.
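
A rough sketch of why a one-character entity value gets skipped while
the space workaround helps. It assumes the cause is the do_parse loop
condition that the xmlproc.py patch changes; chunks_seen is a
hypothetical stand-in, not xmlproc code, and treating everything up to
the next "<" as one chunk is a simplification of the real parser.

```python
def chunks_seen(data, patched):
    """Return the text chunks a simplified do_parse-style loop would handle."""
    pos = 0
    datasize = len(data)
    seen = []
    # Old condition: pos + 1 < datasize.  Patched condition: pos < datasize.
    while (pos < datasize) if patched else (pos + 1 < datasize):
        # Simplification: everything up to the next "<" is one chunk.
        end = data.find("<", pos + 1)
        if end == -1:
            end = datasize
        seen.append(data[pos:end])
        pos = end
    return seen


cent = "\u00a2"  # value of <!ENTITY cent "&#162;"> once the reference is resolved

old_result = chunks_seen(cent, patched=False)         # [] : loop body never runs
new_result = chunks_seen(cent, patched=True)          # the single character is handled
workaround = chunks_seen(" " + cent, patched=False)   # two characters: old loop is entered
```

With a single character, datasize is 1, so the old test 0+1<1 fails
before the loop body runs even once. Padding the value with a space
makes datasize 2, which is enough to get into the loop.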

This turned out to be a rather subtle problem, and finding a solution
that passed the regression test satisfactorily took a while.
The patches below seem correct, though.

Thanks for reporting this!


=== xmlproc.py
***************
*** 72,78 ****
      def do_parse(self):
  	"Does the actual parsing."
  	try:
! 	    while self.pos+1<self.datasize:
  		prepos=self.pos
  
  		if self.data[self.pos]=="<":
--- 72,78 ----
      def do_parse(self):
  	"Does the actual parsing."
  	try:
! 	    while self.pos<self.datasize:
  		prepos=self.pos
  
  		if self.data[self.pos]=="<":
***************
*** 437,443 ****
              
  	if ent.is_internal():
  	    self.push_entity(self.get_current_sysid(),ent.value)
! 	    self.do_parse()
  	    self.flush()
  	    self.pop_entity()
  	else:
--- 435,447 ----
              
  	if ent.is_internal():
  	    self.push_entity(self.get_current_sysid(),ent.value)
!             try:
!                 self.do_parse()
!             except OutOfDataException: # Ran out of data before done
!                 self.report_error(3001)
!             except IndexError:         # Ran out of data before done
!                 self.report_error(3001)
!             
  	    self.flush()
  	    self.pop_entity()
  	else:


=== xmlutils.py
***************
*** 116,121 ****
--- 116,122 ----
  	self.last_break=0
  	self.datasize=len(contents)
  	self.last_upd_pos=0
+         self.final=1
  
      def pop_entity(self):
  	"Skips out of the current entity and back to the previous one."

--Lars M.