[XML-SIG] Character entities (XHTML)

Andrew Cooke andrewc@webtronfinance.com
Tue, 07 May 2002 16:51:57 -0400


Hi,

In the example below I am losing the XHTML character entities.  How do I
avoid this?
I've also posted to the ng - apologies to people seeing this same question
twice.

Andrew

Input file:
<html>
  <head>
    <link type="text/css" rel="stylesheet" href="basic.css" />
    <title>Index</title>
  </head>
  <body>
  <h1>¡Hola!</h1>
<a href="init">initialisaci&oacute;n</a>
  </body>
</html>

And when this is processed, I see (note that the SGML entity &lt; does
appear, but oacute and iexcl don't):
F:\home\Andrew\multi\src\xhtml>python
Python 2.2.1 (#34, Apr  9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from xml.dom.ext.reader.Sax2 import FromXmlFile
>>> from xml.dom.ext import PrettyPrint
>>> PrettyPrint(FromXmlFile("index.xhtml"))
<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE html>
<html xmlns='http://www.w3.org/1999/xhtml'>
  <head>
    <meta content='HTML Tidy for Cygwin (vers 1st April 2002), see www.w3.org' name='generator'/>
    <link href='basic.css' rel='stylesheet' type='text/css'/>
    <title>Index</title>
  </head>
  <body>
    <h1>&lt;Hola!</h1>
    <a href='init'>initialisacin</a>
  </body>
</html>
>>>
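
For comparison, here is a smaller sketch of the same document, using only the standard library. It rests on several assumptions that are not part of the session above: it uses modern Python 3 and xml.dom.minidom rather than the PyXML Sax2 reader, it writes the h1 text with &iexcl;, and it declares the two entities in an internal DTD subset so the parser does not need to fetch the external XHTML DTD. The helper name text_of is made up for illustration. Under those assumptions the entities come through as ordinary characters, which suggests (but does not prove) that the loss above is related to the entity declarations not being read.

```python
from xml.dom.minidom import parseString

# Assumption: entities are declared inline instead of via the XHTML DTD.
DOC = """<?xml version="1.0"?>
<!DOCTYPE html [
  <!ENTITY oacute "&#243;">
  <!ENTITY iexcl "&#161;">
]>
<html>
  <body>
    <h1>&iexcl;Hola!</h1>
    <a href="init">initialisaci&oacute;n</a>
  </body>
</html>
"""


def text_of(xml_source, tag):
    """Return the concatenated text children of the first <tag> element."""
    dom = parseString(xml_source)
    node = dom.getElementsByTagName(tag)[0]
    return "".join(
        child.data
        for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


if __name__ == "__main__":
    print(text_of(DOC, "h1"))
    print(text_of(DOC, "a"))
```

With the declarations in place, the text of both elements should contain the accented characters rather than dropping them.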