[XML-SIG] [4XSLT] bug report and patch for complex XML and XSL nesting

Olivier CAYROL (Logilab) Olivier.Cayrol@logilab.fr
Thu, 19 Apr 2001 20:44:05 +0200 (CEST)



  Hello,

  I found a vicious bug in 4XSLT (hidden very deep in the code).
Attached to this message you will find a tar.gz file containing a
directory tree that exhibits the bug. It is a little application for
managing the distribution of Easter rabbits and eggs (!). There is an
XML file that contains the data (easter_mng.xml), an XSL transformation
file (xsl/transf.xsl) and XML files containing localization data
(lib/common.xml, lib/en.xml, lib/fr.xml).

  The lib/common.xml file is loaded by the XSLT stylesheet with the
'document()' function and is used to insert language-dependent tags in
the output. This common.xml file in turn imports other XML files (one
per language) with the classic external ENTITY mechanism of XML.

  When I tried to transform the data file from the main directory with
the following command line:
    4xslt -Dlang=en  easter_mng.xml xsl/transf.xsl
I got this exception:
    ...
      File "/usr/lib/python1.5/site-packages/xml/xslt/XsltFunctions.py", line 63, in Document
        doc = context.stylesheet._docReader.fromUri(uri, baseUri=baseUri)
      File "/usr/lib/python1.5/site-packages/Ft/Lib/ReaderBase.py", line 67, in fromUri
        rt = self.fromStream(stream, baseUri, ownerDoc, stripElements)
      File "/usr/lib/python1.5/site-packages/Ft/Lib/pDomlette.py", line 578, in fromStream
        raise FtException(Error.XML_PARSE_ERROR, p.ErrorLineNumber, p.ErrorColumnNumber, expat.ErrorString(p.ErrorCode))
      Ft.Lib.FtException: ('XML parse error at line 16, column 2: error in processing external entity reference', (16, 2, 'error in processing external entity reference'))

  The problem occurs when 4XSLT reads the XML document referenced in the
'document()' function: this XML file declares external entities that
import parts of the XML tree through paths relative to the document's
own directory, whereas in 4XSLT the baseUri is always the URI of the
initial XSLT stylesheet. The XML reader therefore cannot find the
external entities, and the parse fails.
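
  The failure can be reproduced outside 4XSLT. The following is only a
minimal sketch in present-day Python 3 that uses the standard
xml.parsers.expat module directly; it is not the 4Suite reader. The file
contents, the read_text() and build_tree() helpers and the POSIX-style
temporary paths are assumptions made for illustration. The same document
is parsed twice: once with its own location as base, and once with the
stylesheet location as base, which is the situation described above.

```python
import os
import tempfile
import xml.parsers.expat
from urllib.parse import urljoin

# Hypothetical stand-ins for lib/common.xml and lib/en.xml.
COMMON_XML = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE labels [<!ENTITY en SYSTEM "en.xml">]>\n'
    '<labels>&en;</labels>\n'
)
EN_XML = '<label>Easter egg</label>'


def build_tree(root):
    """Create root/lib/common.xml, root/lib/en.xml and an empty root/xsl/."""
    os.makedirs(os.path.join(root, "lib"))
    os.makedirs(os.path.join(root, "xsl"))
    for name, text in (("common.xml", COMMON_XML), ("en.xml", EN_XML)):
        with open(os.path.join(root, "lib", name), "w") as out:
            out.write(text)


def read_text(path, base_uri):
    """Parse `path`, resolving external entities against `base_uri`."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.SetBase(base_uri)
    parser.CharacterDataHandler = chunks.append

    def load_entity(context, base, system_id, public_id):
        # The system identifier is resolved against the base given to
        # the parser, not against the file actually being read.
        target = urljoin(base, system_id)
        if not os.path.exists(target):
            return 0  # makes expat raise an external entity error
        # Handles one level of entity nesting, enough for this sketch.
        child = parser.ExternalEntityParserCreate(context)
        with open(target, "rb") as stream:
            child.ParseFile(stream)
        return 1

    parser.ExternalEntityRefHandler = load_entity
    with open(path, "rb") as stream:
        parser.ParseFile(stream)
    return "".join(chunks).strip()


def demo():
    """Return (text read with the right base, error seen with the wrong one)."""
    with tempfile.TemporaryDirectory() as root:
        build_tree(root)
        common = os.path.join(root, "lib", "common.xml")
        stylesheet = os.path.join(root, "xsl", "transf.xsl")
        good = read_text(common, base_uri=common)
        try:
            read_text(common, base_uri=stylesheet)
            bad = None
        except xml.parsers.expat.ExpatError as exc:
            bad = str(exc)
        return good, bad
```

  With the document's own path as base, en.xml is found next to
common.xml. With the stylesheet path as base, the handler looks for
xsl/en.xml, finds nothing, and expat raises a parse error.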

  Replacing line 67 of Ft/Lib/ReaderBase.py, in the
DomletteReader.fromUri function:
    rt = self.fromStream(stream, baseUri, ownerDoc, stripElements)
with:
    newBaseUri = urllib.basejoin(baseUri, uri)
    rt = self.fromStream(stream, newBaseUri, ownerDoc, stripElements)
fixes the bug.
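
  To show what the new base URI changes, here is a small sketch of the
URI arithmetic only. It assumes Python 3, where urllib.parse.urljoin
plays the role of urllib.basejoin. The resolve_entity() helper, the
file:///project/ prefix and the "../lib/common.xml" reference are
hypothetical and do not come from the attached archive.

```python
from urllib.parse import urljoin


def resolve_entity(stylesheet_uri, document_href, system_id, patched=True):
    """Return the URI at which an external entity would be looked up.

    stylesheet_uri: URI of the initial stylesheet (the old baseUri)
    document_href:  argument given to document() in the stylesheet
    system_id:      SYSTEM identifier of the entity in that document
    """
    # URI of the document loaded by document(); this is the value the
    # patch computes as newBaseUri.
    document_uri = urljoin(stylesheet_uri, document_href)
    # Unpatched behaviour: entities are resolved against the stylesheet.
    base_uri = document_uri if patched else stylesheet_uri
    return urljoin(base_uri, system_id)


STYLESHEET = "file:///project/xsl/transf.xsl"   # hypothetical location
HREF = "../lib/common.xml"                      # hypothetical reference

patched_uri = resolve_entity(STYLESHEET, HREF, "en.xml")
unpatched_uri = resolve_entity(STYLESHEET, HREF, "en.xml", patched=False)
```

  Under these assumptions the patched lookup points into lib/, next to
common.xml, while the unpatched one points into xsl/, where no en.xml
exists.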

  I initially found the bug while trying to process Norman Walsh's XSLT
stylesheets for turning DocBook files into XSL Formatting Objects files
(I am unfortunately not working for the Easter Rabbit).

  Regards,

    O. CAYROL.
_________________________________________________________________________
Olivier CAYROL                                   LOGILAB - Paris (France)
                                                 http://www.logilab.com/
Change your millennium, try NARVAL the Intelligent Personal Assistant.
Changez de millénaire, essayez NARVAL l'Assistant Personnel Intelligent.
_________________________________________________________________________

[Attachment: easter_mng.tar.gz (application/x-gzip), "bug exhibitor"]