<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">


<HTML><HEAD>


<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">


<META content="MSHTML 6.00.2600.0" name=GENERATOR>


<STYLE></STYLE>


</HEAD>


<BODY bgColor=#ffffff>


<DIV><FONT face=Arial size=2>Have a look at the DOM Node methods in the Python 
documentation (excerpt pasted below):</FONT></DIV>


<DIV>


<DL>


  <DT><B><A name=l2h-2933><TT class=method>normalize</TT></A></B>() 


  <DD>Join adjacent text nodes so that all stretches of text are stored as 


  single <TT class=class>Text</TT> instances. This simplifies processing text 


  from a DOM tree for many applications. <SPAN class=versionnote>New in version 


  2.1.</SPAN> </DD></DL></DIV>


<DIV><FONT face=Arial size=2>A DOM Document inherits from Node, so you can call 
normalize() on the document itself to merge the split text nodes back together.</FONT></DIV>
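<DIV><FONT face=Arial size=2>Here is a minimal sketch of the idea, assuming a 
small document built by hand rather than your real file. The element name and 
the text are made up for illustration, and the two text nodes are created 
explicitly to imitate the split you are seeing:</FONT></DIV>
<PRE>
```python
from xml.dom import minidom

# Build a tiny document by hand; the element name and text are illustrative only.
doc = minidom.Document()
footer = doc.createElement("Footer")
doc.appendChild(footer)

# Imitate the split: two adjacent Text nodes under a single element.
footer.appendChild(doc.createTextNode("This space provided for legal "))
footer.appendChild(doc.createTextNode("clarification of contract issues"))
nodes_before = len(footer.childNodes)

# normalize() is defined on Node, so calling it on the Document
# merges adjacent Text nodes throughout the whole tree.
doc.normalize()
nodes_after = len(footer.childNodes)
merged_text = footer.firstChild.data

print(nodes_before, nodes_after)
print(merged_text)
```
</PRE>
<DIV><FONT face=Arial size=2>With a parsed file, the same call should work on 
the object returned by minidom.parse(), before you start reading the text 
nodes.</FONT></DIV>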


<DIV><FONT face=Arial size=2></FONT> </DIV>


<DIV><FONT face=Arial size=2>HTH</FONT></DIV>


<DIV><FONT face=Arial size=2></FONT> </DIV>


<DIV><FONT face=Arial size=2>--Gilles</FONT></DIV>


<DIV><FONT face=Arial size=2></FONT> </DIV>


<DIV><<A 


href="mailto:hawkeye.parker@autodesk.com">hawkeye.parker@autodesk.com</A>> wrote 


in news message: <A 


href="mailto:mailman.1042760131.22057.python-list@python.org">mailman.1042760131.22057.python-list@python.org</A>...</DIV>


<BLOCKQUOTE dir=ltr 


style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">


  <DIV><SPAN class=852071323-16012003><FONT face=Arial size=2>i'm running into 


  an odd issue parsing large xml files.  it appears that minidom is 


  arbitrarily splitting some TEXT_NODEs into pieces.  for example, the file 


  in question contains a number of these tags:</FONT></SPAN></DIV>


  <DIV><SPAN class=852071323-16012003><FONT face=Arial 


  size=2></FONT></SPAN> </DIV>


  <DIV><SPAN class=852071323-16012003><FONT face=Arial 


  size=2><C:Footer>This space provided for legal clarification of contract 


  issues as defined by the project participants prior to project initiation. The 


  content herein is determined withing the General Tab of the Log Properties 


  dialogue box</C:Footer></FONT></SPAN></DIV>


  <DIV><SPAN class=852071323-16012003><FONT face=Arial 


  size=2></FONT></SPAN> </DIV>


  <DIV><SPAN class=852071323-16012003><FONT face=Arial size=2>the parser 


  correctly parses the C:Footer tag into a dom element, but for some reason 


  *periodically* splits the child node into *two* text nodes.  i can find 


  no rhyme or reason to the splitting, though it is consistent for a given file; 


  i.e., it always splits the same nodes in the same place.</FONT></SPAN></DIV>


  <DIV><SPAN class=852071323-16012003><FONT face=Arial 


  size=2></FONT></SPAN> </DIV>


  <DIV><SPAN class=852071323-16012003><FONT face=Arial size=2>has anyone else 


  run across this issue?  can you explain 


it?</FONT></SPAN></DIV></BLOCKQUOTE></BODY></HTML>