<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>Have a look at the methods of the DOM Node in the
Python documentation (the relevant entry is pasted below):</FONT></DIV>
<DIV>
<DL>
<DT><B><A name=l2h-2933><TT class=method>normalize</TT></A></B>()
<DD>Join adjacent text nodes so that all stretches of text are stored as
single <TT class=class>Text</TT> instances. This simplifies processing text
from a DOM tree for many applications. <SPAN class=versionnote>New in version
2.1.</SPAN> </DD></DL></DIV>
<DIV><FONT face=Arial size=2>A DOM Document inherits from Node, so you can call
normalize() on the parsed document to merge the split text nodes back
together.</FONT></DIV>
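<DIV><FONT face=Arial size=2>A minimal sketch of the idea, assuming the
standard xml.dom.minidom module. The document is built by hand here (rather
than parsed from your file) only so that the split is reproducible, and the
footer text is shortened for illustration:</FONT></DIV>

```python
from xml.dom import minidom

# Build a tiny document by hand with two adjacent text nodes,
# mimicking what the parser produced for the C:Footer element.
doc = minidom.Document()
footer = doc.createElement("C:Footer")
doc.appendChild(footer)
footer.appendChild(doc.createTextNode("This space provided for legal "))
footer.appendChild(doc.createTextNode("clarification of contract issues"))

pieces_before = len(footer.childNodes)  # two separate Text nodes

# Document inherits normalize() from Node; it walks the whole tree
# and joins adjacent Text nodes into single Text instances.
doc.normalize()

pieces_after = len(footer.childNodes)   # one merged Text node
footer_text = footer.firstChild.data

print(pieces_before, pieces_after)
print(footer_text)
```

<DIV><FONT face=Arial size=2>With a real file you would call normalize() on
the object returned by minidom.parse() before reading the text
nodes.</FONT></DIV>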
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>HTH</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV><FONT face=Arial size=2>--Gilles</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT> </DIV>
<DIV>&lt;<A
href="mailto:hawkeye.parker@autodesk.com">hawkeye.parker@autodesk.com</A>&gt;
wrote in news message: <A
href="mailto:mailman.1042760131.22057.python-list@python.org">mailman.1042760131.22057.python-list@python.org</A>...</DIV>
<BLOCKQUOTE dir=ltr
style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">
<DIV><SPAN class=852071323-16012003><FONT face=Arial size=2>i'm running into
an odd issue parsing large xml files. it appears that minidom is
arbitrarily splitting some TEXT_NODEs into pieces. for example, the file
in question contains a number of these tags:</FONT></SPAN></DIV>
<DIV><SPAN class=852071323-16012003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=852071323-16012003><FONT face=Arial
size=2>&lt;C:Footer&gt;This space provided for legal clarification of contract
issues as defined by the project participants prior to project initiation. The
content herein is determined withing the General Tab of the Log Properties
dialogue box&lt;/C:Footer&gt;</FONT></SPAN></DIV>
<DIV><SPAN class=852071323-16012003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=852071323-16012003><FONT face=Arial size=2>the parser
correctly parses the C:Footer tag into a dom element, but for some reason
*periodically* splits the child node into *two* text nodes. i can find
no rhyme or reason to the splitting, though it is consistent for a given file;
i.e., it always splits the same nodes in the same place.</FONT></SPAN></DIV>
<DIV><SPAN class=852071323-16012003><FONT face=Arial
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=852071323-16012003><FONT face=Arial size=2>has anyone else
run across this issue? can you explain
it?</FONT></SPAN></DIV></BLOCKQUOTE></BODY></HTML>