XML beautifier?
Andreas Jung
ajung at sz-sb.de
Thu Sep 2 13:12:23 EDT 1999
On Thu, Sep 02, 1999 at 05:29:40PM +0200, Alexander Staubo wrote:
> I'm having a gas with the XML package and its DOM classes, but its
> toxml() mechanism outputs mainly flat XML -- no visual structure in the
> form of line breaks or indentation. Is there a Python module that does
> such beautification reasonably hassle-free?
Here is a very simple program that does the job. It splits the input
on tags with a regular expression and tracks the nesting depth. You can
also use sgmllib to parse the file, catch the tags in the
unknown_starttag() and unknown_endtag() methods, and indent the output
accordingly.
Cheers,
Andreas
------------
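import gzip
import re
import sys

fname = sys.argv[1]
# Read the whole document as text; gzip-compressed input is handled too.
if fname.endswith('gz'):
    data = gzip.open(fname, 'rt').read()
else:
    data = open(fname, 'r').read()

# Split on tags. The capturing group keeps the tags in the result,
# and re.DOTALL lets a single tag span several lines.
fields = re.split('(<.*?>)', data, flags=re.DOTALL)
level = 0
for f in fields:
    if f.strip() == '':
        continue
    if f[:2] == '</':
        # Closing tag: step back out one level, then print.
        level = level - 1
        print(' ' * (level * 4) + f)
    elif f[0] == '<' and f[1:2] not in ('?', '!') and f[-2:] != '/>':
        # Opening tag: print, then indent whatever follows.
        print(' ' * (level * 4) + f)
        level = level + 1
    else:
        # Text, empty-element tags, comments and declarations
        # do not change the nesting level.
        print(' ' * (level * 4) + f)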
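
The parser-based alternative mentioned in the message (reacting to
start-tag and end-tag callbacks and indenting accordingly) might look
roughly like the sketch below. It is not the sgmllib code the message
has in mind: sgmllib is not part of the Python 3 standard library, so
xml.parsers.expat stands in for it here as an assumption, with its
StartElementHandler and EndElementHandler playing the role of
unknown_starttag() and unknown_endtag(). The function name indent_xml
and its width parameter are made up for illustration.

```python
import xml.parsers.expat
from xml.sax.saxutils import escape, quoteattr


def indent_xml(text, width=4):
    """Return text re-indented, one tag or text run per line."""
    lines = []
    depth = 0

    def start(name, attrs):
        # Called for every start tag: emit it, then go one level deeper.
        nonlocal depth
        attr_text = ''.join(' %s=%s' % (k, quoteattr(v))
                            for k, v in attrs.items())
        lines.append(' ' * (depth * width) + '<%s%s>' % (name, attr_text))
        depth += 1

    def end(name):
        # Called for every end tag: come back out one level, then emit it.
        nonlocal depth
        depth -= 1
        lines.append(' ' * (depth * width) + '</%s>' % name)

    def chars(data):
        # Whitespace-only runs between tags are dropped.
        if data.strip():
            lines.append(' ' * (depth * width) + escape(data.strip()))

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver each text run in a single call
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(text, True)
    return '\n'.join(lines)


if __name__ == '__main__':
    print(indent_xml('<a><b x="1">hi</b><c/></a>'))
```

Unlike the regular-expression version, this only accepts well-formed
XML, and as written it drops comments, processing instructions and the
XML declaration, and writes an empty element such as <c/> as a
separate start and end tag.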