Need help to tree-ify nested parenthesis...

Gregory Nans gregnans at yahoo.PAS.DE.SPAM.fr
Sat Aug 30 17:22:10 EDT 2003


hello,

i need some help to 'tree-ify' a string...

for example i have strings such as :

s = """A(here 's , B(A ) silly test) C(to show D(what kind) of stuff i 
need))""" 

and i need to parse them to have a tree representation such as :

["A", [
	"here 's" , 
	[ "B", [
		'A ) "silly test'
	]],
	["C", [
		"to show",
		["D", [
			"what kind"
		]],
		"of stuff i need",
	]]
]]

notes : "ALLCAPS(something else)" are some kind of tag, "something 
else" may contain ALLCAPS() and thus be nested or contain others 
characters. the parenthesis between 'A' and 'silly test' isn't a 
mistake but a case to handle string is quite well formed = orphan 
parenthesis will always be surrounded by whitespace - like in 'A ) 
silly test' or 'A ( silly test' 

if have started to write a function which give me pos and nested 
string, but still need help to finish to tree-ify it from there... or 
maybe there is a more obvious way ? (i think i didn't need to create 
real finite state automata for this stuff) 

thanks

>>> parse(d)
-1 D what kind 42 @ 51
-1 C to show D(what kind) of stuff i need 32 @ 68
-1 B A ) silly test 14 @ 28
-1 A here 's , B(A ) silly test) C(to show D(what kind) of stuff i 
need) 2 @ 69 


-------------------%<------------------

def parse(s):

	n = len(s)

	p = re.compile(r'([A-Z_]+\()')	
	pos = []
	iterator = p.finditer(s)
	for match in iterator:
		pos.append(match.span())

	j = 0
	pos.reverse()
	for (start, end) in pos:
		opened = 0
		i = end
		while i < n - j:
			if s[i] == '(':
				opened += 1
			if s[i] == ')' and s[i - 1] != ' ':
				opened -= 1
				if opened < 0:
					print opened, s[start:end-1], s[end:i], "%s @ %s" % (end, 
i)
					break
			i += 1





More information about the Python-list mailing list