[Tutor] mxTextTools help: find the deepest in nest lists
Danny Yoo
dyoo at hkn.eecs.berkeley.edu
Wed Dec 10 14:28:55 EST 2003
On Tue, 9 Dec 2003 pan at uchicago.edu wrote:
> I am learning how to use mxTextTools. I need help with
> the following ...
>
> I wanna match the deepest lists in a nested list tree.
> For example:
>
> The [2, 2a] and [3, 3b] in :
>
> [[[1,[2, 2a]],[[3,3b],4]],6]
>
> The [a.1,b_2] and [d.3,e] in :
>
> [[a.1,b_2],[[c,[d.3,e]],f]]
>
> The [a1,a2], [d.1, d.2] and [o3,o4] in :
>
> [[[[[a1,a2],out],[d.1, d.2]],o2],[[o3,o4],o5]]
>
> How can I setup a tagtable for matching them ??
Hi Pan,
I'm not familiar enough with mxTextTools's tag table stuff. Just to make
sure though: are you sure that a text-handling approach is best for this
problem? Is your input a originally a data structure, or a string?
One way to approach this with regular expressions might be something like
this:
###
>>> import re
>>> def find_innermost_nodes(tree_string):
... regex = re.compile(r"""\[ ## literal open brace
... [^\[]+ ## one or more non-brace
... ## characters
... \] ## literal close brace
... """, re.VERBOSE)
... return regex.findall(tree_string)
...
>>> find_innermost_nodes("[[[1,[2, 2a]],[[3,3b],4]],6]")
['[2, 2a]]', '[3,3b],4]],6]']
###
Ooops! Forgot to make the search nongreedy:
###
>>> def find_innermost_nodes(tree_string):
... regex = re.compile(r"""\[ ## literal open brace
... [^\[]+? ## one or more non-brace
... ## characters, nongreedily
... \] ## literal close brace
... """, re.VERBOSE)
... return regex.findall(tree_string)
...
>>> find_innermost_nodes("[[[1,[2, 2a]],[[3,3b],4]],6]")
['[2, 2a]', '[3,3b]']
###
That's better. *grin* Aren't the tag tables similar to regular
expressions?
That being said, I think this approach is approaching the problem in a way
that doesn't generalize well. If we had some tree structure, I'd approach
this with something like:
###
def is_innermost_node(node):
if is_leaf(node): return false
return is_leaf(left(node)) and is_leaf(right(node))
def find_innermost_nodes(node):
if is_leaf(node):
return []
if is_innermost_node(node):
return [node]
return (find_innermost_nodes(left(node)) +
find_innermost_nodes(right(node)))
###
assuming that we have some functions for seeing if a node is a leaf
(is_leaf()), and well as ways of getting the left() and right() children
of a tree node.
Good luck to you!
More information about the Tutor
mailing list