converting text and spans to an ElementTree
Neil Cerutti
horpner at yahoo.com
Thu May 24 15:52:29 EDT 2007
On 2007-05-24, Neil Cerutti <horpner at yahoo.com> wrote:
> On 2007-05-23, Steven Bethard <steven.bethard at gmail.com> wrote:
> You mean... I left out the hard part? Shucks. I had really
> hoped it didn't matter.
>
>> * the recursive (or stack) part assigns children to parents
>> * the non-recursive part assigns text or tail to the previous element
>> (note that's previous in a sequential sense, not a recursive sense)
>>
>> I'm sure I could implement this recursively, passing around
>> another appropriate argument, but it wasn't obvious to me
>> that the code would be any cleaner.
>
> Moreover, it looks like you have experience in writing that
> sort of code. I'd have never even attempted it without
> recursion, but that's merely exposing one of my limitations. ;)
You'll be happy to know I found a way to salvage my simple
recursive solution, and make it generate an ElementTree!

# Import assumed; the original post did not show it.
from xml.etree import ElementTree as etree

def get_tree(text, spans):
    """
    >>> text = 'aaa aaa aaabbb bbbaaa'
    >>> spans = [
    ...     (etree.Element('a'), 0, 21),
    ...     (etree.Element('b'), 11, 18),
    ...     (etree.Element('c'), 18, 18),
    ... ]

    I'd like to produce the corresponding ElementTree. So I want to write a
    get_tree() function that works like::

    >>> etree.tostring(get_tree(text, spans))
    '<a>aaa aaa aaa<b>bbb bbb<c /></b>aaa</a>'
    """
    def helper(text, spans):
        # Render the first span as a tag and recurse on the remaining spans.
        # Note: the text is interpolated into the markup without escaping.
        if not spans:
            return ''
        else:
            head, tail = spans[0], spans[1:]
            elem, start, end = head
            if tail:
                # Offsets of the next span, which is rendered inside this one.
                _, follow_start, follow_end = tail[0]
            else:
                follow_start, follow_end = (end, end)
            if end > start:
                return ("<%s>%s%s%s</%s>" %
                        (elem.tag,
                         text[start:follow_start],
                         helper(text, tail),
                         text[follow_end:end],
                         elem.tag))
            else:
                # Zero-length span: emit an empty element.
                return "<%s />%s" % (elem.tag, helper(text, tail))
    # Parse the generated markup back into an Element.
    return etree.XML(helper(text, spans))

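
For comparison, here is a minimal sketch of the stack-based approach described in the quoted text above: a stack assigns children to parents, and a separate sequential step assigns each run of text to the text or tail of the previous element. It is not Steven's actual code. The names get_tree_stack, emit and close_top are hypothetical, and the sketch assumes Python 3 (for nonlocal), spans sorted by start offset and properly nested, and a first span that encloses all the others.

```python
from xml.etree import ElementTree as etree


def get_tree_stack(text, spans):
    """Build the tree directly from (element, start, end) spans.

    Assumptions (not from the original post): spans are sorted by start
    offset, properly nested, and the first span encloses all the others.
    """
    root, root_start, root_end = spans[0]
    stack = [(root, root_end)]  # open elements, innermost last
    last = root                 # most recently opened or closed element
    last_is_open = True         # True: text goes to last.text, else last.tail
    pos = root_start            # how much of text has been consumed

    def emit(upto):
        # Sequential part: attach text[pos:upto] to the previous element.
        nonlocal pos
        chunk = text[pos:upto]
        if chunk:
            if last_is_open:
                last.text = (last.text or '') + chunk
            else:
                last.tail = (last.tail or '') + chunk
        pos = upto

    def close_top():
        # Close the innermost open element, flushing its remaining text.
        nonlocal last, last_is_open
        elem, end = stack.pop()
        emit(end)
        last, last_is_open = elem, False

    for elem, start, end in spans[1:]:
        # Close open elements that end too early to contain the new span.
        while len(stack) > 1 and stack[-1][1] < end:
            close_top()
        emit(start)
        stack[-1][0].append(elem)  # stack part: child goes to innermost parent
        stack.append((elem, end))
        last, last_is_open = elem, True

    while len(stack) > 1:
        close_top()
    emit(root_end)
    return root
```

Because the text is stored on the elements rather than pasted into markup, characters such as '<' and '&' need no escaping here, and sibling spans are handled as well as nested ones. Calling get_tree_stack(text, spans) with the text and spans from the docstring above should yield the same structure: 'a' containing 'b', which contains the empty 'c'.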
But at least I learned just a *little* about XML and Python
during this arduous process. ;)
--
Neil Cerutti
The concert held in Fellowship Hall was a great success. Special thanks are
due to the minister's daughter, who labored the whole evening at the piano,
which as usual fell upon her. --Church Bulletin Blooper