[Tutor] how to extract text by specifying an element using ElementTree

Kent Johnson kent37 at tds.net
Thu Dec 8 20:43:55 CET 2005


ps python wrote:
> Hi, 
> 
> using ElementTree, how can I extract text of a
> particular element, or a child node. 
> 
> For example:
> 
> <biological_processess>
>    <biological_process>
>            Signal transduction
>    </biological_process>
>    <biological_process>
>            Energy process
>     </biological_process>
> </biological_processess>
> 
> In the case where I already know which element tags
> have the information that I need, in such case how do
> i get that specific text. 

Use findall() to get the nodes of interest (find() returns only the 
first match). The text attribute of each node contains its text. For 
example:

data = '''<biological_processess>
    <biological_process>
            Signal transduction
    </biological_process>
    <biological_process>
            Energy process
     </biological_process>
</biological_processess>
'''

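# ElementTree is in the standard library as xml.etree since Python 2.5
from xml.etree import ElementTree

# fromstring() parses the string and returns the root element
tree = ElementTree.fromstring(data)

# findall() returns the direct children of the root that have this tag
for process in tree.findall('biological_process'):
    print(process.text.strip())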


prints
Signal transduction
Energy process

You will have to modify the path in the findall to match your actual 
data, assuming what you have shown is just a snippet.
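
As a rough sketch of what changing the path might look like, suppose the 
snippet sits a couple of levels down in a larger document. The <protein> 
and <annotations> wrapper elements here are made up for illustration; 
they are not from the original question, so substitute whatever your real 
data uses.

```python
from xml.etree import ElementTree

# Hypothetical larger document: <protein> and <annotations> are invented
# wrappers, used only to show how the path changes with nesting.
data = '''<protein>
    <annotations>
        <biological_processess>
            <biological_process>
                Signal transduction
            </biological_process>
            <biological_process>
                Energy process
            </biological_process>
        </biological_processess>
    </annotations>
</protein>
'''

root = ElementTree.fromstring(data)

# Option 1: spell out the path from the root down to the elements of interest
path = 'annotations/biological_processess/biological_process'
by_path = [p.text.strip() for p in root.findall(path)]

# Option 2: './/' searches all descendants, whatever their depth
anywhere = [p.text.strip() for p in root.findall('.//biological_process')]

print(by_path)
print(anywhere)
```

The explicit path is stricter and only matches elements at that exact 
position; the './/' form is handy when you do not know or care how deeply 
the elements are nested.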

I stripped whitespace from the text because otherwise it includes the 
newlines and indents exactly as in the original.
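
To see what the unstripped text looks like, here is a small sketch using 
a one-element version of the data above. It uses find(), which returns 
only the first matching child, and repr() to make the whitespace visible.

```python
from xml.etree import ElementTree

data = '''<biological_processess>
    <biological_process>
            Signal transduction
    </biological_process>
</biological_processess>
'''

# find() returns the first matching child element (or None if there is none)
process = ElementTree.fromstring(data).find('biological_process')

raw = process.text
print(repr(raw))          # includes the newlines and indentation from the XML
print(repr(raw.strip()))  # only the words themselves
```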

Kent


