How to read between xml tags?

Anthony Liu antonyliu2002 at yahoo.com
Wed Mar 10 17:06:41 EST 2004


Yes, Miki, your code works great to strip the XML tags
and return a clean text file.

But the thing is, I want to process the text between
each pair of tags as soon as it is read in.

For example if I have a tagged XML doc like so:

<tag1>Something here</tag1>
<tag2>something else here</tag2>

I want to get "Something here" in one read operation
and process it before I move on to get "something else
here".

Is there any way to go about this?
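
To make the request concrete, here is a minimal sketch of the behaviour
I am after, built on xml.sax. It is only one possible approach and not
a tested solution. The names PerElementHandler, handle and sample are
made up for illustration, and I have wrapped the two tags in a <doc>
root element, because the fragment above is not well-formed XML on its
own.

```python
import io
from xml.sax import parse
from xml.sax.handler import ContentHandler


class PerElementHandler(ContentHandler):
    """Buffer the text of each element and pass it to a callback
    as soon as that element's closing tag is seen."""

    def __init__(self, process):
        ContentHandler.__init__(self)
        self.process = process  # hypothetical callback: process(name, text)
        self.stack = []         # one text buffer per currently open element

    def startElement(self, name, attrs):
        # Start a fresh buffer for the element that was just opened.
        self.stack.append([])

    def characters(self, content):
        # SAX may deliver one text run in several pieces, so collect them.
        if self.stack:
            self.stack[-1].append(content)

    def endElement(self, name):
        # The element is complete: join its pieces and process them now.
        text = "".join(self.stack.pop()).strip()
        if text:
            self.process(name, text)


def handle(name, text):
    # Placeholder processing step; replace with the real work.
    print("%s: %s" % (name, text))


# Assumption: a single root element wraps the tags from the example.
sample = (b"<doc><tag1>Something here</tag1>\n"
          b"<tag2>something else here</tag2></doc>")
parse(io.BytesIO(sample), PerElementHandler(handle))
```

With this, handle() is called once for tag1 and then once for tag2, in
document order, each time with the complete text of that element. Text
that belongs directly to an element is kept separate from the text of
its children, since every open element has its own buffer.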



--- Miki Tebeka <miki.tebeka at zoran.com> wrote:
> Hello Anthony,
> 
> > 1. The read operation must either read a full tag or
> > ignore the tag.
> > 
> > 2. If the read operation reads between <P> and </P>,
> > then it must read the whole thing between those 2
> > tags all at once.
> > 
> > How can I achieve this please?
> I think the xml.sax module is what you're looking for.
> A small, briefly tested example might be:
> ---
> #!/usr/bin/env python
> 
> from xml.sax.handler import ContentHandler
> from xml.sax import parse
> 
> class ArticleHandler(ContentHandler):
>      def __init__(self, *ignore):
>          ContentHandler.__init__(self)
>          self.data = "" # Data buffer
>          self.get = 0 # Get flag
>          # Ignore hash
>          self.ignore = {}.fromkeys([i.lower() for i in ignore])
> 
>      def startElement(self, name, attrs):
>          if name.lower() in self.ignore:
>              self.get = 0
>          else:
>              self.get = 1
> 
>      def endElement(self, name):
>          self.get = 0
> 
>      def characters(self, content):
>          if self.get:
>              self.data += content
> 
> 
> from sys import argv
> handler = ArticleHandler()
> parse(argv[1], handler)
> print(handler.data) # Will print full data
> 
> handler = ArticleHandler("headline")
> parse(argv[1], handler)
> print(handler.data) # Will print data without headlines
> ---
> 
> HTH.
> Miki

