[Pythonmac-SIG] XML Parsing

Bryan Smith bryanabsmith at gmail.com
Sat Feb 7 05:00:44 CET 2009


Hi everyone,

I have another question I'm hoping someone would be kind enough to answer. I
am new to parsing XML (not to mention much of Python itself) and I am trying
to parse an XML file. The file I am trying to parse is this one:
http://ws.audioscrobbler.com/2.0/user/bryansmith/topalbums.xml.

So far, I have written a class to parse this file so that I can show the
user a list of the top albums on their last.fm profile. Note that the artist
name and the album name are both marked up with the <name> tag, which makes
my job harder; if the tag names were different, I wouldn't have a problem.
The class I have written is listed below. My question is this: is there a
way to say something like "if tag_name == album name tag then ... elif
tag_name == artist name tag ..."? I hope this is clear.

As it stands right now, if I parse this file and print the results in the
form "album (playcount)", this is what I get (understandably): Vheissu (332),
Thrice (289), The Artist in the Ambulance (286), Thrice (210) and so on.
Thrice is the artist name, not an album. I want to be able to differentiate
between the artist's <name> tag and the album's <name> tag.


Class as it stands right now:

from xml.sax.handler import ContentHandler

class GetTopAlbums(ContentHandler):

    in_album_tag = False
    in_playcount_tag = False

    def __init__(self, album, playcount):
        ContentHandler.__init__(self)
        self.album = album
        self.playcount = playcount
        self.data = []

    def startElement(self, tag_name, attr):
        if tag_name == "name":
            self.in_album_tag = True
        elif tag_name == "playcount":
            self.in_playcount_tag = True

    def endElement(self, tag_name):
        if tag_name == "name":
            content = "".join(self.data)
            self.data = []
            self.album.append(content)
            self.in_album_tag = False
        elif tag_name == "playcount":
            content = "".join(self.data)
            self.data = []
            self.playcount.append(content)
            self.in_playcount_tag = False

    def characters(self, string):
        # Collect text only while inside a <name> or <playcount> element.
        if self.in_album_tag or self.in_playcount_tag:
            self.data.append(string)
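
One idea I could imagine working, as a rough sketch only: keep a list of the
currently open element names and look at the parent of each <name> tag. The
sample XML, the class name TopAlbumsByParent and the assumption that the
artist's <name> sits inside an <artist> element within each <album> are all
made up for illustration; I have not checked this against the real feed.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Hypothetical sample mimicking the ASSUMED feed layout: each <album> has
# its own <name> and <playcount>, plus a nested <artist><name>.
SAMPLE = b"""<topalbums user="bryansmith">
  <album rank="1">
    <name>Vheissu</name>
    <playcount>332</playcount>
    <artist><name>Thrice</name></artist>
  </album>
  <album rank="2">
    <name>The Artist in the Ambulance</name>
    <playcount>286</playcount>
    <artist><name>Thrice</name></artist>
  </album>
</topalbums>"""


class TopAlbumsByParent(ContentHandler):
    """Tell album names from artist names by checking the parent element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.path = []        # names of the currently open elements
        self.data = []        # text chunks seen since the last start tag
        self.albums = []
        self.artists = []
        self.playcounts = []

    def startElement(self, tag_name, attrs):
        self.path.append(tag_name)
        self.data = []        # drop whitespace left over between elements

    def characters(self, string):
        self.data.append(string)

    def endElement(self, tag_name):
        content = "".join(self.data).strip()
        self.data = []
        # path[-1] is the element being closed, path[-2] is its parent.
        parent = self.path[-2] if len(self.path) > 1 else None
        if tag_name == "name" and parent == "album":
            self.albums.append(content)
        elif tag_name == "name" and parent == "artist":
            self.artists.append(content)
        elif tag_name == "playcount":
            self.playcounts.append(content)
        self.path.pop()


handler = TopAlbumsByParent()
xml.sax.parseString(SAMPLE, handler)
for album, artist, count in zip(handler.albums, handler.artists,
                                handler.playcounts):
    print("%s by %s (%s)" % (album, artist, count))
```

If the real file nests these tags differently, the parent names compared in
endElement would need to change to match.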

Thanks in advance!
Bryan