Help with using findAll() in BeautifulSoup

Alexnb alexnbryan at
Sat Jul 12 05:46:02 CEST 2008

Okay, I am not sure if there is a better way of doing this than findAll(), but
that is how I am doing it right now. I am making an app that screen-scrapes for
definitions. However, I would like to have the type of the word for each
definition. For example, if def1 and def2 are noun definitions but def3 isn't:


Something like that. Now, I can get the definitions just fine. The problem
comes when I want to get the type. I can get the types, but I don't know which
definitions they go with. So I can get noun and verb, but for all I know noun
is def1, and verb is def2 and def3. I am wondering if there is a way to use
findAll() but have it stop once it hits a certain thing, or some other way to
do just that. For example, if I have

<table blah>
<table blah>
<table blah>

I want to be able to do something like findAll('span', {'class': 'pg'}), but
have it tell me how many <table> elements come after each match and before the
next one, so I know how many definitions it has.
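
One way to approach the grouping described above is to walk the page in document order, start a new group at every <span class="pg">, and count each <table> toward the most recent group. The following is only a minimal sketch of that idea. It uses the standard library's HTMLParser instead of BeautifulSoup so it runs on its own, and the names DefinitionGrouper, group_definitions and SAMPLE, as well as the sample markup, are assumptions made for illustration (the real page layout is not shown here). It also assumes each <table> holds exactly one definition.

```python
from html.parser import HTMLParser


class DefinitionGrouper(HTMLParser):
    """Count the <table> elements that follow each <span class="pg">."""

    def __init__(self):
        super().__init__()
        self.groups = []        # one [type_text, table_count] pair per span
        self._in_type = False   # True while inside a <span class="pg">

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("class") == "pg":
            # A new word type starts a new group.
            self.groups.append(["", 0])
            self._in_type = True
        elif tag == "table" and self.groups:
            # Assumption: one table per definition.
            self.groups[-1][1] += 1

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_type = False

    def handle_data(self, data):
        if self._in_type:
            self.groups[-1][0] += data.strip()


def group_definitions(html):
    """Return a list of (word_type, number_of_definitions) tuples."""
    parser = DefinitionGrouper()
    parser.feed(html)
    parser.close()
    return [(name, count) for name, count in parser.groups]


# Hypothetical markup, made up for this sketch.
SAMPLE = """
<span class="pg">noun</span>
<table><tr><td>def1</td></tr></table>
<table><tr><td>def2</td></tr></table>
<span class="pg">verb</span>
<table><tr><td>def3</td></tr></table>
"""

print(group_definitions(SAMPLE))  # [('noun', 2), ('verb', 1)]
```

The same loop could probably be written with BeautifulSoup by passing a list of tag names, as in findAll(['span', 'table']), which returns matches in document order, and then checking each result's name and class.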

Here is the code I am using (I used "cheese" because that is kind of my test
word for everything in the app):

import urllib
from BeautifulSoup import BeautifulSoup

class defWord:
    def __init__(self, word):
        self.word = word

        def get_types(term):
            soup = BeautifulSoup(urllib.urlopen('' % term))

            for tabs in soup.findAll('span', {'class': 'pg'}):
                yield tabs.contents[0].string

        self.mainList = list(get_types(self.word))
        print self.mainList

type = defWord("cheese")

I don't know if this is really something anyone can help me fix or if I have
to do it on my own. But I would love some help. 
