[Tutor] trouble with objects, instances, type, and None in BeautifulSoup (and in general?)

Clay Wiedemann clay.wiedemann at gmail.com
Wed Mar 7 15:59:21 CET 2007


Still learning, please bear with me if my lingo is a little off. But I
think I have a better handle on my problem.

My objective:
1. get a starting point on a web page, walk down the page until
hitting an HR tag
2. Along the way, test for certain markers that allow me to get
various strings and compile them. For example, the name of a speaker
always appears within a B tag. Please don't help me with this one . . .
(yet)

My approach:
- get the series of starting points on a page, then use a "for in" loop
- within that loop (and here is where the trouble occurs) look for an
HR in the .name of the current node. At that point, go to the next
node.

My trouble:
- the .name and .string attributes trip me up:
--- .name can fail when the Soup gives me a NavigableString instead of a Tag
--- .string can return None
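
A minimal sketch of the approach and the two pitfalls above, offered as an illustration rather than a tested answer: it walks forward through .next until it reaches an HR, and reads .name and .string defensively. The names FakeTag, FakeText, link, node_name, and walk_until_hr are all hypothetical. The two stand-in classes only mimic the .name, .string, and .next attributes so the snippet runs without BeautifulSoup; with a real Soup, the nodes would come from findAll instead.

```python
class FakeTag(object):
    """Hypothetical stand-in for a Tag: has .name, .string, and .next."""
    def __init__(self, name, string=None):
        self.name = name
        self.string = string
        self.next = None


class FakeText(str):
    """Hypothetical stand-in for a NavigableString: text with .next but no .name."""
    next = None


def link(nodes):
    """Chain the stand-in nodes through .next, in document order."""
    for current, following in zip(nodes, nodes[1:]):
        current.next = following
    return nodes[0]


def node_name(node):
    """Return the lowercased tag name, or None for a text node."""
    name = getattr(node, "name", None)  # no exception if .name is missing
    return name.lower() if name else None


def walk_until_hr(start):
    """Yield nodes from start onward, stopping before the first HR tag."""
    node = start
    while node is not None and node_name(node) != "hr":
        yield node
        node = node.next


page = link([
    FakeTag("a"), FakeTag("b", "Speaker"), FakeText("Speaker"),
    FakeText(" says hi"), FakeTag("hr"), FakeTag("p", "after the rule"),
])

seen = []
for node in walk_until_hr(page):
    name = node_name(node)
    if name is None:
        seen.append(("text", str(node)))        # a text node, not a tag
    else:
        seen.append((name, node.string or ""))  # .string may be None
print(seen)
```

The point of node_name is that every test on .name goes through getattr with a default, so a text node is just "no name" rather than an error, and the loop itself replaces the recursion.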

I've tried "do while" and even recursion and various conditionals but
keep messing up. So if anyone can show me what is wrong with my code
and/or my approach, that would be great. Maybe a simple type
conversion is needed somewhere?
Would love help with this part and then try objective #2 on my own.

Here's some code showing the recursion + ugly conditionals attempt:

- - - - - - -

def findName(start_point):
	"""
	unnecessary use of recursion? perhaps.
	moves down the HTML tree, returning a name
	only when it exists.
	written to avoid NavigableString.
	"""
	
	print "------- running findName -------"
	if start_point.name:
		if start_point.name == "None":
			print "You got None, baby!"
			nextNode = start_point.next
			print nextNode
			findName(nextNode)
		else:
			print "got a name?"
			return start_point.name
	else:
		print "not a name"
		print "going to next node"
		nextNode = start_point.next
		findName(nextNode)


quotations = quotepage.findAll('a', attrs = {'name' : re.compile("^qt")})


for q in quotations:
	"""
	testing for .next since current position has a name
	I need a failure to challenge the function
	"""
	position = q.next
	my_nextname = findName(position)
	print my_nextname

- - - - - - -
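
For comparison, a loop-based sketch of what findName appears to be aiming for. It is a guess at the intent, not a verified fix. It differs from the code above in two ways: the result is returned from the loop (the recursive calls above discard their result, so the caller ends up with None), and a missing name is detected with getattr and a default rather than by comparing against the string "None". The names find_name, FakeTag, and FakeText are hypothetical; the stand-in classes only mimic .name and .next so the snippet runs without BeautifulSoup.

```python
class FakeTag(object):
    """Hypothetical stand-in for a Tag: has .name and .next."""
    def __init__(self, name):
        self.name = name
        self.next = None


class FakeText(str):
    """Hypothetical stand-in for a NavigableString: text with .next but no .name."""
    next = None


def find_name(start_point):
    """Walk .next from start_point; return the first tag name found, or None."""
    node = start_point
    while node is not None:
        name = getattr(node, "name", None)  # text nodes yield None here
        if name:
            return name
        node = node.next
    return None  # ran off the end of the document without finding a tag


# Two text nodes followed by a B tag, chained through .next
text = FakeText("some text")
more = FakeText("more text")
tag = FakeTag("b")
text.next = more
more.next = tag

print(find_name(text))  # prints: b
```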

Thanks for any help!
- Clay

