Tag objects in Beautiful Soup

Peter Otten __peter__ at web.de
Thu Nov 20 16:02:08 CET 2014


Simon Evans wrote:

> Re:'Accessing the Tag object from Beautiful Soup' (page 22-25 - Getting
> Started with Beautiful Soup) So far the code to python27 runs as given in
> the book, re: -
> 
----------------------------------------------------------------------------
>>>> html_atag = """<html><body><p>Test html a tag example</p>
> ... <a href="http://www.packtpub.com'>Home</a>
> ... <a href="http;//www.packtpub.com/books'.Books</a>
> ... </body>
> ... </html>"""
>>>> soup = BeautifulSoup(html_atag,'lxml')
>>>> atag = soup.a
>>>> print(atag)
> <a href="http://www.packtpub.com'>Home</a>
> <a href=" http="">
> </a>
>>>> type(atag)
> <class 'bs4.element.Tag'>
>>>>
>>>> tagname = atag.name
>>>> print tagname
> a
>>>> atag.name = 'p'
>>>> print (soup)
> <html><body><p>Test html a tag example</p>
> <p href="http://www.packtpub.com'>Home</a>
> <a href=" http="">
> </p></body>
> </html>
> 
----------------------------------------------------------------------------
> then under the next Sub heading : 'Attributes of a Tag object'
> text reads :

There is no assignment 

soup_atag = whatever

but there is one to atag. The whole session should work when you omit the 
offending line

> atag = soup_atag.a

or insert

soup_atag = soup

before it.
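
To illustrate the point about the missing name, here is a minimal sketch. It
assumes a hypothetical stand-in class, FakeSoup, in place of the real
BeautifulSoup object, so that it runs without bs4 installed. FakeSoup and the
name_error_raised flag are my own names, not from the book. Only the name
binding matters here.

```python
# Hypothetical stand-in for the BeautifulSoup object from the session, so
# the sketch runs without bs4 installed. Only the .a lookup is imitated.
class FakeSoup(object):
    a = {"href": "http://www.packtpub.com"}

soup = FakeSoup()

# The book's line fails because nothing was ever bound to soup_atag.
try:
    atag = soup_atag.a
    name_error_raised = False
except NameError:
    name_error_raised = True

# Binding the missing name to the existing object fixes the lookup.
soup_atag = soup
atag = soup_atag.a
href = atag["href"]
print(href)
```

After the assignment, soup_atag and soup are two names for the same object, so
in the real session soup_atag.a would return the same tag as soup.a.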

> print (atag['href'])
> 
> #output
> http://www.packtpub.com
> 
> however when I put this code to the console I get error returns at the
> first line re:-
> 
----------------------------------------------------------------------------
>>>> atag = soup_atag.a
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'soup_atag' is not defined
>>>>
> 
----------------------------------------------------------------------------
> Can anyone tell me where I am going wrong or where the text is wrong ?
> So far the given code has run okay, I have put to the console everything
> the text tells you to. Thank you for reading.
> Simon Evans




