How to find <tag> to </tag> HTML strings and 'save' them?

Mark Crowther mark at agtechnical.co.uk
Mon Mar 26 02:23:54 CEST 2007


Yep, I agree! Once I've got this done I'll be back to trawling the
tutorials.
Life never gives you the convenience of learning something fully
before having to apply what you have learnt ;]

Thanks for the feedback and links, I'll be sure to check those out.

Mark.

On Mar 26, 12:05 am, "Gabriel Genellina" <gagsl-... at yahoo.com.ar>
wrote:
> On Sun, 25 Mar 2007 19:44:17 -0300, <m... at agtechnical.co.uk> wrote:
>
> > from BeautifulSoup import BeautifulSoup
> > import re
>
> > page = open("soup_test/tomatoandcream.html", 'r')
> > soup = BeautifulSoup(page)
>
> > myTagSearch = str(soup.findAll('h2'))
>
> > myFile = open('Soup_Results.html', 'w')
> > myFile.write(myTagSearch)
> > myFile.close()
>
> > del myTagSearch
> > ...............................
>
> > Firstly, I'm getting the following character: "[" at the start, "]" at
> > the end of the code. Along with "," in between each tag line listing.
> > This seems like normal behaviour but I can't find the way to strip
> > them out.
>
> findAll() returns a list. You convert the list to its string
> representation using str(...), and that's what lists look like: with
> [] around them, and commas separating the elements. If you don't like
> that, don't use str(some_list).
> Do you want one item per line? Drop the str() around findAll() and use
> "\n".join(str(tag) for tag in myTagSearch); the list items are Tag
> objects, so each one needs its own str().
> Do you want comma-separated items? Use
> ",".join(str(tag) for tag in myTagSearch)
> Read about lists here: http://docs.python.org/lib/typesseq.html and
> strings here: http://docs.python.org/lib/string-methods.html
>
> For the remaining questions, I strongly suggest reading the Python  
> Tutorial (or any other book like Dive into Python). You should grasp some  
> basic knowledge of the language at least, before trying to use other tools  
> like BeautifulSoup; it's too much for a single step.
>
> --
> Gabriel Genellina
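
The advice quoted above about str(list) versus join() can be sketched as
follows. This is only an illustration, not code from the thread: FakeTag is
a hypothetical stand-in for the Tag objects that findAll() returns, so the
example runs without BeautifulSoup installed, and the two h2 strings are
made-up sample data.

```python
# Hypothetical stand-in for the Tag objects that findAll() returns,
# so this sketch runs without BeautifulSoup installed.
class FakeTag:
    def __init__(self, html):
        self.html = html

    def __str__(self):
        return self.html

    # A list displays its items with repr(), so make repr() match str().
    __repr__ = __str__


# Pretend this list came from soup.findAll('h2') (sample data, assumed).
tags = [FakeTag("<h2>Tomato</h2>"), FakeTag("<h2>Cream</h2>")]

# str() on the whole list keeps the surrounding [] and the commas.
as_list_text = str(tags)

# Converting each item and joining gives one tag per line, no brackets.
one_per_line = "\n".join(str(tag) for tag in tags)

# Comma-separated variant of the same idea.
comma_separated = ",".join(str(tag) for tag in tags)

print(as_list_text)
print(one_per_line)
```

With real BeautifulSoup results the same join() lines should apply
unchanged, and one_per_line is the string you would pass to
myFile.write() in place of myTagSearch.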




