Mailman 3 BeautifulSoup - Buildbot-status

6 Jul 2020


For one of our casino websites we are trying out the BeautifulSoup module,
which can parse both HTML and XML. It should provide simple methods for
searching, navigating and modifying the parse tree. One question, though:
can we store and push this right away?
The example below prints the links on a webpage, keeping only those whose href starts with "http://":
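# Python 2 with BeautifulSoup 3
from BeautifulSoup import BeautifulSoup
import urllib2
import re
# Download the raw HTML of the page
html_page = urllib2.urlopen("https://www.online-casino-spielautomaten.de")
# Build the parse tree
soup = BeautifulSoup(html_page)
# Print the href of every <a> tag whose href starts with "http://"
for link in soup.findAll('a', attrs={'href': re.compile("^http://")}):
    print link.get('href')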
It downloads the raw HTML with the line:
html_page = urllib2.urlopen("https://www.online-casino-spielautomaten.de")
A BeautifulSoup object is then created from the page, and we use it to find the matching links:
soup = BeautifulSoup(html_page)
for link in soup.findAll('a', attrs={'href': re.compile("^http://")}):
    print link.get('href')
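
One thing worth noting about the filter: the pattern "^http://" only matches plain-HTTP absolute URLs, so links starting with "https://" and relative links are skipped. Below is a minimal sketch of the same filtering idea for Python 3. It does not use BeautifulSoup; it swaps in the standard library's html.parser so it runs without extra packages or network access. The names LinkCollector, extract_links and SAMPLE_HTML, as well as the example.com URLs, are made up for illustration.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags that match a regular expression."""

    def __init__(self, pattern):
        super().__init__()
        self.pattern = re.compile(pattern)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lower case
        if tag != "a":
            return
        for name, value in attrs:
            # value is None when the attribute has no value, e.g. <a href>
            if name == "href" and value and self.pattern.match(value):
                self.links.append(value)


def extract_links(html, pattern=r"^https?://"):
    """Return matching hrefs in document order (hypothetical helper)."""
    collector = LinkCollector(pattern)
    collector.feed(html)
    collector.close()
    return collector.links


# Inline sample page (an assumption, so no download is needed)
SAMPLE_HTML = """
<html><body>
<a href="http://example.com/a">plain http</a>
<a href="https://example.com/b">https</a>
<a href="/relative/c">relative</a>
<a name="anchor-without-href">no href</a>
</body></html>
"""

if __name__ == "__main__":
    # Same pattern as in the post: only the plain-HTTP link is printed
    for link in extract_links(SAMPLE_HTML, r"^http://"):
        print(link)
```

With the default pattern r"^https?://" the helper keeps both the http and the https link, which may be closer to what we want for a site that is served over HTTPS.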

BeautifulSoup

Ash Khalifa
