[Tutor] Unable to download <th>, <td> using Beautifulsoup
Crusier
crusier at gmail.com
Fri Jul 29 03:28:25 EDT 2016
I am using Python 3 on Windows 7, but I am unable to download some of
the data listed on the following web site:
http://data.tsci.com.cn/stock/00939/STK_Broker.htm
1453.IMC 98.28M 18.44M 4.32 5.33
1499.Optiver 70.91M 13.29M 3.12 5.34
7387.花旗环球 52.72M 9.84M 2.32 5.36
When I use Google Chrome and choose 'View Page Source', the data does not
show up at all. However, when I use 'Inspect', I am able to read the
data.
'<th>1453.IMC</th>'
'<td>98.28M</td>'
'<td>18.44M</td>'
'<td>4.32</td>'
'<td>5.33</td>'
'<th>1499.Optiver </th>'
'<td> 70.91M</td>'
'<td>13.29M </td>'
'<td>3.12</td>'
'<td>5.34</td>'
Please kindly explain whether the data is hidden in a CSS style sheet, and
whether there is any way to retrieve the data listed.
Thank you
Regards, Crusier
from bs4 import BeautifulSoup
import requests

stock_code = ('00939', '0001')


def web_scraper(stock_code):
    broker_url = 'http://data.tsci.com.cn/stock/'
    end_url = '/STK_Broker.htm'

    for code in stock_code:
        # Build the broker page URL for this stock code.
        new_url = broker_url + code + end_url
        response = requests.get(new_url)
        html = response.content
        soup = BeautifulSoup(html, "html.parser")
        # Look up the two <div> containers by their id attributes.
        Buylist = soup.find_all('div', id="BuyingSeats")
        Selllist = soup.find_all('div', id="SellSeats")
        print(Buylist)
        print(Selllist)


web_scraper(stock_code)
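
As a side check, the cells copied from 'Inspect' can be parsed offline to
confirm what the extracted rows should look like. The following is only a
rough sketch under some assumptions: it uses the standard library's
html.parser instead of BeautifulSoup so that it runs without extra
packages, the SAMPLE string is pasted by hand from the tags quoted above,
and the names CellCollector and rows_from_cells are made up for
illustration. It does not fetch anything from the site, so it does not by
itself explain why the downloaded page lacks the data.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <th> and <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []      # list of (tag, text) pairs
        self._tag = None     # name of the cell tag currently open, if any
        self._parts = []     # text fragments seen inside that cell

    def handle_starttag(self, tag, attrs):
        if tag in ("th", "td"):
            self._tag = tag
            self._parts = []

    def handle_data(self, data):
        if self._tag is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Strip the stray spaces seen in cells such as '<td> 70.91M</td>'.
            self.cells.append((tag, "".join(self._parts).strip()))
            self._tag = None


def rows_from_cells(cells):
    """Start a new row at each <th>; the <td> values that follow join it."""
    rows = []
    for tag, text in cells:
        if tag == "th":
            rows.append([text])
        elif rows:
            rows[-1].append(text)
    return rows


# Hypothetical sample: markup pasted by hand from the 'Inspect' view above.
SAMPLE = (
    "<th>1453.IMC</th><td>98.28M</td><td>18.44M</td><td>4.32</td><td>5.33</td>"
    "<th>1499.Optiver </th><td> 70.91M</td><td>13.29M </td><td>3.12</td><td>5.34</td>"
)

parser = CellCollector()
parser.feed(SAMPLE)
parser.close()
for row in rows_from_cells(parser.cells):
    print(row)
```

If the same parsing finds no <th> or <td> cells in the bytes returned by
requests.get(), that would suggest the table is not present in the
downloaded HTML at all, which would match what 'View Page Source' shows.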