[Tutor] Beautiful Soup

Crusier crusier at gmail.com
Tue Jan 19 02:53:07 EST 2016


Hi Python Tutors,

I am currently able to strip the page down to the string I want. However, I
am having trouble with the JSON inside the script tag, and I am not sure how
to turn it into a dictionary.

import json
import requests

from bs4 import BeautifulSoup


url = 'https://bochk.etnet.com.hk/content/bochkweb/eng/quote_transaction_daily_history.php?code=6881\
&time=F&timeFrom=090000&timeTo=160000&turnover=S&sessionId=44c99b61679e019666f0570db51ad932&volMin=0&turnoverMin=0'

def web_scraper(url):

    response = requests.get(url)
    html = response.content
    soup = BeautifulSoup(html, 'lxml')

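    # take the text of the fifth <script> tag and pick the fourth
    # whitespace-separated token from it
    stock1 = soup.find_all('script')[4].string
    stock2 = stock1.split()
    stock3 = stock2[3]
    # Is stock3 sufficient to parse as JSON, or does it need further cleaning?

    text = json.dumps(stock3)
    print(text)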


web_scraper(url)
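
For reference, here is a minimal sketch of what I am trying to end up with.
It assumes the script text holds a JavaScript assignment of the form
"var name = {...};". The sample string, the name quoteData and the helper
extract_json_object are made up for illustration and are not the real page
output. As far as I understand, json.dumps goes from a Python object to a
JSON string, while json.loads (or a JSONDecoder) goes from a JSON string to
a dictionary, so the second direction is probably the one I need:

```python
import json

# Hypothetical stand-in for the <script> text; the real page may differ.
script_text = (
    'var quoteData = {"code": "6881", '
    '"trades": [{"price": 5.1, "volume": 2000}]};'
)


def extract_json_object(text):
    """Parse the first {...} object found in text into a Python dict."""
    start = text.index('{')
    # raw_decode parses one JSON value and stops there, so trailing
    # JavaScript such as ';' does not cause an error.
    obj, _end = json.JSONDecoder().raw_decode(text[start:])
    return obj


data = extract_json_object(script_text)
print(type(data).__name__)
print(data['trades'][0]['price'])
```

Working on the whole script string rather than on one piece of split() would
also avoid cutting the JSON apart wherever it contains a space.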

If possible, please give me some pointers. Thank you.

Regards,
Henry

