[Tutor] Beautiful Soup

Tue Jan 19 04:00:52 EST 2016

Crusier wrote:

> Hi Python Tutors,
> 
> I am currently able to strip down to the string I want. However, I
> have problems with the JSON script and I am not sure how to slice it
> into a dictionary.
> 
> import urllib
> import json
> import requests
> 
> from bs4 import BeautifulSoup
> 
> 
> url = 'https://bochk.etnet.com.hk/content/bochkweb/eng/quote_transaction_daily_history.php?code=6881\
> &time=F&timeFrom=090000&timeTo=160000&turnover=S&sessionId=44c99b61679e019666f0570db51ad932&volMin=0&turnoverMin=0'
> 
> def web_scraper(url):
> 
>     response = requests.get(url)
>     html = response.content
>     soup = BeautifulSoup(html, 'lxml')
> 
>     stock1 = soup.findAll('script')[4].string
>     stock2 = stock1.split()
>     stock3 = stock2[3]
>     # is stock3 sufficient to process as JSON or need further cleaning??
> 
>     text =  json.dumps(stock3)
>     print(text)
> 
> 
> web_scraper(url)
> 
> If it is possible, please give me some pointers. Thank you

- You need json.loads(), not json.dumps(), to convert JSON text into a Python
  data structure. dumps() goes the other way: it serializes a Python object
  into a JSON string.
- It looks like you have to remove the trailing ";" from stock3 for loads() to
  succeed.
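
Here is a minimal sketch of both points. The sample string and the helper
name parse_js_object() are made up for illustration, since the real page
content isn't shown in this thread; I'm assuming stock3 holds one JSON object
followed by a ";".

```python
import json

# Hypothetical stand-in for stock3; the real data from the page will differ.
stock3 = '{"code":"6881","trades":[{"time":"09:30:01","price":5.12,"volume":2000}]};'


def parse_js_object(text):
    """Drop surrounding whitespace and any trailing ';', then parse as JSON."""
    cleaned = text.strip().rstrip(';')
    return json.loads(cleaned)


data = parse_js_object(stock3)

# data is now an ordinary dict, so normal indexing works.
print(data["code"])
print(data["trades"][0]["price"])
```

One caveat: stock1.split() splits on whitespace, so stock2[3] only holds the
complete JSON text if the JSON itself contains no spaces. If json.loads()
still raises an error after the ";" is removed, print stock3 and check that
it is the whole object.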