[Tutor] Beautiful Soup
Peter Otten
__peter__ at web.de
Tue Jan 19 04:00:52 EST 2016
Crusier wrote:
> Hi Python Tutors,
>
> I am currently able to strip down to the string I want. However, I
> have problems with the JSON script and I am not sure how to slice it
> into a dictionary.
>
> import urllib
> import json
> import requests
>
> from bs4 import BeautifulSoup
>
>
> url = 'https://bochk.etnet.com.hk/content/bochkweb/eng/quote_transaction_daily_history.php?code=6881\
> &time=F&timeFrom=090000&timeTo=160000&turnover=S&sessionId=44c99b61679e019666f0570db51ad932&volMin=0&turnoverMin=0'
>
> def web_scraper(url):
>
>     response = requests.get(url)
>     html = response.content
>     soup = BeautifulSoup(html, 'lxml')
>
>     stock1 = soup.findAll('script')[4].string
>     stock2 = stock1.split()
>     stock3 = stock2[3]
>     # is stock3 sufficient to process as JSON or need further cleaning??
>
>     text = json.dumps(stock3)
>     print(text)
>
>
> web_scraper(url)
>
> If it is possible, please give me some pointers. Thank you
- You need json.loads(), not json.dumps(), to convert JSON text into a Python
data structure. dumps() goes the other way, from a Python object to a JSON
string.
- It looks like you have to remove the trailing ";" from stock3 for loads() to
succeed.
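
Here is a minimal sketch of both points. The value assigned to stock3 is a
made-up stand-in (the actual script content from the page is not shown in this
thread), so the keys used below are assumptions for illustration only.

```python
import json

# Hypothetical stand-in for stock3; the real string comes from the page's
# <script> tag and will have different keys and values.
stock3 = '{"code": "6881", "trades": [{"time": "090000", "price": 5.1}]};'

# json.dumps() serializes a Python object to JSON text. Given a str, it
# returns another str (the original wrapped in quotes and escaped).
as_text = json.dumps(stock3)

# Remove surrounding whitespace and the trailing JavaScript ";" first,
# then parse the JSON text into Python objects with json.loads().
cleaned = stock3.strip().rstrip(";")
data = json.loads(cleaned)

print(type(as_text).__name__)
print(type(data).__name__)
print(data["trades"][0]["price"])
```

Without the cleaning step, json.loads(stock3) raises a ValueError because of
the extra ";" after the closing brace.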