Re: Extract the “Matrix form” dataset from BCS website.
Thomas Passin
list1 at tompassin.net
Thu Dec 22 12:34:05 EST 2022
On 12/22/2022 8:35 AM, hongy... at gmail.com wrote:
> I want to extract / scrape the “Matrix form” dataset from the BCS website [1], i.e., the data that appears in the 3rd column.
>
> I tried the following Python code snippet, but could not figure out how to get it to work:
Tell us what you observed and what you expected. For example, does the
data get downloaded? Do you get error messages, and if so, what are
they? Does the id variable contain anything at all? Etc.
> import requests
> from bs4 import BeautifulSoup
> import re
>
> proxies = {
> 'http': 'socks5h://127.0.0.1:18888',
> 'https': 'socks5h://127.0.0.1:18888'
> }
>
> requests.packages.urllib3.disable_warnings()
> r = requests.get('https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane', proxies=proxies, verify=False)
> soup = BeautifulSoup(r.content, features="lxml")
>
> table = soup.find('table')
> id = table.find_all('id')
>
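One thing worth checking in the snippet above: in BeautifulSoup, find_all('id') searches for tags named <id>, not for elements that carry an id attribute, so it will normally return an empty list. Below is a minimal sketch of how the third cell of each row could be pulled out instead. The SAMPLE_HTML string and the third_column helper are made up for illustration; I have not verified the actual markup of the BCS page, which may use nested tables or a different layout, so treat this as a starting point rather than a working scraper.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real BCS markup may differ.
SAMPLE_HTML = """
<table>
  <tr><th>No.</th><th>(x,y) form</th><th>Matrix form</th></tr>
  <tr><td>1</td><td>x,y</td><td id="m1">1 0 0 / 0 1 0</td></tr>
  <tr><td>2</td><td>-x,-y</td><td id="m2">-1 0 0 / 0 -1 0</td></tr>
</table>
"""


def third_column(html):
    """Return the text of the 3rd <td> of every row in the first table."""
    # "html.parser" is the stdlib backend, used here to avoid needing lxml.
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        # No table at all: the download or the parse probably went wrong.
        return []
    cells = []
    for row in table.find_all("tr"):
        # Only direct <td> children, so cells of nested tables are not counted.
        tds = row.find_all("td", recursive=False)
        if len(tds) >= 3:
            cells.append(tds[2].get_text(" ", strip=True))
    return cells


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
# Searches for <id> tags, which do not exist in this sample.
print(len(soup.find("table").find_all("id")))
# Searches for any tag that has an id attribute.
print(len(soup.find("table").find_all(id=True)))
print(third_column(SAMPLE_HTML))
```

With the real page, you would pass r.content to third_column in place of SAMPLE_HTML, and printing r.status_code and whether soup.find('table') is None first would answer the questions above.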
> My Python environment is as follows:
>
> werner at X10DAi:~$ pyenv shell datasci
> (datasci) werner at X10DAi:~$ python --version
> Python 3.11.1
>
> Any tips will be appreciated.
>
> [1] https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane
>
> Regards,
> Zhao