[Tutor] General question rgrd. usage of libraries

Palm Tree timeofsands at gmail.com
Fri May 5 23:12:27 EDT 2017


Hm, I also suggest you get more experience with Python.

Think of a project and learn while doing it. That way you'll stay motivated
while building something useful that you can reuse in the future.

As for bs4, I just googled what I needed. I also print an incrementing
counter variable next to each element, which makes web scraping output
easier to follow, like:

var = 0
...
print(var, element)
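
A minimal sketch of the same counter idea, assuming the scraped elements
are already in a list. The list contents and the helper name numbered() are
made up for illustration; enumerate() is the built-in that keeps the
counter for you.

```python
# Hypothetical stand-in for the elements a scraper would return.
elements = ["first item", "second item", "third item"]


def numbered(items, start=0):
    """Pair each item with a running counter, beginning at `start`."""
    return list(enumerate(items, start))


# Prints the counter next to each element, like print(var, element) above.
for var, element in numbered(elements):
    print(var, element)
```

Passing start=1 gives a counter that begins at 1 instead of 0.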

Also, I suggest you decode the bytes you download into Unicode text, as
you'll get strange hex escapes in your output if you don't:

.decode("utf-8")

I scrape websites written in French, so I always need Unicode.
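
Here is a small sketch of what that decoding does, using a hard-coded
bytes value instead of a real download (in Python 3, urlopen(...).read()
returns bytes like this). The helper name to_text() is only illustrative,
and UTF-8 is an assumption: a page can be served in another encoding.

```python
def to_text(raw, encoding="utf-8"):
    """Decode downloaded bytes into a str (assumes the page is UTF-8)."""
    return raw.decode(encoding)


# Bytes as they might arrive from a French page.
raw = "café".encode("utf-8")

print(raw)           # b'caf\xc3\xa9'  (the "hex stuff")
print(to_text(raw))  # café
```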

Besides that, .text is very helpful:

'p' gives you the element

but

'p.text' gives you the content
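
To show the difference, here is a short sketch that parses a made-up HTML
string rather than a live page. The markup is invented for the example; it
assumes bs4 is installed.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for a downloaded page.
html = "<div class='content'><p>Bonjour <b>tout le monde</b></p></div>"
soup = BeautifulSoup(html, "html.parser")

p = soup.find("p")

print(p)       # the element, tags included
print(p.text)  # only the text inside it, nested tags stripped
```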

To find suitable libraries, I suggest you first get good at doing the
desired task by hand as far as possible, so you know the job well. Then
identify the boring, impossible or tiring parts. Then google something like
"python module <taskname>" or just "python how to <taskname>" and see how
others did it or which module they used.

Hope it helps,

Abdur-Rahmaan Janhangeer,
Mauritius

On 6 May 2017 00:56, "Jim" <jf_byrnes at comcast.net> wrote:

> On 05/05/2017 08:45 AM, Rafael Knuth wrote:
>
>> Hi there,
>>
>> I just recently learned how to build a basic web scraper with Python
>> 3.5 (I am learning Python for data analytics purposes). Being new to
>> coding, I have a question:
>>
>> How do I know which libraries I need to perform a certain task?
>> For example, in case of this web scraper (which I built with help of a
>> tutorial on YouTube) I need to have urllib and Beautiful Soup
>>
>> import urllib
>> import urllib.request
>> from bs4 import BeautifulSoup
>>
>> theurl = "https://twitter.com/rafaelknuth"
>> thepage = urllib.request.urlopen(theurl)
>> soup = BeautifulSoup(thepage, "html.parser")
>>
>> print(soup.title.text)
>>
>> i = 1
>> for tweets in soup.findAll("div",{"class":"content"}):
>>     print(i)
>>     print(tweets.find("p").text)
>>     i = i + 1
>>
>> Is there a way I can figure out which libraries I need when drafting my
>> code?
>> Can you share your experiences? Right now, if I wanted for example to
>> populate a Google Sheet with my scraped web content - how would I know
>> which libraries I would need to actually make this happen? I am
>> wondering if there is a process to figure out what I exactly need
>> library-wise.
>>
>>
>>
> There is a Python API to google sheets but when I had a look, it seemed
> fairly complex. I haven't tried it yet but depending on what you need to do
> this library may be what you need:
>                   https://pypi.python.org/pypi/gspread.
>
> Regards,  Jim