[Tutor] (no subject)
Juan C.
juan0christian at gmail.com
Sun Dec 11 15:21:02 EST 2016
On Dec 10, 2016 12:15 PM, "Tetteh, Isaac - SDSU Student"
<isaac.tetteh at jacks.sdstate.edu> wrote:
> >
> > Hello,
> >
> > I am trying to find the number of times a word occurs on a webpage so I
> > used bs4 code below
> >
> > Let assume html contains the "html code"
> > soup = BeautifulSoup(html, "html.parser")
> > print(len(soup.find_all(string=["Engineering","engineering"])))
> > But the result is different from when i use control + f on my keyboard
> > to find
> >
> > Please help me understand why it's different results. Thanks
> > I am using Python 3.5
> >
Well, depending on the word you're looking for, it's quite possible that
your code finds matches inside JavaScript functions, HTML/JS comments and
so on, because you're searching the actual HTML source. A Ctrl+F in a web
browser only looks at the rendered, visible text and doesn't look into the
underlying code. For example, if we go to https://www.python.org/psf/ and
press Ctrl+F to search for "Upgrade to a different browser", we find zero
results; if we do the same in view-source, we find one result, because
that sentence is inside a commented-out line.
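
To make the difference concrete, here is a minimal sketch that counts a
word once in the raw source and once in roughly the text a browser would
show. It uses only the standard library's html.parser rather than bs4, so
it runs without installing anything. SAMPLE_HTML, VisibleTextParser,
count_word and count_visible are made-up names for illustration, and the
"visible text" here is only an approximation: it skips <script>, <style>
and comments, but it knows nothing about elements hidden by CSS. One more
thing worth checking in the original snippet: a plain string (or list of
strings) passed to find_all(string=...) is compared against each whole
text node, so it will not count "engineering" in the middle of a longer
sentence.

```python
import re
from html.parser import HTMLParser

# Hypothetical page: the word appears in a script, a style rule,
# a comment, a heading and a paragraph.
SAMPLE_HTML = """<html><head>
<script>var label = "engineering";</script>
<style>.engineering { color: red; }</style>
</head><body>
<!-- Engineering banner hidden -->
<h1>Engineering</h1>
<p>The engineering school offers Engineering degrees.</p>
</body></html>"""


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Comments never arrive here; they go to handle_comment,
        # which is left at its default (do nothing).
        if not self._skip_depth:
            self.chunks.append(data)


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word in text."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def count_visible(html, word):
    """Count occurrences of word in the approximately visible text."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return count_word(" ".join(parser.chunks), word)


raw_count = count_word(SAMPLE_HTML, "engineering")
visible_count = count_visible(SAMPLE_HTML, "engineering")
print("raw source:", raw_count)
print("visible text:", visible_count)
```

The raw count comes out higher than the visible count because the script,
the style rule and the comment each contain the word, which is the same
effect as comparing view-source with a normal Ctrl+F.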