[Tutor] beautiful soup raw text workarounds?

nathan Smith nathan-tech at hotmail.com
Tue Aug 24 16:06:42 EDT 2021


Hi List,


I'm using Beautiful Soup to parse a website, which is all going well.

I'm having trouble, though, getting it to include the raw text, that
is to say, text that is not inside any tag.

I've done some Googling on this, and it seems Beautiful Soup does not
support text outside of tags? Fair enough!

I was wondering how I could work around this issue?

For instance, is there something like tag.endpos and next_tag.startpos,
so I could do raw_text = text[endpos:startpos]?


I've included the web page below for reference so you can see what I
mean. The part I am stuck on is the text below the h2.


Nathan


Website:

<html>

<head>

<title>This is my website</title>

</head>

<body>

<h1>Headings</h1>

<p>Paragraphs and such.</p>

<h2>Another heading.</h2>

This text here doesn't <br/>

want to show in bs.

</body>

</html>
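
A possible workaround, offered as a minimal sketch rather than a
definitive answer: Beautiful Soup does keep text that sits directly
between tags, exposing it as NavigableString nodes alongside the Tag
nodes, so it can be collected by walking the siblings of the h2. The
sketch assumes the bs4 package with the built-in "html.parser" backend;
the names html, h2 and loose_text are illustrative only, and the markup
is a condensed copy of the page above.

```python
from bs4 import BeautifulSoup, NavigableString

# Condensed copy of the page from the post (blank lines removed).
html = """<html>
<head>
<title>This is my website</title>
</head>
<body>
<h1>Headings</h1>
<p>Paragraphs and such.</p>
<h2>Another heading.</h2>
This text here doesn't <br/>
want to show in bs.
</body>
</html>"""

soup = BeautifulSoup(html, "html.parser")
h2 = soup.find("h2")

# Walk every node that follows the <h2> inside <body>. Text that is not
# wrapped in its own tag shows up as a NavigableString; tags such as
# <br/> show up as Tag objects and are skipped here.
loose_text = []
for node in h2.next_siblings:
    if isinstance(node, NavigableString):
        stripped = node.strip()
        if stripped:  # ignore whitespace-only nodes
            loose_text.append(stripped)

for fragment in loose_text:
    print(fragment)
```

With the page above, this should collect the two text fragments on
either side of the <br/>. If all the text in the body is wanted rather
than just the untagged part, soup.body.get_text() is another option.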


