[Tutor] HTML --> TXT?

William Park parkw@better.net
Wed, 29 Mar 2000 15:30:42 -0500


On Wed, Mar 29, 2000 at 10:42:09AM -0800, Deirdre Saoirse wrote:
> On Wed, 29 Mar 2000, William Park wrote:
> 
> > On Wed, Mar 29, 2000 at 09:09:08AM -0600, Curtis Larsen wrote:
> > > Is there a fairly simple Python-ish way to convert an HTML file to text?
> 
> > If you have Linux, then 'lynx -dump ...' will do it.
> 
> There are n ways to skin the cat, but this does not answer the question
> asked.
> 
> If someone asks for a way to do it in *python*, given that this is a tutor
> list, please try and offer a solution in python. Very often, a person is
> trying to grasp concepts of how to do things in python. Furthermore, they
> may be trying to do it in platforms that don't have the other tools called
> for.
> 
> Since someone else has responded to the issue of lynx and availability on
> more platforms than Linux, I won't.

Dear Deirdre,

I usually find that searching for a variation on an existing solution is
a good way to learn, especially for a beginner.  The original message
wasn't clear about how the HTML tags should be removed -- deleting
everything between '<' and '>', formatting into ASCII text, or what.
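
For what it's worth, the two readings above could be sketched in Python
roughly as follows.  This is only a minimal sketch, assuming a Python 3
standard library with the 'html.parser' module; the names
strip_tags_naive, TextExtractor and html_to_text are made up for
illustration, and neither function tries to reproduce the page layout
the way a text browser does.

```python
import re
from html.parser import HTMLParser


def strip_tags_naive(html):
    """Reading 1: delete everything between '<' and '>'.

    Crude: entities such as &amp; are left alone, and a '>' inside an
    attribute value or a comment will confuse it.
    """
    return re.sub(r"<[^>]*>", "", html)


class TextExtractor(HTMLParser):
    """Reading 2: let a real parser collect the text nodes."""

    def __init__(self):
        super().__init__()
        self.parts = []   # text fragments in document order
        self._skip = 0    # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Character references are already decoded here, because
        # convert_charrefs defaults to True.
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = "<p>Hello &amp; <b>world</b></p>"
    print(strip_tags_naive(sample))
    print(html_to_text(sample))
```

The regular expression is the quickest thing to try; the parser version
is a little longer but decodes entities and skips script and style
content.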

'lynx' and 'w3m' are both text-based browsers.  So, I offered my advice.
I shall try to get your approval next time I answer questions posted to
this mailing list.

	Your friend,
	William