How do I convert escaped HTML into a string?
Just Another Victim of the Ambient Morality
ihatespam at hotmail.com
Sat Nov 24 00:42:06 EST 2007
I've done a Google search on this but, amazingly, I'm the first guy to
ever need this! Everyone else seems to need the reverse of this. Actually,
I did find some people who complained about this and rolled their own
solution but I refuse to believe that Python doesn't have a built-in
solution to what must be a very common problem.
So, how do I convert HTML to plaintext? Something like this:
<div>This is a string.</div>
...into:
This is a string.
Actually, the ideal would be a function that takes an HTML string and
converts it into the text that the HTML would render as. For instance,
converting:
<div>This &amp; that
or the other thing.</div>
...into:
This & that or the other thing.
...since HTML seems to convert any amount and type of whitespace into a
single space (a bizarre design choice if I've ever seen one).
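For what it's worth, here is a minimal sketch of the kind of function described above, assuming Python 3 and only the standard library's html.parser module. The names TextExtractor and html_to_text are made up for illustration, not part of any library. It drops every tag, decodes entities such as &amp;, and collapses runs of whitespace into a single space. It makes no attempt to skip script or style contents or to treat block elements as line breaks.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes and ignores all tags."""

    def __init__(self):
        # convert_charrefs=True decodes entities like &amp; before
        # the text is passed to handle_data.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Return the text content of an HTML fragment, whitespace-collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # split() with no arguments breaks on any run of whitespace,
    # so joining with " " mimics how HTML collapses it.
    return " ".join("".join(parser.chunks).split())


print(html_to_text("<div>This &amp; that\nor the other thing.</div>"))
```

If only the entity decoding is needed and there are no tags to strip, html.unescape from the standard library's html module covers that part on its own.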
Surely, Python can already do this, right?
Thank you...