How do you htmlentities in Python
mccredie at gmail.com
Mon Jun 4 19:17:27 CEST 2007
On Jun 4, 6:31 am, "js " <ebgs... at gmail.com> wrote:
> Hi list.
> If I'm not mistaken, in python, there's no standard library to convert
> html entities, like &amp; or &gt; into their applicable characters.
> htmlentitydefs provides maps that help with this conversion,
> but it's not a function, so you have to write your own function
> to make use of htmlentitydefs, probably using regex or something.
> To me this seemed odd because python is known as
> a 'Batteries Included' language.
> So my questions are
> 1. Why doesn't python have/need entity encoding/decoding?
> 2. Is there any idiom to do entity encode/decode in python?
> Thank you in advance.
I think this is the standard idiom:
>>> import xml.sax.saxutils as saxutils
>>> saxutils.unescape("A bunch of text with entities: &amp; &gt; &lt;")
'A bunch of text with entities: & > <'
Notice there is an optional parameter (a dict) that can be used to
define additional entities as well.
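
As a rough sketch of how that optional dict might be used: out of the box,
unescape() only handles &amp;, &lt; and &gt;, so the named-entity table from
the standard library can be passed in as the extra mapping. This assumes
Python 3, where htmlentitydefs is called html.entities; NAMED_ENTITIES and
decode_entities are names made up for this example, not part of any library.

```python
from html.entities import name2codepoint  # Python 3 name for htmlentitydefs
from xml.sax.saxutils import unescape

# Hypothetical helper table: "&name;" -> character, built from the stdlib map.
# "amp" is left out so that unescape() still decodes "&amp;" last and text
# such as "&amp;eacute;" is not decoded twice.
NAMED_ENTITIES = {
    "&%s;" % name: chr(codepoint)
    for name, codepoint in name2codepoint.items()
    if name != "amp"
}


def decode_entities(text):
    """Decode &amp;, &lt;, &gt; plus the named entities in NAMED_ENTITIES."""
    return unescape(text, NAMED_ENTITIES)


print(decode_entities("caf&eacute; &amp; cr&egrave;me &copy; 2007"))

# The dict can also be small and written by hand:
print(unescape("say &quot;hi&quot;", {"&quot;": '"'}))
```

Numeric references such as &#233; are not covered by this approach, since
the dict is matched by plain string replacement.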