How do you htmlentities in Python

Matimus mccredie at gmail.com
Mon Jun 4 19:17:27 CEST 2007


On Jun 4, 6:31 am, "js " <ebgs... at gmail.com> wrote:
> Hi list.
>
> If I'm not mistaken, in Python there's no standard library function
> to convert HTML entities, such as &amp; or &gt;, into the characters
> they represent.
>
> htmlentitydefs provides maps that help with this conversion,
> but it's not a function, so you have to write your own function
> to make use of htmlentitydefs, probably using regex or something.
>
> To me this seemed odd because Python is known as a
> 'Batteries Included' language.
>
> So my questions are
> 1. Why doesn't python have/need entity encoding/decoding?
> 2. Is there any idiom to do entity encode/decode in python?
>
> Thank you in advance.
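
For comparison, the hand-rolled approach you describe (the entity maps
plus a regex) might be sketched roughly as below. This is only an
illustration under some assumptions: the names decode_entities and
_ENTITY_RE are made up for this example, and it is written for Python 3,
where the module is called html.entities (in Python 2 it is
htmlentitydefs, and unichr takes the place of chr).

```python
import re
from html.entities import name2codepoint  # htmlentitydefs in Python 2

# Matches &name;, &#123; and &#x7B; style references.
_ENTITY_RE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|\w+);")


def decode_entities(text):
    """Replace HTML entity references in text with their characters.

    Hypothetical helper for illustration; unknown or invalid
    references are left untouched.
    """
    def replace(match):
        body = match.group(1)
        try:
            if body.startswith(("#x", "#X")):
                return chr(int(body[2:], 16))   # hexadecimal reference
            if body.startswith("#"):
                return chr(int(body[1:]))       # decimal reference
        except ValueError:
            return match.group(0)               # code point out of range
        if body in name2codepoint:
            return chr(name2codepoint[body])    # named entity
        return match.group(0)                   # unknown name, keep as-is

    # A single pass, so "&amp;gt;" becomes "&gt;" and is not decoded twice.
    return _ENTITY_RE.sub(replace, text)
```

In Python 3.4 and later, html.unescape in the standard library covers
this case, so a helper like this is mostly of interest on older versions.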

I think this is the standard idiom:

>>> import xml.sax.saxutils as saxutils
>>> saxutils.escape("&")
'&amp;'
>>> saxutils.unescape("&gt;")
'>'
>>> saxutils.unescape("A bunch of text with entities: &amp; &gt; &lt;")
'A bunch of text with entities: & > <'

By default only &amp;, &lt; and &gt; are handled. Both functions also
take an optional second argument, a dict mapping strings to their
replacements, that can be used to handle additional entities.
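
As a rough sketch of how that dict argument could be used (assuming
Python 3, where the entity tables live in html.entities rather than
htmlentitydefs; the variable names here are only illustrative):

```python
import xml.sax.saxutils as saxutils
from html.entities import name2codepoint  # htmlentitydefs in Python 2

# A hand-written mapping: keys are the full entity strings,
# values are the text to substitute for them.
extra = {"&quot;": '"', "&apos;": "'"}
quoted = saxutils.unescape("&quot;Tom &amp; Jerry&quot;", extra)

# A mapping built from the named HTML entities. "amp" is left out
# because unescape() already handles &amp; itself, as its last step;
# including it here could decode text such as "&amp;eacute;" twice.
html_entities = {
    "&%s;" % name: chr(codepoint)
    for name, codepoint in name2codepoint.items()
    if name != "amp"
}
text = saxutils.unescape("caf&eacute; &copy; 2007", html_entities)
```

Here quoted comes out as '"Tom & Jerry"' and text as 'café © 2007'.
Numeric references such as &#233; are not covered by this approach.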

Matt



