How do you htmlentities in Python

Cameron Laird claird at lairds.us
Tue Jun 5 20:36:26 CEST 2007


In article <1180977447.745432.109040 at q19g2000prn.googlegroups.com>,
Matimus  <mccredie at gmail.com> wrote:
>On Jun 4, 6:31 am, "js " <ebgs... at gmail.com> wrote:
>> Hi list.
>>
>> If I'm not mistaken, in python, there's no standard library to convert
>> html entities, like &amp; or &gt; into their applicable characters.
>>
>> htmlentitydefs provides maps that help with this conversion,
>> but it's not a function, so you have to write your own function
>> to make use of htmlentitydefs, probably using regex or something.
>>
>> To me this seemed odd because python is known as
>> a 'Batteries Included' language.
>>
>> So my questions are
>> 1. Why doesn't python have/need entity encoding/decoding?
>> 2. Is there any idiom to do entity encode/decode in python?
>>
>> Thank you in advance.
>
>I think this is the standard idiom:
>
>>>> import xml.sax.saxutils as saxutils
>>>> saxutils.escape("&")
>'&amp;'
>>>> saxutils.unescape("&gt;")
>'>'
>>>> saxutils.unescape("A bunch of text with entities: &amp; &gt; &lt;")
>'A bunch of text with entities: & > <'
>
>Notice there is an optional parameter (a dict) that can be used to
>define additional entities as well.
			.
			.
			.
Good points; I like your mention of the optional entity dictionary.
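
For anyone who hasn't used that optional dictionary, here is a minimal
sketch of how I understand it to work.  The sample string and the two
extra entities are my own illustration, not anything from the posts
above, and the sketch assumes Python 3 string literals.

```python
from xml.sax.saxutils import unescape

sample = "caf&eacute; &amp; bar"

# With no dictionary, only &amp;, &lt; and &gt; are converted;
# &eacute; passes through untouched.
default_result = unescape(sample)

# The optional dict maps each full entity string (including the
# leading "&" and trailing ";") to its replacement text.
# These two entries are illustrative assumptions.
extra_entities = {"&eacute;": "\u00e9", "&nbsp;": "\u00a0"}
extended_result = unescape(sample, extra_entities)

print(default_result)
print(extended_result)
```

The dictionary only covers what you put in it, so this is practical
for a handful of known entities rather than the full HTML set.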

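It's possible, though, that your solution addresses a different problem
than the original poster had in mind:  by default, saxutils handles only
the XML entities &amp;, &lt;, and &gt;, not HTML's named entities such as
&eacute;.  <URL: http://wiki.python.org/moin/EscapingHtml > has details
about HTML entities vs. XML entities.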
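
To make the HTML-vs-XML distinction concrete, here is a rough sketch,
not a definitive recipe.  It assumes Python 3.4 or later, where the
htmlentitydefs module has been renamed html.entities and the standard
library provides html.unescape.  The helper decode_named_entities and
the sample text are hypothetical names of my own; the helper follows
the "write your own function with a regex" route described in the
original question.

```python
import re
from html import unescape as html_unescape
from html.entities import name2codepoint
from xml.sax.saxutils import unescape as xml_unescape

text = "caf&eacute; &copy; 2007 &lt;b&gt;"

# XML unescaping leaves HTML-only named entities alone.
xml_result = xml_unescape(text)

# html.unescape knows the HTML named entities.
html_result = html_unescape(text)


def decode_named_entities(s):
    """Hypothetical helper: replace named entities via name2codepoint.

    Numeric references such as &#233; are deliberately not handled.
    """
    def replace(match):
        name = match.group(1)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        # Unknown names are left exactly as they were.
        return match.group(0)

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, s)


manual_result = decode_named_entities(text)

print(xml_result)
print(html_result)
print(manual_result)
```

On this input the hand-written helper and html.unescape should agree,
while the saxutils call converts only the &lt; and &gt; parts.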


