Looking for a lib fun
Michael P. Reilly
arcege at shore.net
Tue Jun 1 12:54:41 EDT 1999
Boris Borcic <zorro at zipzap.ch> wrote:
: There is a function I can't seem to find in the
: standard library. One that will turn XML/HTML
: offending characters to corresponding entity- or
: character references.
: Is it indeed in the lib ? Where ? If not, could a kind
: soul hand me one ? Writing it really feels like
: reinventing the wheel for the 10^nth time.
: TIA
Python 1.5 has the htmlentitydefs module, which you can use:
import htmlentitydefs
charmap = {}
for key, val in htmlentitydefs.entitydefs.items():
    charmap[val] = key
At this point, charmap will map each character to its entity name:
charmap['>'] == 'gt'
charmap[chr(160)] == 'nbsp'
etc.
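To turn such a character-to-name table into the escaping function asked about, something like the following might do. It is only a sketch and makes two assumptions: it targets Python 3, where the same table is exposed as html.entities.entitydefs, and the name escape_entities is made up for illustration, not a library function.

```python
# Sketch for Python 3: html.entities is the modern home of htmlentitydefs.
from html.entities import entitydefs

# Invert the name -> character table into character -> name.
charmap = {char: name for name, char in entitydefs.items()}

def escape_entities(text):
    """Replace every character that has a named entity with &name;.

    Hypothetical helper; characters without a named entity pass through.
    """
    return "".join(
        "&%s;" % charmap[ch] if ch in charmap else ch
        for ch in text
    )
```

Calling escape_entities("a < b & c") gives "a &lt; b &amp; c". Because the text is processed one character at a time, the "&" of an entity that was just produced is never escaped a second time.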
I don't know if anyone has a more "complete" set tho.
-Arcege