Html entities

Fredrik Lundh fredrik at
Wed Mar 21 16:29:36 CET 2001

Syver Enstad wrote:
> Is there an easy way to convert ISO Latin-1 characters that are above 127
> ascii to their HTML, XML entity form?

something like this might work:

# from (the eff-bot guide to) the standard python library

import htmlentitydefs
import re, string

# this pattern matches substrings of reserved and non-ASCII characters
pattern = re.compile(r"[&<>\"\x80-\xff]+")

# create character map
entity_map = {}

for i in range(256):
    entity_map[chr(i)] = "&#%d;" % i

for entity, char in htmlentitydefs.entitydefs.items():
    if entity_map.has_key(char):
        entity_map[char] = "&%s;" % entity

def escape_entity(m, get=entity_map.get):
    return string.join(map(get, m.group()), "")

def escape(text):
    return pattern.sub(escape_entity, text)

print escape("<spam&eggs>")
print escape("å i åa ä e ö")

## prints:
## &lt;spam&amp;eggs&gt;
## &aring; i &aring;a &auml; e &ouml;
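
on Python 3 the htmlentitydefs module is called html.entities, and has_key and string.join are gone. a rough equivalent might look like the sketch below. it assumes the codepoint2name table from html.entities is an acceptable stand-in for the hand-built entity_map, and the escape_char helper is a name made up for this example:

```python
import re
from html.entities import codepoint2name

# same idea as above: match runs of reserved and Latin-1 high characters
pattern = re.compile(r'[&<>"\x80-\xff]+')

def escape_char(ch):
    # hypothetical helper: prefer a named entity, else fall back to
    # a numeric character reference
    name = codepoint2name.get(ord(ch))
    if name is not None:
        return "&%s;" % name
    return "&#%d;" % ord(ch)

def escape(text):
    # replace every character of each matched run with its entity form
    return pattern.sub(lambda m: "".join(map(escape_char, m.group())), text)

print(escape("<spam&eggs>"))
print(escape("å i åa ä e ö"))
```

characters in the \x80-\x9f range have no named entity in that table, so they come out as numeric references such as &#128;.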

Cheers /F

<!-- (the eff-bot guide to) the standard python library:
