Translation table to map Latin-1 to ASCII?

John Machin sjmachin at lexicon.net
Sun Jan 26 02:18:12 EST 2003


Rene Pijlman <reageer.in at de.nieuwsgroep> wrote in message news:<j8563v4r381meidikj14v1tjsobcfvijra at 4ax.com>...
> Can anyone point me to a translation table for string.translate
> to map Latin-1 (ISO 8859-1) to ASCII such that \"e maps to e
> etc.?

The translation table depends on what you want to use it for. You are
squeezing 256 different characters into 128, so you must either lose
information or emit two or more characters for some input characters
(which a string.translate table cannot do, since it maps each character
to exactly one character). If you are conflating similar characters so
that you can do some sort of approximate matching, you might as well
upshift at the same time, i.e. e-diaeresis would map to E. On the other
hand, if you
want the results to appear mildly presentable in the land of origin,
you would need to find out the conventions that exist in various
countries for representing accented characters in ASCII (e.g. in
Germany (so I understand) v-diaeresis becomes "ve" for some vowels v,
and the sharp-s (Eszett) becomes "ss") and hope that different
countries don't have a different mapping for the same character.
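
To make the two options concrete, here is a rough sketch in present-day
Python 3 (str.translate and unicodedata, not the string.translate
function the question mentions). The function names and the contents of
the German table are my own illustrative assumptions, not an established
standard, and the table covers only the characters discussed above.

```python
import unicodedata

def fold_for_matching(text):
    """Lossy fold for approximate matching: strip accents, then upshift."""
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII after upshifting is replaced by "?".
    return stripped.upper().encode("ascii", "replace").decode("ascii")

# Hypothetical table following the German convention described above:
# vowel-diaeresis becomes vowel + "e", sharp-s becomes "ss".
GERMAN_TABLE = str.maketrans({
    "\xe4": "ae", "\xf6": "oe", "\xfc": "ue",   # a, o, u with diaeresis
    "\xc4": "Ae", "\xd6": "Oe", "\xdc": "Ue",   # upper-case forms
    "\xdf": "ss",                               # sharp-s (Eszett)
})

def german_to_ascii(text):
    """Presentable ASCII using one country's convention (German only)."""
    return text.translate(GERMAN_TABLE)
```

The first function loses information (e-diaeresis and plain e both end
up as E); the second keeps the distinction but emits two characters for
one, and would be wrong for another country that treats the same
character differently.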

What do you want to do with all the non-alphabetic characters?

I trust this isn't a Python-specific question, i.e. if someone gave you
a translation table to be used in C or some other language, you'd be
able to Pythonise it. In which case, instead of posting in
comp.lang.python, you might want to ask in an
internationalisation-related newsgroup, once you've refined your
question.



