utf - string translation

Frederic Rentsch anthra.norell at vtxmail.ch
Wed Nov 29 06:05:36 EST 2006


Dan wrote:
> On 22 nov, 22:59, "John Machin" <sjmac... at lexicon.net> wrote:
>
>   
>>> processes (Vigenère)
>>>       
>> So why do you want to strip off accents? The history of communication
>> has several examples of significant difference in meaning caused by
>> minute differences in punctuation or accents including one of which you
>> may have heard: a will that could be read (in part) as either "a chacun
>> d'eux million francs" or "a chacun deux million francs" with the
>> remainder to a 3rd party.
>>
>>     
> Of course.
> My purpose is not to do something realistic from a cryptographic point
> of view. It's for learning the rudiments of programming.
> In fact, character encoding is a kind of cryptography, I mean, sometimes,
> when friends can't read an email because of the characters used...
>
> I wanted to strip off accents because I use the frequencies of the
> characters. If I only have 26 characters, it's easier to analyse (the
> text can be shorter, for example).
>
>   
Try this:

import string

# Latin-1 (ISO-8859-1) accented characters and their unaccented replacements.
from_characters   = '\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xff\xe7\xe8\xe9\xea\xeb'
to_characters     = 'AAAAAAACEEEEIIIIDNOOOOOOUUUUYaaaaaaaiiiionoooooouuuuyyceeee'
translation_table = string.maketrans (from_characters, to_characters)
# original_string must be a Latin-1 byte string; convert UTF-8 input to Latin-1 first.
translated_string = string.translate (original_string, translation_table)
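
For anyone on Python 3, where the string module no longer has maketrans and text is Unicode rather than Latin-1 bytes, the same idea might be sketched roughly as below. This is only an illustration under some assumptions: the names ACCENT_GROUPS, TRANSLATION_TABLE, strip_accents and letter_frequencies are made up for the example, the input is assumed to be an already decoded str, and the mapping simply mirrors the table above (including "ð" to "o" and "Ð" to "D"). The frequency helper is there only because counting letters was the stated goal.

```python
from collections import Counter

# Hypothetical grouping: each plain letter maps to the accented
# characters it should replace (same pairs as the Latin-1 table).
ACCENT_GROUPS = {
    'A': 'ÀÁÂÃÄÅÆ', 'C': 'Ç', 'E': 'ÈÉÊË', 'I': 'ÌÍÎÏ',
    'D': 'Ð', 'N': 'Ñ', 'O': 'ÒÓÔÕÖØ', 'U': 'ÙÚÛÜ', 'Y': 'Ý',
    'a': 'àáâãäåæ', 'c': 'ç', 'e': 'èéêë', 'i': 'ìíîï',
    'o': 'ðòóôõöø', 'n': 'ñ', 'u': 'ùúûü', 'y': 'ýÿ',
}

# str.translate accepts a dict that maps code points to replacement strings.
TRANSLATION_TABLE = {
    ord(accented): plain
    for plain, accented_chars in ACCENT_GROUPS.items()
    for accented in accented_chars
}


def strip_accents(text):
    """Return text with the accented characters above replaced by plain letters."""
    return text.translate(TRANSLATION_TABLE)


def letter_frequencies(text):
    """Count the 26 letters A-Z after stripping accents and upper-casing."""
    letters = [ch for ch in strip_accents(text).upper() if 'A' <= ch <= 'Z']
    return Counter(letters)
```

Calling strip_accents on the message text and then letter_frequencies on the result gives per-letter counts over a 26-letter alphabet; characters outside the table are passed through unchanged by strip_accents and ignored by the counter.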


Frederic




