How to decode UTF strings?
Eli the Bearded
* at eli.users.panix.com
Sun Oct 27 01:36:39 EDT 2019
In comp.lang.python, DFS <nospam at dfs.com> wrote:
> On 10/25/2019 10:57 PM, MRAB wrote:
>> Here's a simple example, based on your code:
>>
>> from email.header import decode_header
>>
>> def test(header, default_encoding='utf-8'):
>>     parts = []
>>
>>     for data, encoding in decode_header(header):
>>         if isinstance(data, str):
>>             parts.append(data)
>>         else:
>>             parts.append(data.decode(encoding or default_encoding))
>>
>>     print(''.join(parts))
>>
>> test('=?iso-8859-9?b?T/B1eg==?= <oguz.ismail.uysal at gmail.com>')
>> test('=?utf-8?Q?=EB=AF=B8?= <taeyeon10006 at gmail.com>')
>> test('=?GBK?B?0Pu66A==?= <xuan.alan at 163.com>')
>> test('=?UTF-8?B?zp3Or866zr/PgiDOks6tz4HOs86/z4I=?= <vergos.nikolas at gmail.com>')
> I don't think it's working:
It's close. The ''.join just needs to be ' '.join, so the decoded name
and the address don't run together.
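If you'd rather not pick the separator by hand, here is a minimal sketch
that lets the standard library reassemble the parts. It assumes Python 3,
where email.header.make_header() builds a Header object from the output of
decode_header(), and str() on that object should take care of the spacing
between encoded and plain chunks. The helper name decode_to_text is made up
for this example.

```python
from email.header import decode_header, make_header

def decode_to_text(header):
    """Decode an RFC 2047 encoded header value into one str."""
    # decode_header() yields (data, charset) pairs; make_header()
    # turns them back into a Header, and str() renders it as text.
    return str(make_header(decode_header(header)))

decoded = decode_to_text(
    '=?iso-8859-9?b?T/B1eg==?= <oguz.ismail.uysal at gmail.com>')

# ascii() keeps the output readable whatever the terminal encoding is.
print(ascii(decoded))
```

Whether you need a separator at all with a hand-rolled join seems to
depend on whether your Python's decode_header() keeps the whitespace in
front of the unencoded part, so check the pairs it returns before settling
on '' or ' '.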
> $ python decode_utf.py
> O≡uz<oguz.ismail.uysal at gmail.com>
> 미<taeyeon10006 at gmail.com>
> ╨√║Φ<xuan.alan at 163.com>
> Νίκος Βέργος<vergos.nikolas at gmail.com>
Is your terminal UTF-8? I think not.
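A quick way to check is sketched below. It prints the encoding Python
thinks stdout uses, then takes the raw bytes from the first and third test
headers and decodes them the way a code page 437 console would. CP437 is
only my guess at what that terminal is using, but the garbage it produces
matches the output quoted above.

```python
import base64
import sys

# What Python believes the terminal wants.
print(sys.stdout.encoding)

# Raw payload bytes from the iso-8859-9 and GBK test headers.
raw_latin5 = base64.b64decode('T/B1eg==')
raw_gbk = base64.b64decode('0Pu66A==')

# Assumption: the terminal shows undecoded bytes as CP437.
as_cp437_latin5 = raw_latin5.decode('cp437')
as_cp437_gbk = raw_gbk.decode('cp437')

# What the first header was meant to say.
as_intended = raw_latin5.decode('iso-8859-9')

# ascii() so this prints safely on any terminal.
print(ascii(as_cp437_latin5), ascii(as_cp437_gbk), ascii(as_intended))
```

If the CP437 strings look like what you got, the bytes are reaching the
terminal without being decoded from the charset named in the header.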
Elijah
------
answered with C code to do this in comp.lang.c