Re: [Mailman-i18n] [Mailman-Users] Mailman and UTF8
Mark Sapiro wrote:
Jan Kohnert wrote:
So I found out that I have to encode the German mailman.po file in UTF-8 and then rebuild the *.mo file from it. Now it works, so I can provide this version (it is too large to attach to a message on this list). (Mailman 2.1.9_rc1).
I18n issues like the above are better discussed on the mailman-i18n list http://mail.python.org/mailman/listinfo/mailman-i18n.
Agreed, so I'm cross-posting this one for reference on the I18N list. Follow-ups on that list, please.
But there is one (small) thing left: if you look at [1], you will notice one incorrectly displayed character. The ---next part--- separator, in German ---nächster Teil---, does not work in all cases ([1] does not work, [2] does), although all my editors say the umlaut is correctly declared...
It looks like in [1] the utf-8 encoded message somehow got interpreted as some other character set (maybe iso-8859-1) and was then encoded again as utf-8, so that instead of the a with umlaut, you see the bytes of the utf-8 encoding of a with umlaut displayed as characters.
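The double-encoding effect described above can be reproduced in a few lines. This is only an illustrative sketch in Python 3 (Mailman 2.1 itself runs on Python 2), and it assumes the intermediate misreading really was iso-8859-1, which is a guess in this thread.

```python
# Illustration only: reproduce the suspected double encoding.
original = "nächster Teil"

# Step 1: the text is correctly encoded as utf-8.
utf8_bytes = original.encode("utf-8")

# Step 2: something wrongly interprets those bytes as iso-8859-1.
# Each byte of the two-byte utf-8 sequence for "ä" becomes its own character.
misread = utf8_bytes.decode("iso-8859-1")

# Step 3: the misread text is encoded as utf-8 a second time.
double_encoded = misread.encode("utf-8")

# A utf-8 aware reader now shows the misread characters, not the umlaut.
displayed = double_encoded.decode("utf-8")
print(displayed)

# Reversing steps 3 and 2 recovers the original text.
repaired = displayed.encode("iso-8859-1").decode("utf-8")
print(repaired)
```

The round trip at the end only works because iso-8859-1 maps every byte to a character, so no information is lost in the misreading.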
This may be a scrubber issue of some kind, but I am not sure why it would occur with only one of two apparently structurally identical messages from the same poster, but here is a clue.
I looked at the text file https://secure.the-pojs.dyndns.org/pipermail/pojs-discussion/2006-November.t.... While there are no Content-Type: headers in that file, I can see the encoding of the Subject: header. It appears that the 'bad' posts are 'original' posts and are iso-8859-1 encoded by the poster's (your) MUA, and the 'good' posts are 'replies' and are utf-8 encoded by the MUA.
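For anyone who wants to check the Subject: encoding the same way, here is a small sketch using the standard library's email.header.decode_header in Python 3. The helper name subject_charsets and the sample subjects are made up for illustration; they are not taken from the archive in question.

```python
from email.header import decode_header


def subject_charsets(raw_subject):
    """Return the charsets declared in a header's RFC 2047 encoded-words.

    subject_charsets is a hypothetical helper, not part of Mailman.
    Unencoded (plain ASCII) parts carry no charset and are skipped.
    """
    return [charset for _part, charset in decode_header(raw_subject) if charset]


# Hypothetical sample subjects, both meaning "Grüße".
original_post = "=?iso-8859-1?q?Gr=FC=DFe?="
reply_post = "=?utf-8?q?Gr=C3=BC=C3=9Fe?="

print(subject_charsets(original_post))
print(subject_charsets(reply_post))
```

A subject with no encoded-words yields an empty list, so this only helps when the MUA actually had to encode non-ASCII characters.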
Thus it appears that there may be a scrubber issue when the character set of the incoming message is iso-8859-1 but the i18n translated canned messages are utf-8.
What mailman version is this?
Leaving your comment completely in here for reference; as said above, this is v2.1.9.
Regards
Jan
Jan Kohnert wrote:
Mark Sapiro wrote:
Thus it appears that there may be a scrubber issue when the character set of the incoming message is iso-8859-1 but the i18n translated canned messages are utf-8.
What mailman version is this?
Leaving your comment completely in here for reference; as said above, this is v2.1.9.
Sorry, somehow I overlooked your mention of the version in the OP.
Anyway, this is definitely a scrubber issue. I see why it occurs, but
I'm not yet sure how to fix it. The problem is that when the character set
of the translation returned by
_('-------------- next part --------------\n')
is not compatible with the character set of the message being scrubbed,
the translation can be garbled.
I think we should be using the character set of the list's preferred
language rather than the character set of the message in this case,
but the process is complicated and I'm not sure how to do it.
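To make the mismatch concrete, here is a rough Python 3 sketch. It is not the contents of scrubber.patch.txt and not how Scrubber.py is written; naive_join and join_in_list_charset are hypothetical names, and the German separator text and the utf-8 list charset are assumptions taken from the case in this thread.

```python
# Assumed translated separator for a German list (illustration only).
SEPARATOR = "-------------- nächster Teil --------------\n"


def naive_join(body_bytes, separator_bytes, message_charset):
    """Append the separator bytes as-is and label everything with the
    message's charset. This garbles the separator when the two differ."""
    return (body_bytes + separator_bytes).decode(message_charset)


def join_in_list_charset(body_bytes, message_charset, separator, list_charset):
    """Decode the body to text first, then encode body and separator
    once, together, in the charset of the list's preferred language."""
    body_text = body_bytes.decode(message_charset, "replace")
    return (body_text + separator).encode(list_charset, "replace")


# An iso-8859-1 message body meets a utf-8 encoded translation.
body = "Grüße\n".encode("iso-8859-1")

garbled = naive_join(body, SEPARATOR.encode("utf-8"), "iso-8859-1")
print(garbled)

fixed = join_in_list_charset(body, "iso-8859-1", SEPARATOR, "utf-8")
print(fixed.decode("utf-8"))
```

The "replace" error handler is a design choice for the sketch: it trades an exception for a substituted character when the body contains something the target charset cannot represent.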
If you want, you can try the attached scrubber.patch.txt (apply to
Mailman/Handlers/Scrubber.py).
--
Mark Sapiro
Participants (2): Jan Kohnert, Mark Sapiro