[ mailman-Bugs-1658920 ] charset proble,

SourceForge.net noreply at sourceforge.net
Tue Apr 3 18:43:52 CEST 2007


Bugs item #1658920, was opened at 2007-02-13 05:06
Message generated for change (Comment added) made by msapiro
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=100103&aid=1658920&group_id=103

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Web/CGI
Group: 2.1 (stable)
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: dudeua (dudeua)
Assigned to: Mark Sapiro (msapiro)
Summary: charset proble,

Initial Comment:
Hello, I need help.
I have Mailman 2.1.9. If I receive mail in a charset other than
koi8, Mailman's admin area shows the mail in the "quoted-printable"
charset.
But after I approve the email via the web interface, it is shown in
the correct charset.
How can I fix the charset?

----------------------------------------------------------------------

>Comment By: Mark Sapiro (msapiro)
Date: 2007-04-03 09:43

Message:
Logged In: YES 
user_id=1123998
Originator: NO

>The mailman admin page is shown in the KOI-8 charset.
>Message bodies with these headers
>Content-Type: text/plain; charset=windows-1251
>Content-Transfer-Encoding: 8bit
>
>look ugly, in the wrong charset.
>When I change the page charset from koi-8 to windows cp1251, the mailman
>admin page is ugly, but the message is OK.

This is exactly what I would expect to happen in all cases. I suggested
this would occur when I wrote on 2007-02-13 10:53:

>It would be possible to patch
>Mailman/Cgi/admindb.py with the attached patch to decode this "message
>excerpt" before displaying it, but then you would have the issue that the
>characters in the box in your screenshot would be Windows CP1251 characters
>which would probably still be garbled when displayed in the character set
>of the rest of the page.

The only question is why doesn't this occur when the message is

Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: quoted-printable

since all the patch does is convert the quoted-printable encoding back to
the original 8bit.

At some point in the future, we expect that Mailman will represent
everything internally in Unicode. At that time, this issue may be fixed.
Until then I don't expect this behavior to change.
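
As a rough illustration of what going through Unicode would allow (a
sketch in Python 3, not Mailman code; the helper name transcode and the
sample text are assumptions made for this example), the body could be
decoded from its declared charset and re-encoded in the charset of the page:

```python
def transcode(body: bytes, source_charset: str, page_charset: str) -> bytes:
    """Hypothetical helper: re-encode a body into the page's charset."""
    # Decode to Unicode first, then encode for the page. errors="replace"
    # substitutes a placeholder for anything the codecs cannot map.
    text = body.decode(source_charset, "replace")
    return text.encode(page_charset, "replace")


# Sample body: a Russian word as windows-1251 bytes (illustrative only).
cp1251_body = "Привет".encode("cp1251")
koi8_body = transcode(cp1251_body, "windows-1251", "koi8-r")
```

The bytes in koi8_body differ from cp1251_body but represent the same
characters, so they would display correctly inside a KOI8-R page.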

The only advice I can offer is that in cases where it is important to see
the message body correctly in order to decide what action to take on a
particular held message, you either switch character sets in your browser
when you need to see the message body, or you refer to the original message
attached to the 'held message notice' email sent to the moderators which
should display properly in your email client.


----------------------------------------------------------------------

Comment By: dudeua (dudeua)
Date: 2007-04-03 02:07

Message:
Logged In: YES 
user_id=1718234
Originator: YES

The mailman admin page is shown in the KOI-8 charset.
Message bodies with these headers
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: 8bit

look ugly, in the wrong charset.
When I change the page charset from koi-8 to windows cp1251, the mailman
admin page is ugly, but the message is OK.


Please note that other message bodies with the headers
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: quoted-printable
are shown OK on the mailman admin page with charset koi-8.

Thanks



----------------------------------------------------------------------

Comment By: Mark Sapiro (msapiro)
Date: 2007-04-02 08:56

Message:
Logged In: YES 
user_id=1123998
Originator: NO

I don't know why the result is different in these two cases. In the case
of quoted-printable encoding, all non-us-ascii-printable bytes are encoded
as '=xx' where xx is the hex value of the corresponding byte. The
'decode=True' argument added by the patch causes these '=xx' codes to be
converted back to the original bytes for display in the admindb interface.

In the case of 8bit encoding, all bytes are represented as themselves and
the 'decode=True' argument does nothing. The result should be the same.
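
A minimal sketch of that equivalence, using the standard email package
from Python 3 (Mailman 2.1 itself runs on Python 2, and this is not the
code from the patch; the sample text and the helper name build_message
are assumptions made for this example):

```python
import quopri
from email import message_from_bytes

# Sample body: a Russian word as windows-1251 bytes (illustrative only).
RAW_BODY = "Привет".encode("cp1251")


def build_message(cte: str, body: bytes) -> bytes:
    """Hypothetical helper: a tiny raw message with the given transfer encoding."""
    headers = (
        b"Content-Type: text/plain; charset=windows-1251\n"
        b"Content-Transfer-Encoding: " + cte.encode("ascii") + b"\n"
        b"\n"
    )
    return headers + body


# Same body, once transfer-encoded as quoted-printable and once sent as 8bit.
qp_msg = message_from_bytes(
    build_message("quoted-printable", quopri.encodestring(RAW_BODY))
)
bit8_msg = message_from_bytes(build_message("8bit", RAW_BODY))

# decode=True undoes the Content-Transfer-Encoding only; it does not
# convert between character sets.
qp_decoded = qp_msg.get_payload(decode=True)
bit8_decoded = bit8_msg.get_payload(decode=True)
```

In this sketch both qp_decoded and bit8_decoded come back as the same
windows-1251 bytes, which is why the two cases would be expected to
display identically.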

----------------------------------------------------------------------

Comment By: dudeua (dudeua)
Date: 2007-04-02 06:30

Message:
Logged In: YES 
user_id=1718234
Originator: YES

Hi msapiro,
Thanks for your patch! Everything works fine, but today I ran into a new
problem.
When a message comes with these headers:
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: quoted-printable
all is fine.

But when a message has these headers:
Content-Type: text/plain; charset=windows-1251
Content-Transfer-Encoding: 8bit

its body is ugly.

Please advise me.

Thanks!

----------------------------------------------------------------------

Comment By: Mark Sapiro (msapiro)
Date: 2007-02-13 10:53

Message:
Logged In: YES 
user_id=1123998
Originator: NO

The screenshot is unnecessary. I understood exactly what the issue was
from your original description.

I know you only see the problem in the "message excerpt" in the admindb
message detail. I tried to explain that the reason for this is that this
excerpt is not decoded. It would be possible to patch
Mailman/Cgi/admindb.py with the attached patch to decode this "message
excerpt" before displaying it, but then you would have the issue that the
characters in the box in your screenshot would be Windows CP1251 characters
which would probably still be garbled when displayed in the character set
of the rest of the page.
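
The garbling can be reproduced outside Mailman with a short Python 3
sketch (the sample text is an assumption made for this example): the same
bytes are interpreted once with the page's charset and once with the
message's own charset.

```python
# Sample text (illustrative only), stored as windows-1251 bytes the way
# the decoded message excerpt would be.
text = "Привет"
cp1251_bytes = text.encode("cp1251")

# What a browser does when the page is declared as KOI8-R:
shown_as_koi8 = cp1251_bytes.decode("koi8_r")

# What it does once the user switches the page to windows-1251:
shown_as_cp1251 = cp1251_bytes.decode("cp1251")
```

Here shown_as_koi8 is a string of valid but wrong Cyrillic letters, while
shown_as_cp1251 matches the original text. Switching the browser charset
has the reverse effect on the rest of the page, which is in KOI8-R.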

Also, my suggestion to set admin_immed_notify to 1 was not intended to
'correct' this display. It causes this message to be also sent to you in an
email so that if it is necessary for you to see the message text in order
to decide what action to take with the message, you can see the text in the
email notice.
File Added: admindb.patch.txt

----------------------------------------------------------------------

Comment By: dudeua (dudeua)
Date: 2007-02-13 08:55

Message:
Logged In: YES 
user_id=1718234
Originator: YES

Set admin_immed_notify at /mailman/bun/config? It is already set to 1.
Please see the attached screenshot.
I can't use an MUA (mail agent), because I see the encoding problem only
in the admin area.

Thanks for help.


File Added: ggg.jpg

----------------------------------------------------------------------

Comment By: Mark Sapiro (msapiro)
Date: 2007-02-13 08:36

Message:
Logged In: YES 
user_id=1123998
Originator: NO

Quoted-printable is not a character set. It is an encoding. That is, it is
a way of representing data which contains characters other than printable
us-ascii using only the us-ascii character set as required by RFC2822 for
email messages.
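
To illustrate the distinction, here is a small Python 3 sketch using the
standard quopri module (the sample text is an assumption made for this
example). The same word is stored in two different character sets, and
each byte string is then quoted-printable encoded:

```python
import quopri

# One word, two character sets: the underlying bytes differ.
cp1251_bytes = "Привет".encode("cp1251")
koi8_bytes = "Привет".encode("koi8_r")

# Quoted-printable works on bytes and knows nothing about charsets.
# Each non-ascii byte becomes '=XX', where XX is its hex value.
qp_cp1251 = quopri.encodestring(cp1251_bytes)
qp_koi8 = quopri.encodestring(koi8_bytes)

# Decoding restores the original bytes, still in their original charset.
restored = quopri.decodestring(qp_cp1251)
```

Both encoded results contain only us-ascii bytes, and decoding returns
the windows-1251 bytes unchanged. The character set still has to be taken
from the Content-Type header.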

The issue here is that encoded message bodies are not decoded for display
in the admindb message detail. This may change in the future, but then
there _will_ be character set and content-type issues. These are more
complicated as in general, different message parts may have different
character sets and may not even be text.

Currently, if you really need to see the decoded text, you need to set
admin_immed_notify to Yes so you get an email notice of the held message
and use an MUA (mail client) that shows you the decoded message in the
notice.

----------------------------------------------------------------------
