[Mailman-Users] ISO-8859-1/Latin1 vs UTF-8

Bernd Petrovitsch bernd at firmix.at
Mon Oct 31 23:45:09 CET 2005


On Mon, 2005-10-24 at 14:05 -0700, Mark Sapiro wrote:
> Bernd Petrovitsch wrote:
> >I actually reported a bug (though it may not sound so): I enter
> >(apparently) UTF-8 text (with Firefox, if that is important) and it comes
> >back disguised as (and as part of) ISO-8859-1 text.
> >The question is: Which part is doing something wrong and how to fix it?
> 
> What happens here is that Mailman creates the web page with the META
> tag in the header
> 
> <META http-equiv="Content-Type" content="text/html; charset=xxxx">
> 
> where xxxx is the encoding of the language of the list (default
> iso-8859-1 for German), but the web server sends its own http
> Content-Type: header specifying charset=utf-8. For reasons I don't
> understand, the HTML standard says the server provided Content-Type:
> charset takes priority over that specified by an HTML META tag.

I don't understand it either but it is so. BTW I usually disable the
feature in the webserver config.
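
To make the precedence rule concrete, here is a minimal Python sketch of
how a client picks the charset when both the HTTP header and the META tag
are present. The function names and the regular expression are made up
for illustration; real browsers do considerably more (BOM detection,
sniffing), so treat this only as a model of the rule quoted above.

```python
import re
from email.message import Message

# Hypothetical, simplified pattern for a META http-equiv charset declaration.
META_CHARSET = re.compile(
    r'<meta[^>]+content=["\'][^"\']*charset=([\w-]+)', re.IGNORECASE)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def effective_charset(http_content_type, html):
    """HTTP header charset wins; the META tag is only a fallback."""
    from_header = header_charset(http_content_type) if http_content_type else None
    if from_header:
        return from_header
    match = META_CHARSET.search(html)
    return match.group(1).lower() if match else None


page = '<META http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
print(effective_charset("text/html; charset=utf-8", page))  # header wins
print(effective_charset("text/html", page))                 # META is used
```

With a server-supplied charset the META value is ignored, which is the
mismatch described above; without one, the META value takes effect.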

> Thus your browser sets its encoding as utf-8, but mailman thinks what
> it gets back is iso-8859-1 and thus garbles the multibyte unicode
> sequences.
> 
> It can be fixed by setting the 'German' character set to utf-8 and
> recoding the German language templates, messages and list archives in
> utf-8 as discussed in the archive threads I mentioned previously.
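
The garbling and the recoding described above can be reproduced in a few
lines of Python. This is only a sketch: the function names are invented
here, and it says nothing about how Mailman itself handles the data
internally.

```python
def garble(text):
    """Simulate UTF-8 bytes being misread as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def repair(garbled):
    """Undo the misreading, assuming nothing else touched the text."""
    return garbled.encode("iso-8859-1").decode("utf-8")


def recode(raw, source="iso-8859-1", target="utf-8"):
    """Recode raw bytes (e.g. the contents of a template file)."""
    return raw.decode(source).encode(target)


print(garble("für"))            # each umlaut turns into two Latin-1 characters
print(repair(garble("für")))    # round-trips back to the original
print(recode(b"f\xfcr"))        # Latin-1 bytes recoded as UTF-8 bytes
```

The first call shows why a single "ü" comes back as two characters: its
UTF-8 form is two bytes, and ISO-8859-1 maps every byte to one character.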

Done. I now have a German and an English template, both specifying UTF-8
as charset *and* containing UTF-8 text (especially the German one).
But the crazy thing is that the English page is - according to "Page
Info" in Firefox and on the shell with `wget --post-data="language=de"
-S https://lists.funkfeuer.at/mailman/listinfo/user` - delivered as
"UTF-8" and the German one as "ISO-8859-1", as you (and everybody else)
can see on https://lists.funkfeuer.at/mailman/listinfo/user.
The German summary on both pages has been entered through the web
interface of the list administrator.
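
For anyone who wants to repeat the check without wget, a rough Python
equivalent might look like the following. It only reads the charset from
the HTTP Content-Type header (not from the META tag), the helper names
are invented, and "en" as the form value for the English page is an
assumption.

```python
import urllib.parse
import urllib.request

LISTINFO_URL = "https://lists.funkfeuer.at/mailman/listinfo/user"


def build_request(url, language):
    """Build a POST request carrying language=<code>, like wget --post-data."""
    data = urllib.parse.urlencode({"language": language}).encode("ascii")
    return urllib.request.Request(url, data=data)


def served_charset(url, language):
    """Return the charset from the response's Content-Type header, or None."""
    with urllib.request.urlopen(build_request(url, language)) as response:
        return response.headers.get_content_charset()


# Usage (performs network access, so it is left commented out):
# for lang in ("de", "en"):
#     print(lang, served_charset(LISTINFO_URL, lang))
```

A result of None would mean the server sent no charset at all, in which
case the META tag in the page decides.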

> Alternatively, it can be addressed in the web server by configuring it
> so it doesn't specify these documents as utf-8.

IMHO this is already the case:
----  snip  ----
711#grep AddDef /etc/apache2/apache2.conf
AddDefaultCharset       off
----  snip  ----

	Bernd
-- 
Firmix Software GmbH                   http://www.firmix.at/
mobil: +43 664 4416156                 fax: +43 1 7890849-55
          Embedded Linux Development and Services





