[Mailman-i18n] HTML entities (é) in es, it, no translations

Ben Gertzfield che@debian.org
Thu, 31 Jan 2002 20:31:59 +0900


>>>>> "Martin" == Martin von Loewis <loewis@informatik.hu-berlin.de> writes:

    Martin> - for this to work, Mailman needs to properly declare the
    Martin> encoding of each generated HTML page, and the declaration
    Martin> needs to match the actual content. For Latin-1, this is
    Martin> not strictly necessary, since that is the default encoding
    Martin> of HTML, anyway, but there may be plans to move to XHTML
    Martin> some day, at which time even this assumption breaks.

Actually, to be precise, HTML 4.01's document character set is Unicode
(ISO 10646), of which Latin-1 happens to be a (very small) subset.
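
To illustrate the point about the declaration matching the content,
here is a minimal sketch in modern Python 3 syntax. The helper name
render_page is hypothetical and is not a Mailman function; the only
idea it shows is that the same charset value is used both for the
declaration and for the actual encoding step.

```python
def render_page(body, charset):
    """Return (headers, payload) for an HTML page.

    Hypothetical helper: the single `charset` argument drives both
    the declared encoding and the encoding actually applied, so the
    two cannot drift apart.
    """
    html = (
        '<html><head>'
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset=%s">'
        '</head><body>%s</body></html>' % (charset, body)
    )
    headers = {'Content-Type': 'text/html; charset=%s' % charset}
    # Encode with exactly the charset that was declared above.
    return headers, html.encode(charset)


headers, payload = render_page('Se\u00f1or', 'iso-8859-1')
```

If the body contains a character the charset cannot represent,
html.encode() raises UnicodeEncodeError, which at least makes the
mismatch visible instead of silently sending a mislabeled page.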

    Martin> - Problems will arise if Mailman inserts strings from
    Martin> various sources into the same template, especially if
    Martin> these use different encodings.  If that can ever happen,
    Martin> you need to recode all strings to the same encoding. If
    Martin> that fails (e.g. because the encoding is unknown, or
    Martin> because the string cannot be represented in the encoding),

Right now, I don't think Mailman does that anywhere.  If it does,
I think the best thing to do is to convert to Unicode.
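
A rough sketch of what "convert to Unicode" could look like, in modern
Python 3 syntax. The names to_unicode and fill_template are made up
for illustration, and the Latin-1 fallback for unknown or broken
charsets is purely my assumption, not something Mailman is known to
do.

```python
def to_unicode(raw, declared_charset, fallback='latin-1'):
    """Decode bytes to a Unicode string.

    Assumption: when the declared charset is unknown or the bytes do
    not decode cleanly, fall back to Latin-1, which accepts any byte
    sequence.
    """
    try:
        return raw.decode(declared_charset)
    except (LookupError, UnicodeDecodeError):
        return raw.decode(fallback, 'replace')


def fill_template(template, pieces):
    """Interpolate pieces (name -> (raw bytes, charset)) as Unicode."""
    decoded = {name: to_unicode(raw, charset)
               for name, (raw, charset) in pieces.items()}
    return template % decoded


# Two strings from different sources, in different encodings.
page = fill_template('%(subject)s / %(sender)s', {
    'subject': ('\u65e5\u672c\u8a9e'.encode('euc-jp'), 'euc-jp'),
    'sender': ('Jos\u00e9'.encode('iso-8859-1'), 'iso-8859-1'),
})
```

Once everything is Unicode, the page is encoded exactly once on the
way out, in whatever single encoding the page declares.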

Unfortunately, as much as I'd like to, we can't make *everything*
Unicode, because a lot of older browsers still don't support it.

    Martin>   This document is encoded in ISO-8859-9 (for Turkish);
    Martin> but it still contains French accents. Using entities is
    Martin> the only choice here, short of using UTF-8 for the entire
    Martin> page.

Yes.  This kind of issue will come up only in two places in Mailman:

1) on the admin request page (for bounce handling, etc.)

2) in the archives (a pipermail issue)
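
For those two places, the entity approach Martin describes can be
sketched with the standard 'xmlcharrefreplace' error handler (modern
Python 3 syntax). The helper name encode_for_page is hypothetical;
the sample string is just an illustration.

```python
def encode_for_page(text, charset):
    """Encode text for a page in `charset`.

    Characters the charset cannot represent are emitted as numeric
    character references (&#NNNN;) instead of raising an error.
    """
    return text.encode(charset, 'xmlcharrefreplace')


# U+011F (g with breve) exists in ISO-8859-9; U+65E5 does not.
result = encode_for_page('\u011f \u65e5', 'iso-8859-9')
```

Here the Turkish letter stays a single ISO-8859-9 byte, while the
character outside the charset becomes &#26085;, which a browser
resolves against Unicode regardless of the page encoding.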

    Martin> Unfortunately, not all encodings in Mailman are supported
    Martin> (the East Asian ones are missing). In general, I'd
    Martin> encourage usage of Unicode throughout in Mailman, even if
    Martin> this means that additional codecs must be bundled with the
    Martin> distribution.

Which East Asian ones are missing?  Mailman CVS works beautifully
for me with Japanese, and the screenshot I sent earlier today shows
Chinese (both simplified and traditional) working in email.

Barry and I have talked a lot about bundling codecs with Mailman,
and he's agreed with me that we need to do it.  The Japanese codec
is in a good state and will be easy enough to ship; the Chinese
ones are only available in CVS as far as I know, so we will need to
make a proper distribution.
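
As a quick way to see which codecs a given Python installation is
actually missing, something along these lines should work (modern
Python 3 syntax; missing_codecs is a hypothetical helper, and the
charset names passed in are only examples):

```python
import codecs


def missing_codecs(charsets):
    """Return the charset names with no codec registered."""
    missing = []
    for name in charsets:
        try:
            codecs.lookup(name)
        except LookupError:
            # No codec is installed under this name.
            missing.append(name)
    return missing


report = missing_codecs(['iso-8859-1', 'utf-8', 'x-no-such-codec'])
```

Any name that comes back in the list is one we would have to bundle
a codec for, or map to an alias the installation already knows.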

Ben

-- 
Brought to you by the letters T and N and the number 12.
"Hoosh is a kind of soup."
Debian GNU/Linux maintainer of Gimp and Nethack -- http://www.debian.org/