[Mailman-i18n] Pipermail and non-English lists

James Henstridge james@daa.com.au
Thu Nov 21 15:36:12 2002


Barry A. Warsaw wrote:

>>>>>> "JH" == James Henstridge <james@daa.com.au> writes:
>
>    JH> It looks like the default Apache 2.0 config includes this
>    JH> directive (and so do a number of distributors' apache-1.3
>    JH> packages), so we decided to switch back to outputting
>    JH> ISO-8859-1, and encoding other codepoints as character
>    JH> references.
>
>Ouch. :(
>
Well, our API docs are mainly 7-bit ASCII with the occasional code point 
out of that range, so it wasn't that bad.  It would be much worse for 
the archives of a Japanese mailing list, for instance, so it might not 
be the solution you want ...
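
To make the trade-off concrete, here is a rough Python sketch of that kind 
of fallback.  It is only an illustration, not the code from our doc 
generator: the function name to_latin1_html is made up, and it assumes a 
Python that has the "xmlcharrefreplace" error handler.

```python
def to_latin1_html(text):
    """Encode text as ISO-8859-1 bytes.

    Characters outside ISO-8859-1 are replaced with numeric character
    references (&#NNNN;) instead of raising an error.
    """
    return text.encode("iso-8859-1", "xmlcharrefreplace")


# "cafe" with an accented e, followed by three Japanese characters.
sample = "caf\u00e9 \u65e5\u672c\u8a9e"
encoded = to_latin1_html(sample)
print(encoded)  # b'caf\xe9 &#26085;&#26412;&#35486;'
```

The accented character stays a single byte, but each Japanese character 
becomes an eight-byte reference, which is why this would be painful for a 
mostly non-Latin archive.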

>
>    JH> The comments in the config file say that they add the charset
>    JH> to the content-type to work around security bugs in some web
>    JH> browsers.
>
>Yeah, but the online docs make no mention of this.  What specifically
>are the security vulnerabilities?
>
There is a fairly lengthy comment above the directive in the default 
config file:

# Specify a default charset for all pages sent out. This is
# always a good idea and opens the door for future internationalisation
# of your web site, should you ever want it. Specifying it as
# a default does little harm; as the standard dictates that a page
# is in iso-8859-1 (latin1) unless specified otherwise i.e. you
# are merely stating the obvious. There are also some security
# reasons in browsers, related to javascript and URL parsing
# which encourage you to always set a default char set.

Of course, the claim that setting a default does little harm is wrong if 
you were expecting the charset to be detected from the document content. 
There is some discussion of the problem as it relates to Apache here:
    http://httpd.apache.org/info/css-security/apache_specific.html

The idea is to prevent external data from setting the charset to 
something weird in order to sneak malicious code past content checks (it 
gives the example of UTF-7 encoded data, which might pass through many 
filters).  I don't think that it is an issue here, as the untrusted 
information (message body) can't affect the charset on the archive pages.
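
A small Python sketch of the UTF-7 trick, for anyone who hasn't seen it.  
The filter looks_safe is a deliberately naive, made-up example of a 
content check, and the payload is just a harmless alert; none of this is 
taken from Mailman or Apache.

```python
def looks_safe(data):
    """Naive content check: reject any bytes containing an angle bracket."""
    return b"<" not in data and b">" not in data


# In UTF-7, "+ADw-" and "+AD4-" are the encoded forms of "<" and ">".
smuggled = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# The raw bytes contain no angle brackets, so the filter lets them through.
print(looks_safe(smuggled))  # True

# Read as ISO-8859-1 the bytes are inert text ...
as_latin1 = smuggled.decode("iso-8859-1")
# ... but a client that decides the page is UTF-7 sees markup.
as_utf7 = smuggled.decode("utf-7")
print(as_utf7)  # <script>alert(1)</script>
```

Fixing the charset in the response header removes the second 
interpretation, which is the point of the directive.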

Since the AddDefaultCharset directive can be used in a <Directory> 
section of the Apache config file, we could include it in the sample 
httpd.conf fragment for setting up the archives.  Something like this:

    Alias /pipermail/ /home/mailman/archives/public/
    <Directory "/home/mailman/archives/public">
        Options Indexes FollowSymLinks
        AllowOverride None
        Order allow,deny
        Allow from all
        AddDefaultCharset Off
    </Directory>

Doing that and making a note of it in the upgrade notes should be enough.
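
One way to check that the setting has taken effect is to look at the 
Content-Type header the server sends for an archive page: with the default 
charset switched off there should normally be no charset parameter, unless 
some other part of the server config adds one.  A minimal sketch of that 
check follows; the helper name charset_from_content_type and the header 
strings are made up for illustration.

```python
from email.message import Message


def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() lower-cases the result and returns None
    # when the header carries no charset parameter.
    return msg.get_content_charset()


# Header as sent when a default charset is being added.
print(charset_from_content_type("text/html; charset=ISO-8859-1"))  # iso-8859-1

# Header as expected once the default charset is switched off.
print(charset_from_content_type("text/html"))  # None
```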

James.

-- 
Email: james@daa.com.au              | Linux.conf.au   http://linux.conf.au/
WWW:   http://www.daa.com.au/~james/ | Jan 22-25   Perth, Western Australia. 






