Working on sending MIME emails through Mailman, I noticed that some of the translations are inconsistent in how they use HTML entity escapes.
This becomes a problem when sending email. An example from the Spanish translation:
#: Mailman/Cgi/create.py:221 bin/newlist:204
msgid "Your new mailing list: %(listname)s"
msgstr "Su nuebva lista de distribuci&oacute;n: %(listname)s"
This is a real problem, because this string is sent literally -- with the string "&oacute;" -- as the subject of the new email message.
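To make the difference concrete, here is a rough sketch using the standard email package from a current Python 3. It is not Mailman's own code, and the list name "test" and the iso-8859-1 charset are only assumptions for the example. It shows that the entity is plain ASCII text as far as a mail header is concerned, whereas the real character can be carried as an RFC 2047 encoded word:

```python
import html
from email.header import Header, decode_header, make_header

# Subject as it comes out of the catalog today (list name "test" is made up).
escaped_subject = "Su nuebva lista de distribuci&oacute;n: test"

# A mail client does not interpret HTML, so it shows the subject verbatim.
# Only an HTML renderer would turn the entity into a character:
intended_subject = html.unescape(escaped_subject)

# With the real character in the catalog, the header can be encoded
# per RFC 2047 (the charset here is an assumption for the example).
encoded = Header(intended_subject, "iso-8859-1").encode()

# A mail client decodes the encoded word back to the intended text.
decoded = str(make_header(decode_header(encoded)))

print(escaped_subject)
print(encoded)
print(decoded)
```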
I looked in the HTML 4.01 standard and found that HTML entities are actually only intended to be used when the document's character set does not support that particular character.
http://www.w3.org/TR/html401/charset.html has more information on this.
Since Mailman's CGI interface (in almost all cases) sends the correct charset in the Content-Type header, I think it's not necessary to use HTML entity escapes in the gettext catalog files. In fact, when we do use escapes, it makes text emails generated by Mailman illegible.
Does anyone have any comments? I would like to go through the catalogs and change the HTML escapes back into the original characters, so that emails Mailman generates are correct again. The CGI interface will still work as before.
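As a rough idea of what that conversion could look like, here is a minimal sketch in current Python 3. The function name unescape_entities and the KEEP set are my own inventions for illustration, not anything in Mailman. It replaces named entities with their characters but leaves alone the ones that still carry meaning in the HTML pages or in the .po syntax:

```python
import re
from html.entities import name2codepoint

# Entities left untouched (an assumption about what should stay escaped):
# lt/gt/amp are HTML markup, nbsp is invisible as a raw character,
# and a raw '"' would break the quoting of msgstr lines.
KEEP = {"lt", "gt", "amp", "quot", "nbsp"}

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def unescape_entities(text):
    """Replace named HTML entities with characters, except those in KEEP."""

    def replace(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            # Keep protected and unknown entities exactly as written.
            return match.group(0)
        return chr(name2codepoint[name])

    return ENTITY_RE.sub(replace, text)


line = 'msgstr "Su nuebva lista de distribuci&oacute;n: %(listname)s"'
print(unescape_entities(line))
```

The converted catalog would then have to be saved in the charset declared in its own Content-Type header, and that charset has to be able to represent every character produced.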
Here is a first guess at which translations include HTML escapes besides &lt;, &gt;, and &nbsp;:
[ben@nausicaa:~/src/mailman/mailman/messages]% egrep '&[^;]+;' **/*.po | egrep -v '&nbsp;|&lt;|&gt;' | cut -d : -f 1 | uniq
es/LC_MESSAGES/mailman.po
it/LC_MESSAGES/mailman.po
no/LC_MESSAGES/mailman.po
So the changes would apply only to the Spanish, Italian, and Norwegian translations. The other translations already use their native character sets directly.
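The egrep pipeline above drops a whole line as soon as it contains one of the excluded entities, so it can miss a line that has both an excluded entity and an accented-character entity. A slightly stricter check, sketched here in current Python 3 with made-up names (catalogs_with_entities, IGNORED), looks at each entity on its own:

```python
import re
from pathlib import Path

ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

# Entities that are expected in the catalogs and should not count as hits.
IGNORED = {"nbsp", "lt", "gt"}


def catalogs_with_entities(root):
    """Return the .po files under root that use entities outside IGNORED."""
    hits = []
    for path in sorted(Path(root).rglob("*.po")):
        # latin-1 decodes any byte sequence, and entity names are ASCII,
        # so the catalog's real charset does not matter for this scan.
        text = path.read_text(encoding="latin-1")
        names = set(ENTITY_RE.findall(text))
        if names - IGNORED:
            hits.append(path.relative_to(root).as_posix())
    return hits


if __name__ == "__main__":
    for name in catalogs_with_entities("."):
        print(name)
```

Run from the messages directory, this should print one catalog path per line, in the same form as the egrep output.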