[Mailman-Users] Dealing with multiple charsets (list messages and web archive)
Stefan Förster
cite at incertum.net
Sun May 11 20:38:11 CEST 2008
Hello Mark,
first of all, thank you very much for your help. This looks very
promising indeed.
* Mark Sapiro <mark at msapiro.net> wrote:
> What you want is more like the attached flatten.py.txt file (.txt added
> for content filtering). Note that this is far from production quality
> and probably doesn't even work on some messages.
I will run a full set of tests then; I would have done so anyway.
Thanks for the warning, though.
> Problems I am aware of are things like
>
> - no i18n for canned text strings
Hm, I think I can handle that. After all, you already showed me how to
do this ;-)
> - signatures will get broken
What kind of signatures do you mean?
> - with multipart/alternative, the text/plain part will be aggregated
> with the other text/plain parts and the text/html or other
> alternatives will be separately attached.
If this handler is called after MimeDel or Scrubber, there should be
no more text/html parts left in the message. I'm not sure about that
yet, though: I need to do more reading, and I don't know yet where to
add flatten.py.
> - text/plain parts without a specified charset will not be aggregated
> but will be separately attached. This is a difficult issue because
> many mainstream MUAs will attach an arbitrary .txt attachment without
> specifying a charset. If you then assume it is say iso-8859-1 and
> convert it to unicode and in fact it was euc-jp or koi8-r or even
> utf-8, you can garble it irreversibly.
If a .txt file without a declared encoding is attached, it is always a
matter of luck whether the receiver will be able to read the file. I'd
say "gzip it". Really.
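To make the rule about undeclared charsets concrete, here is a minimal sketch using only the standard-library email package on a current Python. It is not Mark's flatten.py; the function name split_text_parts and the SAMPLE message are made up for illustration. Parts that declare a charset are decoded and joined, while parts without one are left untouched instead of being guessed at.

```python
import email

# Hypothetical sample: two text/plain parts with a declared charset,
# one .txt attachment without any charset parameter.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="BOUND"\r\n'
    b"\r\n"
    b"--BOUND\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"Gr\xfc\xdfe\r\n"
    b"--BOUND\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"F\xc3\xb6rster\r\n"
    b"--BOUND\r\n"
    b"Content-Type: text/plain\r\n"
    b'Content-Disposition: attachment; filename="notes.txt"\r\n'
    b"\r\n"
    b"unknown bytes \xa4\xb3\r\n"
    b"--BOUND--\r\n"
)


def split_text_parts(msg):
    """Return (joined_text, undeclared_parts) for a parsed message."""
    texts = []
    undeclared = []
    for part in msg.walk():
        if part.is_multipart() or part.get_content_type() != "text/plain":
            continue
        charset = part.get_content_charset()  # None if no charset was given
        if charset is None:
            # Do not guess: keep the part so it can be attached separately.
            undeclared.append(part)
            continue
        raw = part.get_payload(decode=True)  # bytes after undoing the CTE
        try:
            texts.append(raw.decode(charset))
        except (LookupError, UnicodeDecodeError):
            # Unknown or wrong charset label: treat like an undeclared part.
            undeclared.append(part)
    return "\n".join(texts), undeclared


text, leftover = split_text_parts(email.message_from_bytes(SAMPLE))
```

With this sample, the two declared parts end up in text as Unicode, and notes.txt stays in leftover with its bytes unchanged.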
> flatten.py is written so that it could be installed as is in Mailman as
> a custom Handler.
I will try this out tomorrow.
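For reference, a rough sketch of the shape such a handler module takes, as far as I understand the Mailman 2.1 convention: a module that exposes process(mlist, msg, msgdata). The body below is only a placeholder, the msgdata key is hypothetical, and the stand-in arguments replace a real list object so the snippet runs without Mailman.

```python
import email


def process(mlist, msg, msgdata):
    """Handler entry point, following the process(mlist, msg, msgdata) shape.

    mlist is unused in this sketch; msgdata is treated as a plain dict.
    """
    text_parts = [
        part for part in msg.walk()
        if not part.is_multipart() and part.get_content_type() == "text/plain"
    ]
    # Hypothetical bookkeeping key, not something Mailman defines.
    msgdata["flatten_text_parts"] = len(text_parts)


# Stand-alone check with stand-in arguments instead of a real list object.
demo_msg = email.message_from_string(
    "Content-Type: text/plain; charset=us-ascii\n\nhello\n"
)
demo_data = {}
process(None, demo_msg, demo_data)
```

In a real installation the module would presumably live alongside the other handlers and be named in the pipeline configuration; where exactly it belongs in the pipeline is still the open question above.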
> Note that this will not address separate attachment of headers and
> footers. If the resultant 'flattened' message is multipart for any
> reason, msg_header and msg_footer will still be attached as separate
> MIME parts.
After rebuilding the text parts, could we call "decorate" on the
message before we attach any other parts?
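The ordering I have in mind could look roughly like the sketch below. It uses only the standard-library email package and does not call Mailman's own decoration code; the function name assemble and the header and footer strings are assumptions for illustration. The header and footer are folded into the aggregated text first, and only then are the remaining parts attached.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def assemble(body_text, other_parts, header="", footer=""):
    """Decorate the aggregated text first, then attach the remaining parts."""
    # Header and footer become part of the single text body, so they are
    # not added later as separate MIME parts.
    decorated = "\n".join(s for s in (header, body_text, footer) if s)
    text_part = MIMEText(decorated, "plain", "utf-8")
    if not other_parts:
        return text_part
    outer = MIMEMultipart("mixed")
    outer.attach(text_part)
    for part in other_parts:
        outer.attach(part)
    return outer


attachment = MIMEText("separate attachment", "plain", "utf-8")
result = assemble("Hello list", [attachment],
                  header="[list header]", footer="-- \nlist footer")
```

Whether the real decoration step can be told to skip a message that was already decorated this way is something I still need to check.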
> The basic flow in the process is
[very clear explanation cut]
I think I'm beginning to like Python.
Cheers
Stefan
--
Stefan Förster http://www.incertum.net/ Public Key: 0xBBE2A9E9
FdI #186: Admin-Handy - Elektronisches Würgehalsband (Holger Köpke)