[Mailman-Users] Chinese characters spam filter?

Stephen J. Turnbull stephen at xemacs.org
Tue Jul 12 03:03:35 EDT 2016

Mark Sapiro writes:
 > On 7/8/16 6:04 PM, Yasuhito FUTATSUKI wrote:
 > > 
 > > How about using 'backslashreplace' instead of 'replace' to encode to
 > > list's preferred language in Mailman/Handlers/SpamDetect.py ?

I see you've already done this, but ...

I would consider xmlcharrefreplace as well.  XML character references
(e.g. &#20013;) are something most people (users/moderators) have
seen; backslash escapes they're not going to recognize unless they're
programmers.
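
To make the difference concrete, here is a rough sketch of what each
error handler produces.  It assumes Python 3 syntax and a list whose
preferred charset is iso-8859-1; the helper name render() and the
sample subject are made up for illustration and are not Mailman code.

```python
def render(text, charset, handler):
    """Encode text to charset with the given error handler, then decode
    it again so the result can be displayed as an ordinary string."""
    return text.encode(charset, errors=handler).decode(charset)


# Hypothetical subject line containing two Chinese characters.
subject = "Sale \u4e2d\u6587 now"

for handler in ("replace", "backslashreplace", "xmlcharrefreplace"):
    print("%-18s %s" % (handler, render(subject, "iso-8859-1", handler)))
```

With 'replace' every unencodable character collapses to '?', so a
regexp cannot tell one character from another; the other two handlers
keep the code point, just spelled differently.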

At an earlier stage, you could also just do a trial re-encoding with
the list's preferred codec, set errors = 'strict', catch the resulting
UnicodeEncodeError, and re-raise it as a Hold (or Discard, according to
per-list policy).  The encoded output itself is thrown away; only
whether the encoding succeeds matters.  I would prefer this solution,
I think, as creating regexps turns out to be an issue for many list
owners.
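
A minimal sketch of that trial re-encoding, again assuming Python 3.
The exception classes HoldMessage and DiscardMessage and the function
check_encodable() are stand-ins defined here for illustration; how
this would hook into Mailman's actual handler pipeline and per-list
settings is not shown.

```python
class HoldMessage(Exception):
    """Stand-in for 'hold this message for moderation'."""


class DiscardMessage(Exception):
    """Stand-in for 'silently discard this message'."""


def check_encodable(header_value, charset, action="hold"):
    """Trial-encode header_value in charset with errors='strict'.

    Returns None if every character is representable.  Otherwise the
    UnicodeEncodeError is re-raised as HoldMessage or DiscardMessage,
    depending on the (hypothetical) per-list policy in `action`.
    """
    try:
        # The encoded bytes are deliberately thrown away; only success
        # or failure of the encoding matters.
        header_value.encode(charset, errors="strict")
    except UnicodeEncodeError as exc:
        reason = "header not representable in %s: %s" % (charset, exc.reason)
        if action == "discard":
            raise DiscardMessage(reason) from exc
        raise HoldMessage(reason) from exc


# Usage: a plain ASCII subject passes, a Chinese one is held.
check_encodable("Meeting on Friday", "iso-8859-1")
try:
    check_encodable("\u4e2d\u6587 offer", "iso-8859-1")
except HoldMessage as held:
    print("held:", held)
```

No regexp is involved: the list's charset itself acts as the filter.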

People would have to learn not to use emoji in headers, of course, or
suffer moderation delays or even discards.

To the extent that this is only for the moderation interface, you
could also use UTF-8 for the UI.  Then the moderator would be able to
see the emoji directly, rather than the owner having to bake such
knowledge into the regexps.

