[ mailman-Patches-891491 ] Scrubber.py patch

SourceForge.net noreply at sourceforge.net
Mon Sep 13 20:23:20 CEST 2004


Patches item #891491, was opened at 2004-02-06 02:26
Message generated for change (Comment added) made by ber
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=300103&aid=891491&group_id=103

Category: Pipermail
Group: Mailman 2.1
Status: Open
Resolution: None
Priority: 8
Submitted By: Tokio Kikuchi (tkikuchi)
Assigned to: Nobody/Anonymous (nobody)
Summary: Scrubber.py patch

Initial Comment:
Scrubber.py has a number of bugs in processing various
types of attachments and languages, and many people have
submitted patches to fix them. This patch item is
opened to collect such patches for convenience.
     
This patch corrects:     
     
- if an attached text is composed with Windows Notepad, it
has no charset specified, and the actual charset may be
different from the message/list charset. This sometimes
causes an error when composing the digest message.
     
- sometimes, null charset is represented by '' as     
well as None.     
    
- embedded rfc-2822 message is lost if you don't use   
msg.walk()   
   
- special problem with Japanese charsets.
  
- t (the stringified part) may be None, to which you can't
append a '\n'.
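
The msg.walk() item can be illustrated with the standard library email package. This is a minimal sketch in present-day Python 3, not code from the patch; the message contents and variable names are made up for the example. Iterating over only the top-level payload stops at the message/rfc822 wrapper, while walk() also descends into the embedded message.

```python
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical embedded message (the "rfc-2822 message" of the list item).
inner = MIMEText("inner body", "plain", "us-ascii")
inner["Subject"] = "embedded"

# Hypothetical outer message carrying a cover note and the embedded message.
outer = MIMEMultipart()
outer["Subject"] = "outer"
outer.attach(MIMEText("cover note", "plain", "us-ascii"))
outer.attach(MIMEMessage(inner))

# Looking only at the direct children never reaches the inner text part.
top_level_types = [part.get_content_type() for part in outer.get_payload()]

# walk() yields the container itself, then every nested part, depth first.
walked_types = [part.get_content_type() for part in outer.walk()]

print(top_level_types)
print(walked_types)
```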
  
  
 
  
  

----------------------------------------------------------------------

Comment By: Bernhard Reiter (ber)
Date: 2004-09-13 20:23

Message:
Logged In: YES 
user_id=113859

The patch looks better, but I can probably
only test it once I have updated my Mailman patches again.
I have sent you an email directly that should trigger the bug
so you can test.

----------------------------------------------------------------------

Comment By: Tokio Kikuchi (tkikuchi)
Date: 2004-09-12 01:45

Message:
Logged In: YES 
user_id=67709

Sorry for the problem. I have updated this patch (2004-09-12
version). Please try this patch and report back, because I can't
fully reproduce the problem (in a Japanese environment).



----------------------------------------------------------------------

Comment By: Bernhard Reiter (ber)
Date: 2004-09-11 20:04

Message:
Logged In: YES 
user_id=113859

scrubber.patch 2004-08-24
has a problem.

Charset(lcset).output_charset can return None according
to the documentation, e.g. when no conversion is needed.
This would assign None to lcset_out.

Later this leads to an exception in
    try:
        if len(charset) == 0:
            charset = 'us-ascii'
        t = t.encode(charset, 'replace')
    except (UnicodeError, LookupError, ValueError):
        t = t.encode(lcset, 'replace')
because:
TypeError: len() of unsized object
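
A defensive version of this step might look like the sketch below. It is written for present-day Python 3 rather than the Python 2 code in the patch, and the helper names normalize_charset and encode_with_fallback are hypothetical, not part of Mailman. The idea is to treat None and '' the same way before anything calls len() or encode() on the charset, and to tolerate t being None.

```python
def normalize_charset(charset, default="us-ascii"):
    """Return a usable charset name, treating None and '' alike.

    Hypothetical helper: a truthiness test replaces len(charset) == 0,
    so a None charset no longer raises TypeError.
    """
    if not charset:
        return default
    return str(charset)


def encode_with_fallback(text, charset, list_charset="us-ascii"):
    """Encode text in charset, falling back to the list charset.

    Hypothetical helper mirroring the try/except quoted above.
    """
    if text is None:
        # The stringified part may be None; treat it as empty text.
        text = ""
    try:
        return text.encode(normalize_charset(charset), "replace")
    except (UnicodeError, LookupError, ValueError):
        # Unknown or unusable charset: use the list's own charset instead.
        return text.encode(list_charset, "replace")


print(encode_with_fallback("abc", None))
print(encode_with_fallback("abc", "no-such-codec"))
```

The except clause is kept as in the quoted code; the only behavioral change is that a missing charset is resolved before encoding is attempted.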



----------------------------------------------------------------------

Comment By: Tokio Kikuchi (tkikuchi)
Date: 2004-08-24 09:58

Message:
Logged In: YES 
user_id=67709

Added a fix for the case where lcset_out <> lcset.



----------------------------------------------------------------------

Comment By: Bernhard Reiter (ber)
Date: 2004-04-14 20:53

Message:
Logged In: YES 
user_id=113859

Thanks for working on the patch. 

----------------------------------------------------------------------

Comment By: Tokio Kikuchi (tkikuchi)
Date: 2004-02-25 11:42

Message:
Logged In: YES 
user_id=67709

Uploading a revised patch. It now fixes a few more bugs where
decoding a scrubbed plain text message results in mojibake.
Also, Japanese filenames tend to become so long that the system
limit may be exceeded because of MIME encoding, so I added an
option not to use the filename in the message but to use
'attachment' as the filename. Because this patch spans two files
(Defaults.py.in and Handlers/Scrubber.py) you have to cd
mailman and patch -p1 < this_patch. (Well, I think it is
-p1. If it doesn't work, try -p0 ;-)
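
The filename option described above could be sketched roughly as follows. The function name choose_attachment_filename, the use_declared_name flag, and the 255-byte limit are assumptions made for illustration and are not taken from the patch; only the 'attachment' fallback name comes from the comment.

```python
def choose_attachment_filename(declared_name, use_declared_name=True,
                               max_bytes=255):
    """Return the name to store an attachment under.

    All parameters are hypothetical; only the 'attachment' fallback
    is taken from the comment above.
    """
    if not use_declared_name or not declared_name:
        return "attachment"
    # A long non-ASCII name can exceed a per-name byte limit once encoded.
    # The explicit length check is an assumption added for this sketch.
    if len(declared_name.encode("utf-8")) > max_bytes:
        return "attachment"
    return declared_name


print(choose_attachment_filename("report.txt"))
print(choose_attachment_filename("report.txt", use_declared_name=False))
```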


----------------------------------------------------------------------

Comment By: Jonathan Larmour (jifl)
Date: 2004-02-22 18:25

Message:
Logged In: YES 
user_id=817601

I strongly recommend applying this patch. I received a mail
bounce on a list with an empty charset in a part (i.e.
"charset=") and it caused /var/mailman/cron/senddigests and
thus all digest processing to fail because of this error:

Traceback (most recent call last):
  File "/var/mailman/cron/senddigests", line 94, in ?
    main()
  File "/var/mailman/cron/senddigests", line 86, in main
    mlist.send_digest_now()
  File "/var/mailman/Mailman/Digester.py", line 60, in send_digest_now
    ToDigest.send_digests(self, mboxfp)
  File "/var/mailman/Mailman/Handlers/ToDigest.py", line 123, in send_digests
    send_i18n_digests(mlist, mboxfp)
  File "/var/mailman/Mailman/Handlers/ToDigest.py", line 295, in send_i18n_digests
    msg = scrubber(mlist, msg)
  File "/var/mailman/Mailman/Handlers/Scrubber.py", line 308, in process
    t = t.encode(charset, 'replace')
  File "/usr/lib/python2.2/encodings/__init__.py", line 51, in search_function
    mod = __import__(modname,globals(),locals(),'*')

which is something this patch fixes.


----------------------------------------------------------------------
