RE: [Spambayes] imap4 filter
I "searched" through the list of folders and found several that had "&" in the name.
Yes, that would be the cause. This'll be fixed for the next version.
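The thread doesn't spell out why "&" breaks things. One plausible reading, given that the workaround involved adding an "import cgi" line, is that folder names were being written unescaped into the HTML of the configuration page. A minimal sketch of that idea follows; it is an assumption, not the actual SpamBayes fix. It uses html.escape (the current standard-library replacement for the old cgi.escape), and render_folder_option is a hypothetical helper, not a SpamBayes function.

```python
from html import escape


def render_folder_option(folder_name):
    """Return an HTML <option> element for one folder name.

    Hypothetical helper for illustration only. Escaping turns "&" into
    "&amp;" so the name survives being embedded in an HTML page.
    """
    safe = escape(folder_name, quote=True)
    return '<option value="%s">%s</option>' % (safe, safe)


if __name__ == "__main__":
    # Example folder names (made up for this sketch).
    for name in ["INBOX", "Friends & Family", "Lists/spambayes"]:
        print(render_folder_option(name))
```

Whatever receives the submitted value has to undo the escaping in the same way, otherwise the folder name sent back to the server won't match the real one.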
I didn't find any line with "available_folders"; I did find a line with "all_folders", which was at line 262 after the addition of the "import cgi" line.
That sounds about right. That function must have changed a bit since 1.0a7. A new release (source and binary) is due out pretty soon, so you'll be able to swap to that. See below, however.
After entering the 'training' configuration folders manually as instructed, I ran "python scripts\sb_imapfilter.py -t" in order to 'train' the filter. After about 40 minutes training the program bombed with the following messages: [...] File "c:\python23\lib\socket.py", line 301, in read data = self._sock.recv(recv_size) MemoryError
Hmm. How much training was this? 40 minutes sounds like a lot! Note that you can get really good results with only small amounts of training data. Maybe imapfilter isn't immediately releasing everything that it could (although I can't see anything apparent), or maybe the garbage collection just wasn't keeping up. Or maybe there's a *really* massive message that it choked on? (since it died reading from the socket). If you run it again, it shouldn't try and train any of those messages again - does it die immediately on the same message? (If so, then it's likely to be the latter).

=Tony Meyer
----- Original Message ----- From: "Tony Meyer" <tameyer@ihug.co.nz> To: "'Neal Stoughton'" <nmstough@uci.edu>; <spambayes@python.org> Sent: Wednesday, December 31, 2003 14:31 Subject: RE: [Spambayes] imap4 filter
After entering the 'training' configuration folders manually as instructed, I ran "python scripts\sb_imapfilter.py -t" in order to 'train' the filter. After about 40 minutes training the program bombed with the following messages: [...] File "c:\python23\lib\socket.py", line 301, in read data = self._sock.recv(recv_size) MemoryError
Hmm. How much training was this? 40 minutes sounds like a lot! Note that you can get really good results with only small amounts of training data. Maybe imapfilter isn't immediately releasing everything that it could (although I can't see anything apparent), or maybe the garbage collection just wasn't keeping up. Or maybe there's a *really* massive message that it choked on? (since it died reading from the socket). If you run it again, it shouldn't try and train any of those messages again - does it die immediately on the same message? (If so, then it's likely to be the latter).

I ran the training again and got exactly the same set of messages in the traceback, although, as you predicted, the memory error occurred relatively quickly (within 1 minute of the start). I have no idea at which point it's dying. But I was training it on an inbox of about 1700 (ham) messages and a spam folder with about 250 messages. Interestingly, the size of the "hammie.db" file decreased after this last run, from over 400 KB to about 336 KB.
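One way to test the "really massive message" theory without downloading any message bodies is to ask the server for message sizes only. The sketch below uses the standard imaplib module and the IMAP RFC822.SIZE fetch item. The function names, the folder name and the connection details in the usage comment are placeholders for illustration; none of this is part of sb_imapfilter.

```python
import imaplib
import re

# Matches untagged FETCH lines such as b'12 (RFC822.SIZE 52428800)'.
SIZE_RE = re.compile(rb"(\d+) \(.*?RFC822\.SIZE (\d+)", re.DOTALL)


def parse_sizes(fetch_data):
    """Turn imaplib FETCH response items into (sequence_number, size) pairs."""
    sizes = []
    for item in fetch_data:
        if not isinstance(item, bytes):
            continue  # skip None or tuple items we don't expect here
        match = SIZE_RE.match(item)
        if match:
            sizes.append((int(match.group(1)), int(match.group(2))))
    return sizes


def largest_messages(conn, folder="INBOX", top=5):
    """Return the `top` largest messages in `folder` as (seq, bytes) pairs.

    `conn` is an already logged-in imaplib.IMAP4 or IMAP4_SSL object.
    Assumes the folder is not empty ("1:*" may fail on an empty mailbox).
    """
    conn.select(folder, readonly=True)  # read-only: don't change any flags
    status, data = conn.fetch("1:*", "(RFC822.SIZE)")
    if status != "OK":
        raise RuntimeError("FETCH failed: %r" % (data,))
    return sorted(parse_sizes(data), key=lambda pair: pair[1], reverse=True)[:top]


# Usage (placeholder host and credentials, not run here):
#   conn = imaplib.IMAP4_SSL("imap.example.org")
#   conn.login("user", "password")
#   print(largest_messages(conn, "INBOX"))
#   conn.logout()
```

If one message turns out to be far larger than the rest, moving it out of the training folder and re-running the training would show whether it is the one triggering the MemoryError.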
participants (2): Neal Stoughton, Tony Meyer