[Spambayes-checkins] spambayes pop3proxy.py,1.75,1.76
Tim Stone
timstone4 at users.sourceforge.net
Sun Apr 20 18:36:26 EDT 2003
Update of /cvsroot/spambayes/spambayes
In directory sc8-pr-cvs1:/tmp/cvs-serv27694a
Modified Files:
pop3proxy.py
Log Message:
Removed useless imports and comments. Changed onStat to add a fixed
fudge factor to message sizes, rather than making a half-baked
attempt at guessing how much will be added. MUAs currently use STAT to
determine whether a message is too large to download, and to display its
size in the user interface. That's about all, so a ballpark guess is not
particularly dangerous.
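
As a rough illustration of the fixed-fudge approach described in the log
message, here is a minimal sketch. The constant value (512) and the size
arithmetic come from the diff in this message; the function name
fudge_stat_response and the regular expression are assumptions made for
illustration, since the diff does not show the pattern onStat actually uses.

```python
import re

# Value taken from the diff in this message: bytes added per message to
# allow for the headers spambayes inserts.
HEADER_SIZE_FUDGE_FACTOR = 512

# Hypothetical pattern for a POP3 STAT reply ("+OK <count> <size>...").
# The real regex used by pop3proxy.py is not shown in this diff.
STAT_REPLY = re.compile(r'^\+OK\s+(\d+)\s+(\d+)(.*)\r\n')


def fudge_stat_response(response):
    """Pad the total size in a STAT reply by a fixed amount per message."""
    match = STAT_REPLY.search(response)
    if not match:
        # Not a well-formed +OK reply (e.g. -ERR): pass it through unchanged.
        return response
    count = int(match.group(1))
    size = int(match.group(2)) + HEADER_SIZE_FUDGE_FACTOR * count
    return '+OK %d %d%s\r\n' % (count, size, match.group(3))


if __name__ == '__main__':
    # Three messages totalling 1200 bytes: 1200 + 3 * 512 = 2736.
    print(repr(fudge_stat_response('+OK 3 1200\r\n')))
```

The padding is deliberately an overestimate of unknown accuracy: the reported
size only needs to be in the right ballpark for the client's size checks and
display.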
Index: pop3proxy.py
===================================================================
RCS file: /cvsroot/spambayes/spambayes/pop3proxy.py,v
retrieving revision 1.75
retrieving revision 1.76
diff -C2 -d -r1.75 -r1.76
*** pop3proxy.py 20 Apr 2003 05:00:27 -0000 1.75
--- pop3proxy.py 21 Apr 2003 00:36:23 -0000 1.76
***************
*** 51,64 ****
Web training interface:
- o Review already-trained messages, and purge them.
- o Include a Reply link that launches the registered email client, eg.
- mailto:tim at fourstonesExpressions.com?subject=Re:%20pop3proxy&body=Hi%21%0D
- o Keyboard navigation (David Ascher). But aren't Tab and left/right
- arrow enough?
- o [Francois Granger] Show the raw spambrob number close to the buttons
- (this would mean using the extra X-Hammie header by default).
- o Add Today and Refresh buttons on the Review page.
-
-
User interface improvements:
--- 51,54 ----
***************
*** 66,71 ****
o Deployment: Windows executable? atlaxwin and ctypes? Or just
webbrowser?
- o Can it cleanly dynamically update its status display while having a
- POP3 conversation? Hammering reload sucks.
o Save the stats (num classified, etc.) between sessions.
o "Reload database" button.
--- 56,59 ----
***************
*** 74,82 ****
New features:
- o "Send me an email every [...] to remind me to train on new
- messages."
- o "Send me a status email every [...] telling how many mails have been
- classified, etc."
- o Whitelist.
o Online manual.
o Links to project homepage, mailing list, etc.
--- 62,65 ----
***************
*** 105,140 ****
o NNTP proxy.
o Zoe...!
-
- Notes, for the sake of somewhere better to put them:
-
- Don't proxy spams at all? This would mean writing a full POP3 client
- and server - it would download all your mail on a timer and serve to you
- all the non-spams. It could be 'safe' in that it leaves the messages in
- the real POP3 account until you collect them from it (or in the case of
- spams, until you collect contemporaneous hams). The web interface would
- then present all the spams so that you could correct any FPs and mark
- them for collection. The thing is no longer a proxy (because the first
- POP3 command in a conversion is STAT or LIST, which tells you how many
- mails there are - it wouldn't know the answer, and finding out could
- take weeks over a modem - I've already had problems with clients timing
- out while the proxy was downloading stuff from the server).
-
- Adam's idea: add checkboxes to a Google results list for "Relevant" /
- "Irrelevant", then submit that to build a search including the
- highest-scoring tokens and excluding the lowest-scoring ones.
"""
! try:
! import cStringIO as StringIO
! except ImportError:
! import StringIO
!
! import os, sys, re, operator, errno, getopt, time, bisect, binascii
! import socket, asyncore, asynchat, cgi
! import mailbox, email.Header
from thread import start_new_thread
! from email.Iterators import typed_subpart_iterator
! import spambayes
! from spambayes import storage, tokenizer, mboxutils, Dibbler
from spambayes.FileCorpus import FileCorpus, ExpiryFileCorpus
from spambayes.FileCorpus import FileMessageFactory, GzipFileMessageFactory
--- 88,100 ----
o NNTP proxy.
o Zoe...!
"""
! import os, sys, re, errno, getopt, time
! import socket
from thread import start_new_thread
!
! import spambayes.message
! from spambayes import Dibbler
! from spambayes import storage
from spambayes.FileCorpus import FileCorpus, ExpiryFileCorpus
from spambayes.FileCorpus import FileMessageFactory, GzipFileMessageFactory
***************
*** 142,146 ****
from spambayes.UserInterface import UserInterfaceServer
from spambayes.ProxyUI import ProxyUserInterface
- import spambayes.message
# Increase the stack size on MacOS X. Stolen from Lib/test/regrtest.py
--- 102,105 ----
***************
*** 155,164 ****
resource.setrlimit(resource.RLIMIT_STACK, (newsoft, hard))
!
! # HEADER_EXAMPLE is the longest possible header - the length of this one
! # is added to the size of each message.
! HEADER_EXAMPLE = '%s: xxxxxxxxxxxxxxxxxxxx\r\n' % \
! options["Hammie", "header_name"]
!
class ServerLineReader(Dibbler.BrighterAsyncChat):
--- 114,119 ----
resource.setrlimit(resource.RLIMIT_STACK, (newsoft, hard))
! # number to add to STAT length for each msg to fudge for spambayes headers
! HEADER_SIZE_FUDGE_FACTOR = 512
class ServerLineReader(Dibbler.BrighterAsyncChat):
***************
*** 440,444 ****
if match:
count = int(match.group(1))
! size = int(match.group(2)) + len(HEADER_EXAMPLE) * count
return '+OK %d %d%s\r\n' % (count, size, match.group(3))
else:
--- 395,399 ----
if match:
count = int(match.group(1))
! size = int(match.group(2)) + HEADER_SIZE_FUDGE_FACTOR * count
return '+OK %d %d%s\r\n' % (count, size, match.group(3))
else:
***************
*** 496,500 ****
state.numUnsure += 1
! # Cache the message; don't pollute the cache with test messages.
if not state.isTest \
and options["pop3proxy", "cache_messages"]:
--- 451,455 ----
state.numUnsure += 1
! # Cache the message; don't pollute the cache with test messages.
if not state.isTest \
and options["pop3proxy", "cache_messages"]:
***************
*** 643,647 ****
# where they do not need to do any more regular training to
# be satisfied with spambayes' performance, we expire old
! # messages from not only the trained corpii, but the unknown
# as well.
self.spamCorpus.removeExpiredMessages()
--- 598,602 ----
# where they do not need to do any more regular training to
# be satisfied with spambayes' performance, we expire old
! # messages from not only the trained corpora, but the unknown
# as well.
self.spamCorpus.removeExpiredMessages()