[Spambayes] Installation error + Script error

Tony Meyer tameyer at ihug.co.nz
Sat Jan 14 10:08:37 CET 2006


> when i did a python setup.py install, i got this error:
>
> christopher-isaiah-funs-imac-g5:~/desktop/code wwjd2525$ python  
> setup.py install
> [...]
> error: /System/Library/Frameworks/Python.framework/Versions/2.3/bin/ 
> sb_bnfilter.py: Permission denied
>
> Any idea what can I do to get around it? I'm really new in all  
> this. =\

A normal user doesn't have permission to write to /System/Library.   
You could:

   1.  Do "sudo python setup.py install" (and give it your
administrator password when it asks), if you are fine with adding the
spambayes package to the system-wide Python installation.  The build
script outputs what goes where, so you could save that ("sudo
python setup.py install > install.log") if you wanted to be sure you
could remove things later.

   2.  Install things elsewhere; this would be particularly good if  
you're likely to use other Python packages in the future, and would  
rather keep the system clean and keep third-party things in (e.g.)  
your home folder.  "python setup.py install --help" gives you  
options; from memory you want something like "python setup.py install  
--root /Users/{username}/" to base things in your home directory.
You'll then have to tell Python where things can be found, by setting  
the environment variable PYTHONPATH to include that location.  (Ask  
if you want more specifics about this).

   3.  You can leave the files wherever they are, and just tell  
Python where they are via the PYTHONPATH environment variable (again,  
ask if you want specifics).  This is certainly the least disruptive,  
since you can then just delete the folder and everything is gone.
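For options 2 and 3, here is a minimal sketch of what setting PYTHONPATH amounts to.  Python adds each PYTHONPATH entry to sys.path at startup, and a script can get roughly the same effect by extending sys.path itself.  The function name and the directory below are hypothetical; substitute the folder that actually contains the "spambayes" package.  (Written for a current Python.)

```python
import os
import sys

def add_package_dir(path):
    """Put path at the front of sys.path (if absent) and return it as an absolute path."""
    full = os.path.abspath(os.path.expanduser(path))
    if full not in sys.path:
        sys.path.insert(0, full)
    return full

# Hypothetical location: the directory that contains the "spambayes" folder.
pkg_dir = add_package_dir("~/code/spambayes-src")
```

After this, "import spambayes" would look in that folder first.  Deleting the folder removes everything, which is the appeal of option 3.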

> 2. Also, I tried running a script that was provided in the package,  
> called sb_mailsort.py. I did a python sb_mailsort.py -s 19602.elmx  
> (where 19602.elmx) is the e-mail I'm trying to classify,

I expect you will have trouble trying to use emlx files, as they're
not standard mbox files (which Mail 1.x used, I believe).  They are
simple to parse, however.  It would be no trouble to add emlx
handling to the scripts if you'd like that.
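To give an idea of how simple, here is a rough sketch.  It assumes the emlx layout is a byte count on the first line, then that many bytes of ordinary RFC 822 message, then a trailing property list that can be ignored.  The function name is made up for illustration and is not part of SpamBayes; this is for a current Python and is not checked against real Mail files.

```python
import email

def message_from_emlx(data):
    """Return the email message stored in the bytes of an .emlx file.

    Assumed layout: an ASCII byte count on the first line, then that many
    bytes of message, then a property list (ignored here).
    """
    first_line, _, rest = data.partition(b"\n")
    length = int(first_line.decode("ascii").strip())
    return email.message_from_bytes(rest[:length])
```

You would call it with the raw file contents, e.g. message_from_emlx(open("19602.emlx", "rb").read()), and could then pass the message (as a string) to the tokenizer.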

> and I got this
>
> Traceback (most recent call last):
>   File "sb_mailsort.py", line 187, in ?
>     main()
>   File "sb_mailsort.py", line 181, in main
>     print_message_score(msg, open(msg))
>   File "sb_mailsort.py", line 141, in print_message_score
>     bayes = CdbClassifier(open(DB_FILE, 'rb'))
> IOError: [Errno 2] No such file or directory: '/Users/ 
> wwjd2525/.spambayes/wordprobs.cdb'
>
> Is this due to the fact that I've not installed my spambayes properly?

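The error itself just says that the database file the script expects
(~/.spambayes/wordprobs.cdb) doesn't exist.  What were you trying to
use sb_mailsort.py for?  IIRC it works with CDB databases and Maildir
mailboxes; I'm not sure if it's really used much by anyone.  Sticking
with the regular scripts is probably easiest (see below).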

> 3. Lastly, (I really appreciate whoever who answers these  
> questions) is there python module that specifically tokens a mail  
> and tells me how many ham and spam words are in there?

The sb_filter script takes a message (or mbox or MH file) and outputs  
the message(s) with an "X-Spambayes-Classification" header, which  
will have a value of "ham", "spam", or "unsure".  You can turn on  
options to have other headers, such as the evidence (tokens) found/ 
used, and so forth.  Is that what you're after?  If you want to just  
tokenize the message, then you'll have to use a custom Python script,  
e.g.:

  >>> from spambayes.tokenizer import tokenize
  >>> print list(tokenize(message_as_string))

Likewise, if you want the tokens and their scores:

  >>> from spambayes.tokenizer import tokenize
  >>> from spambayes.storage import open_storage
  >>> c = open_storage("~/.hammiedb", "dbm")
  >>> score, tokens = c.spamprob(tokenize(message_as_string), True)

[all untested, but it should be right]
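To get from there to "how many ham and spam words", a small sketch, assuming the second value returned by spamprob(..., True) is a list of (token, probability) pairs.  The function name and the 0.5 cutoffs are my own choices for illustration, and the example pairs are made up, not real classifier output.

```python
def count_clues(clues, ham_cutoff=0.5, spam_cutoff=0.5):
    """Count tokens leaning ham vs. spam in a list of (token, probability) pairs."""
    ham = sum(1 for _, prob in clues if prob < ham_cutoff)
    spam = sum(1 for _, prob in clues if prob > spam_cutoff)
    return ham, spam

# Made-up pairs for illustration only.
example = [("free", 0.95), ("meeting", 0.10), ("viagra", 0.99), ("the", 0.50)]
ham_count, spam_count = count_clues(example)
```

Tokens sitting exactly on a cutoff are counted as neither.  I believe the real list may also carry summary entries (such as '*H*' and '*S*') alongside the tokens; if so, you'd want to skip those before counting.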

> How do I train my sb using ham and spam corpuses without disturbing  
> any settings on my computer.

How do you get your mail?  From the reference to emlx files, I  
presume you're using Mail.  Are you connecting with Mail to one or  
more POP servers?  One or more IMAP servers?  Or something more  
unix'y, like using fetchmail?  For POP3, using sb_server.py (a
POP3 proxy) is the best idea; for IMAP, sb_imapfilter.py; for
unix'y things, probably sb_mboxtrain.py and sb_filter.py.

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.



