[Spambayes] A most peculiar problem with SB 1

Tony Meyer tameyer at ihug.co.nz
Wed Nov 10 02:57:23 CET 2004


> One of us is now very confused, and I suspect it has to be me.

I wouldn't bet against it being both of us <wink>.

> As I said, I couldn't start Spambayes in order to do any
> classifying, or anything else. Which means the error messages
> I posted are somewhat puzzling.

Troubling, to me, as I suspect that this means there's a bug somewhere.

> I wasn't aware of any option to specify paths: none of the
> documentation I've read on installing SB makes a reference
> to such an option for the full source version: and certainly
> none of the scripts seem to give any indication of a path-setting
> capability. 
[...]
> Can you please clarify where and how such a path option is available 
> and how it should be used, or point to documentation that explains how 
> this option is to be used.

There is confusion here, I think.  I was referring to various options, like
the database names, where you can specify either a full path (e.g.
'C:\spambayes\hammie.db') or a relative one (e.g. 'spambayes\hammie.db').
These are all exposed via the web configuration - the help strings for these
options explain that if they are relative then they will be relative to the
directory containing the configuration file.
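
As a rough illustration of that rule (this is not the actual SpamBayes code;
the function name 'resolve_option_path' is made up for the example), a
relative option value would be resolved something like this.  It uses
'ntpath' so that the Windows-style paths behave the same on any platform:

```python
import ntpath


def resolve_option_path(config_file, option_value):
    """Return option_value as an absolute Windows-style path.

    Hypothetical helper: absolute values are returned unchanged, and
    relative values are joined onto the directory holding config_file.
    """
    if ntpath.isabs(option_value):
        return option_value
    config_dir = ntpath.dirname(config_file)
    return ntpath.normpath(ntpath.join(config_dir, option_value))
```

So with a configuration file at 'C:\sb\bayescustomize.ini', the value
'spambayes\hammie.db' would end up as 'C:\sb\spambayes\hammie.db', wherever
the script was started from.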

[...]
> In this context
> 
> 1] is "working directory" intended to refer to the temporary
> directory from which one has just installed, or to the relevant
> Python\LIB\ sub directory or the Python\Script directory.

When I used "current working directory" (CWD) I meant according to the OS
(in Python terms, the result of 'os.getcwd()').  If the scripts are run from
a shell (DOS/CMD session), then it'll be whatever directory is currently
displayed at the prompt.  So doing:

  C:\Python23\Scripts> sb_server.py

The CWD would be C:\Python23\Scripts.  Doing:

  C:\Python23> Scripts\sb_server.py

The CWD would be C:\Python23.  The fact that it can change so easily (and is
somewhat unpredictable if not run from a shell) is the reason that
everything is (by default) located relative to the config file, which should
exist somewhere after an initial run, and a main reason that on Windows we
try to put the config file (and hence everything else) in the appropriate
Application Data directory.
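
If you want to see the difference for yourself, a small throwaway script
along these lines (a sketch, not part of SpamBayes; 'describe_locations' is
just a name I've picked) prints both the CWD and the directory the script
itself lives in:

```python
import os
import sys


def describe_locations():
    """Return (current working directory, directory of the running script)."""
    cwd = os.getcwd()
    # sys.argv[0] is the script path as it was given on the command line.
    script_dir = os.path.dirname(os.path.abspath(sys.argv[0]))
    return cwd, script_dir


if __name__ == "__main__":
    cwd, script_dir = describe_locations()
    print("CWD:       ", cwd)
    print("Script dir:", script_dir)
```

Run from two different prompts, the first value changes while the second
stays the same.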

When running as a service, I have absolutely no idea what would be the CWD.
However, you can't use the service unless the pywin32 extensions are
available, so the CWD would only come into play if there *was* a
bayescustomize.ini file there, which obviously won't be the case with a
fresh install.

> And if not these directories, then where might SB - as you put it
> - *expect* to find such a configuration file /apart/ from the 
> user data area?

SpamBayes will look for (in this order): any files specified in the
environment variable BAYESCUSTOMIZE, a file called "bayescustomize.ini" in
the CWD, a file called ".spambayesrc" in the home directory (this is
intended for non-Windows, though it will work, and I can't recall exactly
where each Windows flavour considers the home directory to be), and finally,
if on Windows and the pywin32 extensions are available, a file called
"Spambayes\Proxy\bayescustomize.ini" in the Windows Application Data
directory.
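
To make the order concrete, here is a minimal sketch that just lists the
candidate locations in that order.  It is not the real lookup code: the
function name, the 'appdata_dir' parameter (which stands in for whatever
pywin32 reports as the Application Data directory), and splitting
BAYESCUSTOMIZE on os.pathsep are all assumptions for the example, and it
doesn't show how the files that are found get combined.

```python
import os


def candidate_config_files(environ=None, cwd=None, home=None, appdata_dir=None):
    """List places a configuration file might be, in the order described."""
    environ = os.environ if environ is None else environ
    cwd = os.getcwd() if cwd is None else cwd
    home = os.path.expanduser("~") if home is None else home

    candidates = []
    env_value = environ.get("BAYESCUSTOMIZE")
    if env_value:
        # Assumption: several files are separated by os.pathsep.
        candidates.extend(p for p in env_value.split(os.pathsep) if p)
    candidates.append(os.path.join(cwd, "bayescustomize.ini"))
    candidates.append(os.path.join(home, ".spambayesrc"))
    if appdata_dir is not None:
        # Only relevant on Windows with the pywin32 extensions available.
        candidates.append(
            os.path.join(appdata_dir, "Spambayes", "Proxy", "bayescustomize.ini")
        )
    return candidates
```

Printing the result of 'candidate_config_files()' and checking which of
those files actually exist is a quick way to work out which one is likely
being picked up.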

> 2] Having just done a parallel install onto a known-good and
> known-clean XP system, exercising some care I can now detect what
> should be happening: the process appears to be installing the proxy
> and DBs to 
> 
> %SYSTEMDRIVE%\Documents and Settings\CurrentUser\Application
> Data\Spambayes
> 
> Which is, in my view, a terribly bad place to be putting
> anything on a Windows 2000/2003 server, and is even worse when
> SB is or will be running as a service.
> 
> Any data subsequently entered into the DBs will not get
> backed up correctly and may also not be handled correctly by
> Shadow Volume services or by any user profile management which
> affects the user's data tree as implemented under 2000/2003 GP
> in Active Directory. 

Is this because it is located under the 'wrong' user?  This is only when
running as a service, correct?  I don't see how it otherwise applies (in
that case, being located there ensures that it can be backed up and handled
by user profile management).

> No Windows service should *ever* be installing or relying
> upon configuration or other data located in the *user* Data area
> and most especially not in the Administrator's data area or the
> data area of any user with administrator privileges. It's a
> fundamental breach of permissions and security and it's bad 
> Windows programming practice.

The rules were designed before the service version of sb_server existed.
The location is the appropriate place (I suppose you could debate whether
'Application Data' or 'Local Settings\Application Data' was more
appropriate) for a regular application to put its data.

It would be reasonably straightforward to add something to the
pop3proxy_service.py script so that when the service was installed an
initial bayescustomize.ini file (it could be empty, which would still work)
was created in a location in which it would be found (we can add one more
place to look if necessary, or just use the BAYESCUSTOMIZE environment
variable).  If this sounds like a suitable solution, where (in your opinion)
should data for a service be located by default?

> And finally: the reason that the "service.py install" failed
> on my subsequent attempts is quite simple: there appears to have
> already been a Spambayes folder in that location from my first
> attempt at an installation, and subsequent attempts to install
> failed to write a new copy.

That should not be the case.  SpamBayes tries to find the file; if it can't
find it, it checks whether the directory exists, and creates the directory
if it doesn't.  At least, this is what it is meant to do, and does here.  If
it's doing something else, then that's a bug.
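
In other words, the intended behaviour is roughly the following (a sketch of
the logic as described, not the code SpamBayes actually uses;
'ensure_config_dir' is a hypothetical name):

```python
import os


def ensure_config_dir(config_path):
    """Create the directory for config_path if needed.

    Returns True only when a directory had to be created.  An existing
    file or an existing directory is left alone.
    """
    if os.path.exists(config_path):
        return False
    directory = os.path.dirname(config_path)
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
        return True
    return False
```

An already-existing Spambayes folder should therefore be harmless: the
function simply does nothing in that case.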

> I didn't even consider looking in that area since nothing in the 
> documentation even suggested that this was where such data was 
> located.

The top of the configuration page (if it were possible to get that far, of
course) indicates where data is being stored (well, it says where the
configuration file is, which is a hint that everything else is there).  The
troubleshooting guide for sb_server
('spambayes\windows\docs\troubleshooting.html' in the source - it gets
installed if you use the binary) should cover this too.  When running as a
service, the service
log should include a "using config file X" entry each time the service is
started.

I'm happy to put mention of this elsewhere (everyone is happy to admit that
the documentation needs work) if you can suggest a logical location.

> Previous versions of SB appear correctly and much more
> logically to have held such data within the SB directory tree.

This hasn't changed since September 16 2003, so "previous versions" would
have to be very old ones (1.0a5 and older).  If a "bayescustomize.ini" file
was found in one of the viable locations (as above), then the Application
Data directory would never have been used, though.

> > I presume that in the past you didn't run "setup.py install".
> 
> On the contrary: I did exactly that and had previous version
> of Spambayes running perfectly well - but on other hardware and
> with Windows 2000 Server.
> 
> However this is the first time that I've noticed Spambayes
> compile and copy itself into the Python tree. 

That's all that "setup.py install" does, so I presume it did it in the past
and you just didn't notice?

> In the past with older versions, once Python was installed in
> c:\python, I merely ran SB from an entirely different directory
> - e.g. c:\spambayes. One then ran the pop3 service from there.
> This no longer works once one has run the setup script.

This should still work without problems, and is a perfectly valid way to do
it.  What happens if you try this now?

[...]
> I have managed to get the pop3proxy_service installed
> and W2003 claims the service is running: however that service 
> bluntly refuses to proxy workstation mail clients [and there are
> no server logs to indicate what is going on]

SpamBayes ought to be creating logs if it is running as a binary.  They will
be in the temp directory, named 'SpamBayesServiceX.log' (where X is from 1
to 5, with 1 being the most recent).  If it's running from source, then
output gets redirected to the win32trace collector (using PythonWin is the
easiest way to view this, although I think there's a standalone tool as
well).
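
If the logs are hard to track down, something like this will list any that
exist in a given temp directory.  It's only a sketch: 'find_service_logs' is
a made-up name, and it assumes the temp directory Python reports for you is
the one the service uses, which may not be true if the service runs under a
different account (in which case pass that account's temp directory in
explicitly).

```python
import glob
import os
import tempfile


def find_service_logs(temp_dir=None):
    """Return SpamBayesService1.log .. SpamBayesService5.log found in temp_dir.

    The list is sorted by name, so the most recent log (number 1) is first.
    """
    temp_dir = tempfile.gettempdir() if temp_dir is None else temp_dir
    pattern = os.path.join(temp_dir, "SpamBayesService[1-5].log")
    return sorted(glob.glob(pattern))
```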

> - the mail client however returns an error
[...]
> Things get worse: the only way to get at the configuration or
> training pages is to run sb_server.py in a local cmd session on
> the server
[...]

Hmm.  Trying the service myself, I can't get it to start normally when
running from source (CVS or 1.0).  It *does* work if I run in debug mode
(i.e. 'spambayes\windows\pop3proxy_service.py debug'), or if I use the 1.0
binary.  I'll look into why this is.

You're trying to connect to the web interface from the same machine that
it's running on, right?  If not, then you'll need to explicitly allow remote
access via the [pop3proxy] allow_remote_connections and [html_ui]
allow_remote_connections options.  (You can either do this manually, if you
identify the correct configuration file, by adding entries that look like:

[pop3proxy]
allow_remote_connections:130.123.238.51,130.123.238.52,localhost
[html_ui]
allow_remote_connections:130.123.238.51,localhost

or you can connect to the interface locally and do it via the Advanced
Configuration page.  However, a flaw with 1.0 (fixed for new versions) means
that the three options (pop3proxy, smtpproxy, html_ui) are indistinguishable
- they are in that order, though).
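
If you do edit the file by hand, a quick way to check what you've written is
to read it back.  The sketch below uses Python's standard 'configparser'
rather than SpamBayes's own option handling, so treat it as an approximation
of how the entries will be read; 'allowed_hosts' and 'SAMPLE' are names
invented for the example.

```python
import configparser

# The same entries as in the example above.
SAMPLE = """\
[pop3proxy]
allow_remote_connections:130.123.238.51,130.123.238.52,localhost
[html_ui]
allow_remote_connections:130.123.238.51,localhost
"""


def allowed_hosts(config_text, section):
    """Return the hosts listed in allow_remote_connections for a section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    raw = parser.get(section, "allow_remote_connections", fallback="")
    return [host.strip() for host in raw.split(",") if host.strip()]
```

A section with no entry (or a missing section) gives an empty list, which
makes a typo in the section name easy to spot.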

If access doesn't work from the local machine, then either it's not running
at all, or something odd is happening (can't use that port, maybe?).  The
logs, if they can be found, should have some sort of explanation.

> Given that I can correctly install this same combination of 
> components to both a Windows 2000 Server and an XP workstation,
> and have it up and working in a matter of minutes, and that
> I'm not altogether inexpert with Windows systems, I can only
> conclude that the problem lies in the interaction of Windows 2003 
> and SpamBayes.

So you *only* experience problems when it's Windows 2003?  Not just any time
the new source is used?  If so, do you have any knowledge about things
Windows 2003 does differently?  (I know nothing at all about that).

If you run the pop3proxy_service.py script in debug mode, does it work?  If
so, I presume the problem is the same one that I have, and so once I figure
out what it is, all will be good.  If not, what errors does it give?

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.


