[spambayes-dev] Experimental SpamBayes build available

Tony Meyer tameyer at ihug.co.nz
Wed Dec 31 17:15:41 EST 2003


> I really don't like the code in Options.py that handles the 
> default values for these storage items.  I'm not sure it is 
> to blame, but it did cause me to see a new .db file created 
> in the cwd, rather than the data directory - as my INI file 
> already existed, it didn't get the default FQN for the new option.

The idea was that if you already had an ini file, then you had already set
things up (you were an existing user), and so we wouldn't want to fiddle
with your setup, whatever it was, because that might mean that we lose track
of your databases.  (This could even be with the default "hammie.db" in the
cwd for persistent_storage_file, if the user always runs the script from the
same directory).

Part of the problem is that when consolidating the storage name options, I
picked "hammie.db" over "~/.hammiedb", which was by far the better option.
I didn't realise that it would expand quite nicely on Windows (well, 2k and
XP; I presume earlier as well), and I presume on Macs as well.

> IMO, the ini files should generally store relative path 
> names, being relative to the directory of the config file 
> being used.

+1.  I can go through and check this in if you like (once others have ripped
into the idea ;).

> Code speaks louder than words - I'm suggesting:
[...]
> The default remains "statistics_database.db".

The proper default, of course, is still "hammie.db".  When I put that code
in to put things into a better place for Windows users I figured that since
they wouldn't have an existing db, a more easily understandable name would
be good, too, but I didn't think that I could change it for everyone.  Do we
continue setting this option to "statistics_database.db" in that place?
(Without the FQN).  The same code also gives new default names for the cache
directories and messageinfo db.

> All code that uses this option
> ('persistent_storage_file') does so via a new function:

Presumably also these:
  [URLRetriever] x-cache_directory
  [Storage] messageinfo_storage_file
  [Storage] spam_cache
  [Storage] ham_cache
  [Storage] unknown_cache

What about these?
  [TestDriver] spam_directories
  [TestDriver] ham_directories

> def get_pathname_option(section, value):
>   filename = options.get(section, value)
>   if not os.path.isabs(filename):
>     return filename

Shouldn't that be "if os.path.isabs(filename):"?

>   # maybe expanduser() to *nix?

I think this is necessary, yes, but before this (because
os.path.isabs("~/hammie.db") is False).

What about this?

def get_pathname_option(section, value):
    filename = os.path.expanduser(options.get(section, value))
    if os.path.isabs(filename):
        return filename
    return os.path.join(os.path.dirname(optionsPathname), # existing global
                        filename)

How do people feel about having this happen implictly when one of these
options is used, rather than explicitly?  (I worry that we'll miss an
occurrence of one of them, or that someone (maybe me!) will add new code and
forget to use the get_pathname_option function).  Something like this:

[Current code in OptionsClass.py]
    def get(self):
        '''Get option value.'''
        return self.value

[Proposed]
    def expand_path(value):
        filename = os.path.expanduser(value)
        if os.path.isabs(filename):
            return filename
        return os.path.join(os.path.dirname(optionsPathname), # existing
global
                            filename)

    def get(self):
        '''Get option value.

        If the option is a path, then get relative to the
        configuration file.'''
        if self.allowed_values in [PATH, FILE_WITH_PATH,]: # maybe also
VARIABLE_PATH?
            return self.expand_path(self.value)
        return self.value

=Tony Meyer




More information about the spambayes-dev mailing list