[Spambayes] Supporting new database type in classifier
bkc at murkworks.com
Mon Feb 16 09:52:10 EST 2004
On 14 Feb 2004 at 23:25, Tim Peters wrote:
> You'll probably get better responses on the spambayes-dev list.
Ah, I must have missed the announcement of that list when it was created.
> I encourage you to work on a branch for now -- since most people drop most
> ideas after a few weeks at most, I'm opposed to warping this part of the
> code to cater to something as unlikely to be seen again as a
> non-random-access database model. If you work on a branch and demonstrate
> astonishing results, great, then we'll junk all other storages and adopt
> yours <wink>.
Well, OK, except I wasn't asking about the mechanics of putting my code into the tree,
but rather about the best way to refactor Classifier so that this would be easier to do.
> > I could override _getclues, but then I'd have to recreate the
> > bigram stuff which is quite a lot.
> It's less than 30 lines of code (half of it is comments).
But then that code would be duplicated. At some point (assuming I don't fade away),
we'll want only one copy of the bigram synthesis code. That's the basis of my question:
what's the best way to rearrange the existing code?
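
To make the question concrete, here is a rough sketch of the kind of split I mean. It is
not the actual spambayes Classifier: every class and method name below is hypothetical, and
the "bi:" token format and the (spamcount, hamcount) records are assumptions made only for
illustration. The idea is that bigram synthesis lives in one method, and all record lookups
go through a single hook that a storage subclass may override to handle the whole token set
in one lump.

```python
class SketchClassifier:
    """Hypothetical stand-in for illustration; not the real spambayes class."""

    def __init__(self):
        # token -> (spamcount, hamcount); assumed record shape
        self.wordinfo = {}

    def synthesize_bigrams(self, wordstream):
        """Yield each token, plus a 'bi:' token for each adjacent pair."""
        last = None
        for token in wordstream:
            yield token
            if last is not None:
                yield "bi:%s %s" % (last, token)
            last = token

    def lookup_many(self, tokens):
        """Default hook: one random-access lookup per token."""
        return {t: self.wordinfo.get(t) for t in tokens}

    def getclues(self, wordstream):
        """The single copy of the bigram logic feeds the lookup hook."""
        tokens = set(self.synthesize_bigrams(wordstream))
        records = self.lookup_many(tokens)
        return sorted((t, r) for t, r in records.items() if r is not None)


class BatchStorageClassifier(SketchClassifier):
    """Hypothetical storage that answers all lookups in one pass."""

    def __init__(self, table):
        super().__init__()
        self.table = table      # iterable of (token, record) rows
        self.queries = 0        # counts passes over the table

    def lookup_many(self, tokens):
        self.queries += 1
        wanted = set(tokens)
        # One sequential scan instead of one lookup per token.
        return {t: rec for t, rec in self.table if t in wanted}
```

With this shape a non-random-access storage overrides only lookup_many, and the bigram
code is never duplicated. Whether the real scoring code can be cut along this line is
exactly what I'm asking.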
> > Second, what's the best way to restructure classifier so that a
> > storage subclass can deal with entire wordstreams in one lump if
> > it so chooses?
> On a branch -- prove this is worth doing first, and don't worry about doing
> it cleanly before that succeeds.
Heh heh. You're not answering my question... ;-)
I'll be back in touch with my dirty proof of concept.
Brad Clements, bkc at murkworks.com (315)268-1000
http://www.murkworks.com (315)268-9812 Fax
http://www.wecanstopspam.org/ AOL-IM: BKClements