RE: [Python-Dev] The first trustworthy <wink> GBayes results
From: Tim Peters [mailto:tim.one@comcast.net]
I'm going to say a lot of stuff here, and then shut up <wink>. I want to move on to other things, but there's an opportunity to pass on some darned good advice for those who can hear.
Pretty darned good advice too ... but you won't object if I waste some time playing with this stuff anyway I hope. Only one way to accumulate experience after all ;)

Personally, I considered that you were already well past the point of diminishing returns, and anything further was of academic interest to those who felt a desire to tinker ... (i.e. the hard work has been done, and everything else is just fun and games :)

If enough people (or just one dedicated person) waste enough time, who knows what may come out. Hey - it worked for timsort didn't it ...? ;)

Tim Delaney
[Delaney, Timothy]
Pretty darned good advice too ... but you won't object if I waste some time playing with this stuff anyway I hope. Only one way to accumulate experience after all ;)
Not at all! Knock yourself out -- it's really a lot of fun, except when it gets so tedious you start punching the wall just to watch your knuckles bleed <wink>.
Personally, I considered that you were already well past the point of diminishing returns,
Not yet -- false positives are a horrible thing, and the false negative rate still lets a lot of spam through. Cutting the f-n rate, e.g., in half, would mean half as much spam to deal with; generalization left to the reader.
and anything further was of academic interest to those who felt a desire to tinker ...
The best hope for reducing f-n lies in exploiting more header lines than I can test with my mixed corpora, and there's *tons* of room for improvement there (note that the f-n rate is more than 20x greater than the f-p rate now). Anyone who wants to tackle that with tedious experiment should first pick Neil Schemenauer's brain: he had a good start on that early last week.
(i.e. the hard work has been done, and everything else is just fun and games :) If enough people (or just one dedicated person) waste enough time, who knows what may come out. Hey - it worked for timsort didn't it ...? ;)
Indeed so, and it works for this too -- never underestimate the power of working yourself sick. If you also *write* about it, you can make everyone else ill too by proxy <wink>.

sharing-the-pain-ly y'rs - tim