[Python-Dev] The first trustworthy <wink> GBayes results

Delaney, Timothy tdelaney@avaya.com
Mon, 2 Sep 2002 10:38:10 +1000


> From: Delaney, Timothy [mailto:tdelaney@avaya.com]
>
> Whether any weighting should be applied to single words or word pairs I
> don't know - my gut feeling is that they should be weighted the same, but
> guts are no replacement for empirical evidence.

On second thought - if a word-pair appears, then its separate parts should
not also be checked as separate words (other occurrences of those words
elsewhere in the text still are).

So, if I had these scores:

    'free'              0.1
    'beer'              0.1
    ('want', 'free',)   0.9
    ('free', 'beer',)   0.01
    ('free', '!!!',)    0.99

then the following phrases would match (after case-folding) as:

    'I want free beer!!!':

    ('want', 'free',)   0.9
    ('free', 'beer',)   0.01

    'Get *** for free!!!':

    ('free', '!!!',)    0.99

    'I want free beer. Free the beer!!!':

    ('want', 'free',)   0.9
    ('free', 'beer',)   0.01
    'free'              0.1
    'beer'              0.1
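
Here is a rough, untested-against-real-mail sketch of that rule in Python. The
scores are the ones listed above; everything else is an assumption made for
illustration: the names tokenize() and match(), the regex tokenizer (runs of
word characters or runs of punctuation, after lowercasing), letting
overlapping pairs both match, and silently skipping tokens with no score.

```python
import re

# Scores copied from the table above.
SCORES = {
    'free': 0.1,
    'beer': 0.1,
    ('want', 'free'): 0.9,
    ('free', 'beer'): 0.01,
    ('free', '!!!'): 0.99,
}

def tokenize(text):
    # Assumed tokenizer: case-fold, then split into runs of word
    # characters and runs of punctuation ('beer!!!' -> 'beer', '!!!').
    return re.findall(r"\w+|[^\w\s]+", text.lower())

def match(text, scores=SCORES):
    tokens = tokenize(text)
    covered = set()  # positions of tokens that are part of a matched pair
    hits = []
    # First pass: look up every adjacent pair of tokens.
    for i in range(len(tokens) - 1):
        pair = (tokens[i], tokens[i + 1])
        if pair in scores:
            hits.append((pair, scores[pair]))
            covered.update((i, i + 1))
    # Second pass: single words, but only where no pair already matched.
    for i, tok in enumerate(tokens):
        if i not in covered and tok in scores:
            hits.append((tok, scores[tok]))
    return hits

for phrase in ('I want free beer!!!',
               'Get *** for free!!!',
               'I want free beer. Free the beer!!!'):
    print(repr(phrase), match(phrase))
```

Tracking covered positions rather than covered words is what lets the second
'free' and 'beer' in the third phrase still count as single words.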

Damn I wish I was at home to try this out ... :(

Tim Delaney