[Python-Dev] The first trustworthy <wink> GBayes results
Mon, 2 Sep 2002 08:53:39 +1000
> From: Tim Peters [mailto:email@example.com]
> Training GBayes is cheap, and the more you feed it the less need to do
> information-destroying transformations (like folding case or ignoring
Speaking of which, I had a thought this morning (in the shower of course ;)
about a slightly more intelligent tokeniser.
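Split on whitespace, then split any run of punctuation at the end of a
"word" off as a separate token. Punctuation inside a word is left alone,
as is a token made up entirely of punctuation. For example: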
a.b.c -> 'a.b.c' (main use: keeps file extensions with filenames)
A phrase. -> 'A', 'phrase', '.'
WTF??? -> 'WTF', '???'
>>> import module -> '>>>', 'import', 'module'
Might this be useful? No code of course ;)
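
For anyone who wants to experiment, here is a minimal sketch of the rule described above. It is not GBayes code: the name tokenize and the regular expression are hypothetical, and it assumes "punctuation" means any character that is neither whitespace nor a regex word character (so an underscore counts as part of a word).

```python
import re

# Assumption: "punctuation" = anything that is not whitespace and not a
# regex word character (letters, digits, underscore).
# Group 1: the shortest prefix that ends in a word character.
# Group 2: the run of punctuation that follows it, up to the end of the token.
_TRAILING_PUNCT = re.compile(r'^(.*?\w)([^\w\s]+)$')


def tokenize(text):
    """Split on whitespace, then split trailing punctuation off each word."""
    tokens = []
    for word in text.split():
        match = _TRAILING_PUNCT.match(word)
        if match:
            # e.g. 'WTF???' becomes 'WTF' and '???'
            tokens.extend(match.groups())
        else:
            # No trailing punctuation ('a.b.c'), or nothing but
            # punctuation ('>>>'): keep the token whole.
            tokens.append(word)
    return tokens


if __name__ == '__main__':
    for sample in ('a.b.c', 'A phrase.', 'WTF???', '>>> import module'):
        print(repr(sample), '->', tokenize(sample))
```

Leading punctuation (an opening quote or parenthesis, say) stays attached to the word in this sketch, since the proposal only talks about the end of words.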