trying to strip out non ascii.. or rather convert non ascii
Tim Chase
python.list at tim.thechases.com
Mon Oct 28 10:23:41 EDT 2013
On 2013-10-28 07:01, wxjmfauth at gmail.com wrote:
>> Simply ignoring diacritics won't get you very far.
>
> Right. As an example, these four French words :
> cote, côte, coté, côté .
Distinct words with distinct meanings, sure.
But when a naïve (naive? ☺) person or one without the easy ability
to enter characters with diacritics searches for "cote", I want to
return possible matches containing any of your 4 examples. It's
slightly fuzzier if they search for "coté", in which case they may
mean "coté" or they might be unable to figure out how to
add a hat and want to type "côté". Either way, I'd rather return
more results, even if some of them only match fuzzily.
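
For what it's worth, here is a rough sketch of the kind of matching I
have in mind, not a definitive implementation. It assumes Python 3 and
leans on the stdlib unicodedata module; the names strip_diacritics and
fuzzy_find are made up for illustration. It also assumes that folding
a word means "decompose, then drop the combining marks", which handles
é and ô but does nothing for letters like "ø" or "ß" that have no
decomposition.

```python
import unicodedata


def strip_diacritics(text):
    """Decompose text (NFD) and drop combining marks, e.g. 'côté' -> 'cote'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fuzzy_find(query, candidates):
    """Return candidates whose folded form equals the folded query.

    Exact matches (compared in NFC form) sort first; the fuzzy ones
    keep their original relative order because sorted() is stable.
    """
    key = strip_diacritics(query).casefold()
    exact = unicodedata.normalize("NFC", query)
    hits = [c for c in candidates if strip_diacritics(c).casefold() == key]
    return sorted(hits, key=lambda c: unicodedata.normalize("NFC", c) != exact)


words = ["cote", "côte", "coté", "côté", "chat"]
print(fuzzy_find("cote", words))
print(fuzzy_find("coté", words))
```

Both searches return all four French words and never "chat"; the
search for "coté" just puts the exact spelling at the front of the
list, which is about as much ranking as I'd want for the fuzzy case.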
Circumflexually-circumspectly-yers,
-tkc