trying to strip out non ascii.. or rather convert non ascii

Tim Chase python.list at tim.thechases.com
Mon Oct 28 10:23:41 EDT 2013


On 2013-10-28 07:01, wxjmfauth at gmail.com wrote:
>> Simply ignoring diacritics won't get you very far.
> 
> Right. As an example, these four French words :
> cote, côte, coté, côté .

Distinct words with distinct meanings, sure.

But when a naïve (naive? ☺) person, or one without an easy way to
enter characters with diacritics, searches for "cote", I want to
return possible matches containing any of your 4 examples.  It's
slightly fuzzier if they search for "coté", in which case they may
mean "coté", or they might be unable to figure out how to add a
hat and actually want "côté".  Either way, I'd rather get more
results, even if some of them only match fuzzily.
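
A minimal sketch of that kind of matching, assuming Python 3 and
nothing beyond the stdlib unicodedata module.  The function names
(strip_diacritics, fuzzy_matches) are made up for illustration, and
stripping combining marks is only one possible way to fold accents,
so treat it as a starting point rather than a complete solution:

```python
import unicodedata

def strip_diacritics(text):
    """Return text with combining marks removed (é -> e, ô -> o)."""
    # NFD splits each accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def fuzzy_matches(query, candidates):
    """Return candidates that equal query once diacritics and case are
    ignored, with exact matches listed first (hypothetical helper)."""
    key = strip_diacritics(query).casefold()
    hits = [c for c in candidates if strip_diacritics(c).casefold() == key]
    exact = unicodedata.normalize("NFC", query)
    # sorted() is stable, so fuzzy hits keep their original order.
    return sorted(hits, key=lambda c: unicodedata.normalize("NFC", c) != exact)

words = ["cote", "côte", "coté", "côté", "chat"]
print(fuzzy_matches("cote", words))  # all four variants, "cote" first
print(fuzzy_matches("coté", words))  # all four variants, "coté" first
```

Searching for the folded form returns every variant, while the sort
keeps whatever the user actually typed at the top of the list.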

Circumflexually-circumspectly-yers,

-tkc




