trying to strip out non ascii.. or rather convert non ascii
wxjmfauth at gmail.com
Thu Oct 31 06:33:15 EDT 2013
On Thursday, 31 October 2013 08:10:18 UTC+1, Steven D'Aprano wrote:
> On Wed, 30 Oct 2013 01:49:28 -0700, wxjmfauth wrote:
>
> >> The right solution to that is to treat it no differently from other
> >> fuzzy searches. A good search engine should be tolerant of spelling
> >> errors and alternative spellings for any letter, not just those with
> >> diacritics. Ideally, a good search engine would successfully match all
> >> three of "naïve", "naive" and "niave", and it shouldn't rely on special
> >> handling of diacritics.
> >
> > This is a non sense. The purpose of a diacritical mark is to make a
> > letter a different letter. If a tool is supposed to match an ô, there is
> > absolutely no reason to match something else.
>
> I'm glad that you know so much better than Google, Bing, Yahoo, and other
> search engines. When I search for "mispealled" Google gives me:
>
> Showing results for misspelled
> Search instead for mispealled
>
> But I see now that this is nonsense and there is *absolutely no reason*
> to match something other than the ecaxt wrods I typed.
>
> Perhaps you should submit a bug report to Google:
>
> "When I mistype a word, Google correctly gives me the search results I
> wanted, instead of the wrong results I didn't want."
>
> --
> Steven

As far as I know, I already acknowledged my mistake. I had text
processing systems in mind more than search engines.
I can even tell you I am really stupid: I wrote pure
Unicode software to sort French or German strings.
Pure Unicode == independent of any locale.
jmf
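
For readers following the thread, here is a minimal sketch of the two ideas
discussed above: folding diacritics so that "naïve" and "naive" compare
equal, and sorting strings without consulting any locale. It is not the
software mentioned in the post. The names strip_diacritics, sort_key and
is_close, and the 0.75 threshold, are made up for this illustration. It
uses only the standard unicodedata and difflib modules, and it is much
cruder than the Unicode Collation Algorithm, which is what real
language-aware sorting would normally build on.

```python
import difflib
import unicodedata


def strip_diacritics(text):
    """Decompose to NFD, then drop combining marks (e.g. the diaeresis on i)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_key(text):
    """Locale-independent key: base letters first, original text as tie-breaker."""
    folded = strip_diacritics(text).casefold()
    return (folded, unicodedata.normalize("NFC", text))


def is_close(query, candidate, threshold=0.75):
    """Fuzzy comparison on folded text; threshold is an arbitrary assumption."""
    a = strip_diacritics(query).casefold()
    b = strip_diacritics(candidate).casefold()
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold


words = ["côte", "zèbre", "cote", "Äpfel", "apfel", "été"]

# Plain sorted() compares code points, so accented letters land after "z".
print(sorted(words))

# With the key, accented words sort next to their unaccented neighbours.
print(sorted(words, key=sort_key))

# Both the accented spelling and the transposed one match "naive".
print(is_close("naïve", "naive"), is_close("niave", "naive"))
```

The fold only removes combining marks, so letters that do not decompose
(for example "ß" or "ø") pass through unchanged and would need separate
handling.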