Case-insensitive string equality
Steve D'Aprano
steve+python at pearwood.info
Fri Sep 1 09:22:30 EDT 2017
On Fri, 1 Sep 2017 09:53 am, MRAB wrote:
> What would you expect the result would be for:
>
> "\N{LATIN SMALL LIGATURE FI}".case_insensitive_find("F")
>
> "\N{LATIN SMALL LIGATURE FI}".case_insensitive_find("I")
That's easy.
-1 in both cases, since neither "F" nor "I" is found in the string. We can
prove this by manually checking:
py> for c in "\N{LATIN SMALL LIGATURE FI}":
...     print(c, 'F' in c, 'f' in c)
...     print(c, 'I' in c, 'i' in c)
...
ﬁ False False
ﬁ False False
If you want some other result, then you're not talking about case sensitivity.
If anyone wants to propose "normalisation-insensitive matching", I'll ask you to
please start your own thread rather than derailing this one with an unrelated,
and much more difficult, problem.
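For concreteness, here is a minimal sketch of the kind of method under
discussion, written as a free function. The name case_insensitive_find comes
from the quoted question; the implementation is only an assumption, not an
existing str method. It assumes both strings are compared after str.lower(),
which leaves the ligature alone. A different folding, such as str.casefold(),
applies full Unicode case folding and expands the ligature to "fi", so it would
give a different answer.

```python
def case_insensitive_find(haystack, needle):
    """Hypothetical helper: find needle in haystack, ignoring case.

    Assumption: both strings are compared after str.lower(), with no
    Unicode normalisation applied to either of them.
    """
    return haystack.lower().find(needle.lower())


ligature = "\N{LATIN SMALL LIGATURE FI}"
print(case_insensitive_find(ligature, "F"))            # -1
print(case_insensitive_find(ligature, "I"))            # -1
print(case_insensitive_find("Spam and EGGS", "eggs"))  # 9
```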
The proposal here is *case insensitive* matching, not Unicode normalisation. If
you want to decompose the strings, you know how to:
py> import unicodedata
py> unicodedata.normalize('NFKD', "\N{LATIN SMALL LIGATURE FI}")
'fi'
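If you do want the ligature to match, one option is to decompose both strings
first and then compare them caselessly. A rough sketch, assuming a hypothetical
helper called normalised_find (not part of the proposal); note that the index it
returns refers to the normalised string, not the original:

```python
import unicodedata


def normalised_find(haystack, needle):
    """Hypothetical helper: NFKD-decompose, case-fold, then search."""
    def fold(s):
        # Compatibility decomposition first, then caseless folding.
        return unicodedata.normalize('NFKD', s).casefold()
    return fold(haystack).find(fold(needle))


ligature = "\N{LATIN SMALL LIGATURE FI}"
print(normalised_find(ligature, "F"))  # 0
print(normalised_find(ligature, "I"))  # 1
```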
--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.