Case-insensitive string equality
Tim Chase
python.list at tim.thechases.com
Thu Aug 31 11:29:29 EDT 2017
On 2017-08-31 07:10, Steven D'Aprano wrote:
> So I'd like to propose some additions to 3.7 or 3.8.
Adding my "yes, a case-insensitive equality-check would be useful"
with the following concerns:
I'd want to have an optional parameter to take locale into
consideration. E.g.

  "i".case_insensitive_equals("I") # depends on Locale
  "i".case_insensitive_equals("I", Locale("TR")) == False
  "i".case_insensitive_equals("I", Locale("US")) == True
and other oddities like

  "ß".case_insensitive_equals("SS") == True

(though casefold() takes care of that latter one). Then you get
things like

  "III".case_insensitive_equals("\N{ROMAN NUMERAL THREE}")
  "iii".case_insensitive_equals("\N{ROMAN NUMERAL THREE}")
  "FI".case_insensitive_equals("\N{LATIN SMALL LIGATURE FI}")

where the decomposition might need to be considered. There are just
a lot of odd edge-cases to consider when discussing fuzzy equality.
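
For what it's worth, here is a minimal sketch of how the decomposition
cases above could be approximated with what the stdlib offers today. It
assumes that alternating casefold() with NFKD normalization (along the
lines of the Unicode Standard's compatibility caseless matching) is an
acceptable notion of "equal". The free function
case_insensitive_equals() is a hypothetical stand-in for the proposed
method, not an existing API.

```python
import unicodedata


def compat_caseless_key(s):
    """Build a comparison key by alternating casefold() and NFKD.

    Hypothetical helper: one possible reading of "fuzzy equality",
    not the behaviour of any existing str method.
    """
    nfd = unicodedata.normalize("NFD", s)
    nfkd = unicodedata.normalize("NFKD", nfd.casefold())
    return unicodedata.normalize("NFKD", nfkd.casefold())


def case_insensitive_equals(a, b):
    """Stand-in for the proposed method; takes no locale into account."""
    return compat_caseless_key(a) == compat_caseless_key(b)
```

This is locale-independent, so it does nothing for the Turkish
dotted/dotless "i" case; that would still need locale data from
somewhere outside str.casefold().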
> (1) Add a new string method,
This is my preferred avenue.
> Alternatively: how about a === triple-equals operator to do the
> same thing?
No. A strong -1 for new operators. This peeves me in other
languages (looking at you, PHP & JavaScript).
> (2) Add keyword-only arguments to str.find and str.index:
>
> casefold=False
>
> which does nothing if false (the default), and switches to a
> case-insensitive search if true.
I'm okay with some means of conveying the insensitivity to
str.find/str.index but have no interest in list.find/list.index
growing similar functionality. I'm meh on the "casefold=False"
syntax, especially in light of my hope it would take a locale for the
comparisons.
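
One wrinkle any such search has to deal with: casefold() can change
the length of a string ("ß" becomes "ss"), so an index found in the
folded text need not line up with the original. A rough sketch of a
search that reports positions in the original string, assuming a
simple quadratic scan is acceptable; casefold_find() is a hypothetical
name, not a proposed signature:

```python
def casefold_find(haystack, needle):
    """Return the lowest index in haystack at which needle matches
    caselessly, or -1 if there is no match (mirroring str.find).

    Hypothetical helper: folds each suffix of the original string so the
    returned index refers to haystack itself, not to its folded form.
    """
    folded_needle = needle.casefold()
    for i in range(len(haystack) + 1):
        if haystack[i:].casefold().startswith(folded_needle):
            return i
    return -1
```

For example, casefold_find("ßa", "A") gives 1, whereas the naive
"ßa".casefold().find("a") gives 2.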
> Unsolved problems:
>
> This proposal doesn't help with sets and dicts, list.index and the
> `in` operator either.
I'd be less concerned about these. If you plan to index a set/dict
by the key, normalize it before you put it in. Or perhaps create a
CaseInsensitiveDict/CaseInsensitiveSet class. For lists and 'in'
operator usage, it's not too hard to make up a helper function based
on the newly-grown method:

  def case_insensitive_in(itr, target, locale=None):
      return any(
          target.case_insensitive_equals(x, locale)
          for x in itr
      )

  def case_insensitive_index(itr, target, locale=None):
      for i, x in enumerate(itr):
          if target.case_insensitive_equals(x, locale):
              return i
      raise ValueError("Could not find %s" % target)

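
As for the CaseInsensitiveDict idea mentioned above, a minimal sketch
might look like the following. It assumes string keys only, uses
str.casefold() as the normalization (no locale), and keeps the most
recently stored spelling of each key for iteration. None of this is
an existing stdlib class.

```python
from collections.abc import MutableMapping


class CaseInsensitiveDict(MutableMapping):
    """Mapping that normalizes string keys with casefold() on the way in."""

    def __init__(self, *args, **kwargs):
        # Maps folded key -> (key as last stored, value).
        self._store = {}
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self._store[key.casefold()] = (key, value)

    def __getitem__(self, key):
        return self._store[key.casefold()][1]

    def __delitem__(self, key):
        del self._store[key.casefold()]

    def __iter__(self):
        # Yield keys in their stored spelling, not the folded form.
        return (original for original, _ in self._store.values())

    def __len__(self):
        return len(self._store)
```

Because MutableMapping supplies __contains__, get(), update() and
friends on top of these five methods, the "in" operator on such a dict
is case-insensitive for free.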
-tkc