[Tutor] String comparison

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Thu, 8 Aug 2002 03:43:50 -0700 (PDT)


On Thu, 8 Aug 2002, Yigal Duppen wrote:

> Python has no command for comparing strings irrespective of case; the
> usual approach is to lowercase (or uppercase) the strings before
> comparing them:
>
> >>> a = "a"
> >>> b = "B"
> >>> a < b
> 0
> >>> a.lower() < b.lower()
> 1
> >>> a.upper() < b.upper()
> 1
> >>> a
> 'a'
> >>> b
> 'B'
>
> As you can see, both upper and lower return copies; they leave the
> original strings intact. Strings are immutable objects in Python.


By the way, in the Java language, there is an "equalsIgnoreCase()"
method, but it's probably not as clever as you might expect.  At least,
in the GNU GCJ Java implementation, here's what it looks like:

/******/
boolean
java::lang::String::equalsIgnoreCase (jstring anotherString)
{
  if (anotherString == NULL || count != anotherString->count)
    return false;
  jchar *tptr = JvGetStringChars (this);
  jchar *optr = JvGetStringChars (anotherString);
  jint i = count;
  while (--i >= 0)
    {
      jchar tch = *tptr++;
      jchar och = *optr++;
      if (tch != och
	  && (java::lang::Character::toLowerCase (tch)
	      != java::lang::Character::toLowerCase (och))
	  && (java::lang::Character::toUpperCase (tch)
	      != java::lang::Character::toUpperCase (och)))
	return false;
    }
  return true;
}
/******/

(We can take a look at:
http://subversions.gnu.org/cgi-bin/viewcvs/gcc/gcc/libjava/java/lang/natString.cc?rev=1.25.6.2&content-type=text/vnd.viewcvs-markup
for the complete source code.)

So GCJ's implementation of Java's String.equalsIgnoreCase() compares the
two strings letter by letter, calling toLowerCase() and toUpperCase() on
each pair of characters, rather than converting each whole string first
as we'd do in Python.  Hmmm... actually, I'm curious why they have to
compare both the lowercased and uppercased versions of each character,
though...
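
For comparison, here is a rough Python sketch of the same loop.  The name
equals_ignore_case is made up for illustration, and Python's str.lower()
and str.upper() are not identical to Java's Character.toLowerCase() and
Character.toUpperCase(), so treat this as an approximation of the logic
rather than a faithful port:

```python
def equals_ignore_case(s1, s2):
    """Hypothetical Python rendering of the GCJ loop quoted above."""
    # Strings of different length (or a missing string) can never match.
    if s2 is None or len(s1) != len(s2):
        return False
    for c1, c2 in zip(s1, s2):
        # A pair of characters counts as a mismatch only if they differ
        # as-is, differ when lowercased, AND differ when uppercased.
        if c1 != c2 and c1.lower() != c2.lower() and c1.upper() != c2.upper():
            return False
    return True
```

The early length check means the function can bail out before looking at
a single character, which the "uppercase everything first" approach can't
do.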


Sorry, I get sidetracked a lot.  *grin* Back to Python: we can always
write a function to make things look nicer:

###
def cmpIgnoresCase(s1, s2):
    """Returns a negative value if s1 is smaller than s2, zero if the two
    strings are equal, and a positive value if s1 is greater than s2,
    case insensitively."""
    return cmp(s1.upper(), s2.upper())
###
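
One caveat for newer interpreters: the built-in cmp() was removed in
Python 3.  Here is a minimal sketch of the same idea without it, assuming
Python 3 (cmp_ignores_case is an illustrative name, not a standard
function):

```python
def cmp_ignores_case(s1, s2):
    """Return a negative, zero, or positive integer, comparing the two
    strings case insensitively (same contract as the function above)."""
    a, b = s1.upper(), s2.upper()
    # Booleans subtract as integers, so this yields -1, 0, or 1.
    return (a > b) - (a < b)


# For sorting, a key function is usually simpler than a comparison function.
words = ["banana", "Apple", "cherry"]
sorted_words = sorted(words, key=str.upper)
```

With a plain sort, "Apple" would still land first here, but only because
capital letters happen to sort before lowercase ones; the key function
makes the ordering independent of case.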


Best of wishes!