[Python-Dev] Normalizing unicode?
mwh at python.net
Thu Dec 11 05:34:37 EST 2003
Edward Loper <edloper at gradient.cis.upenn.edu> writes:
> Scott David Daniels wrote:
>> I naïvely wrote:
>> >Could we perhaps use a comparison that, in effect, did:
>> > def uni_equal(first, second):
>> >     if first == second:
>> >         return True
>> >     return first.normalize() == second.normalize()
>> >That is, take advantage of the fact that normalization is often
>> >unnecessary for "trivial" reasons.
> Before we start considering how it's possible to make
> unicode.__eq__ act encoding-insensitively, I think we need to
> consider whether that's really the behavior we want. In some ways,
> this seems like case-insensitive equality to me: it's certainly a
> useful operation, but I don't think it should be the object's builtin
> notion of equality.
> - I think people will be confused if s1==s2 but s1!=s2.
> - Sometimes you might *want* to distinguish different encodings of
> the "same" string; a "normalized" equality test makes that very
In general it seems to me that == should, given a choice, err on the
side of being an overly tight equivalence relation -- i.e. return True
81. In computing, turning the obvious into the useful is a living
definition of the word "frustration".
-- Alan Perlis, http://www.cs.yale.edu/homes/perlis-alan/quotes.html
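
For reference, the uni_equal idea quoted at the top of this message can be sketched as runnable code. This is only a minimal sketch, not the poster's implementation: it assumes Python 3 str objects, and it swaps the hypothetical first.normalize() method for the standard library's unicodedata.normalize(). The form parameter and its NFC default are assumptions introduced for illustration.

```python
import unicodedata


def uni_equal(first, second, form="NFC"):
    """Compare two strings, normalizing only when a plain comparison fails.

    `form` is an illustrative parameter (not in the original post); it is
    passed straight to unicodedata.normalize().
    """
    # Fast path: identical code point sequences normalize identically,
    # so no normalization is needed to know they are equal.
    if first == second:
        return True
    # Slow path: the sequences differ, so compare their normalized forms.
    return unicodedata.normalize(form, first) == unicodedata.normalize(form, second)


composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

print(composed == decomposed)           # False: different code points
print(uni_equal(composed, decomposed))  # True: same NFC form
print(len(composed), len(decomposed))   # 1 2
```

The last line shows that two strings that compare equal under normalization can still differ in length and in their individual code points, which is one way a normalizing == could surprise people.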