String comparison question
Michael Spencer
mahs at telcopartners.com
Sun Mar 19 22:15:50 EST 2006
Olivier Langlois wrote:
> Hi Michael!
>
> Your suggestion is fantastic and is doing exactly what I was looking
> for! Thank you very much.
> There is something that I'm wondering though. Why is the solution you
> proposed wouldn't work with Unicode strings?
>
Simply that str.translate with two arguments (a table plus the characters to
delete) isn't implemented for unicode strings; unicode.translate accepts only a
single mapping argument. I don't know the underlying reason, or how hard it
would be to change. If you do need the comparison functionality for unicode
strings, you'll have to go with a different approach. For example, using
regular expressions:

    import re

    def compare2(a, b):
        """Compare two basestrings, disregarding whitespace -> bool"""
        # r"\s+" removes each run of whitespace in a single substitution
        return re.sub(r"\s+", "", a) == re.sub(r"\s+", "", b)

This is slower than the str.translate approach, though it has the advantage that
you could easily modify it to normalize, rather than eliminate whitespace. This
would be a more useful comparison in many cases.
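
    def compare3(a, b):
        """Compare two basestrings, normalizing whitespace -> bool"""
        # Use r"\s+", not r"\s*": the latter also matches the empty string
        # between characters, so a space would be inserted everywhere
        return re.sub(r"\s+", " ", a) == re.sub(r"\s+", " ", b)
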
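To make the difference between eliminating and normalizing whitespace concrete, here is a small self-contained check. It is only a sketch: the function names compare_eliminate and compare_normalize are made up for this illustration, and both use r"\s+" so that only actual runs of whitespace are replaced.

```python
import re

def compare_eliminate(a, b):
    # Hypothetical name: whitespace is removed entirely before comparing
    return re.sub(r"\s+", "", a) == re.sub(r"\s+", "", b)

def compare_normalize(a, b):
    # Hypothetical name: each run of whitespace becomes a single space
    return re.sub(r"\s+", " ", a) == re.sub(r"\s+", " ", b)

# The re module accepts unicode strings directly
left = u"foo  bar\tbaz"
right = u"foo bar baz"
joined = u"foobarbaz"

print(compare_eliminate(left, right))    # True
print(compare_eliminate(left, joined))   # True
print(compare_normalize(left, right))    # True
print(compare_normalize(left, joined))   # False
```

With normalization, "foo bar baz" and "foobarbaz" are no longer considered equal, which is usually what you want when word boundaries matter.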
Continuing the disclaimers: none of these approaches makes any attempt to deal
specially with quoted whitespace or any other sort of escapes.
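Coming back to the unicode limitation: the one-argument form of translate that unicode strings do support takes a mapping from character ordinals to replacements, and an ordinal mapped to None is deleted. A minimal sketch along those lines follows. The names DELETE_WS and compare_translate are invented here, and the table is an assumption that covers only the ASCII whitespace listed in string.whitespace. I have not timed it against the regular-expression version.

```python
import string

# Assumption: only ASCII whitespace from string.whitespace is deleted;
# other Unicode spaces (e.g. u"\u00a0") would have to be added to the table.
DELETE_WS = dict.fromkeys(map(ord, string.whitespace))  # ordinal -> None

def compare_translate(a, b):
    """Compare two unicode strings, disregarding ASCII whitespace -> bool"""
    # Ordinals mapped to None are dropped by the one-argument translate
    return a.translate(DELETE_WS) == b.translate(DELETE_WS)

print(compare_translate(u"a b\tc\n", u"abc"))   # True
print(compare_translate(u"a b", u"a c"))        # False
```

This only applies to unicode strings (str in Python 3). Byte strings in Python 2 still need the table-plus-deletechars form of str.translate.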
Michael