On Sat, Jun 30, 2012 at 12:03 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
As shown in issue #15016 [1], there are use cases where it is useful to determine whether a string can be encoded in ASCII or Latin1. When working with Tk or Windows console applications, it can be useful to determine whether a string can be encoded in UCS2. The C API provides an interface for this, but it is not available at the Python level.
I propose adding new methods to the string class: isascii(), islatin1() and isbmp() (in addition to such methods as isalpha() or isdigit()). The implementation will be trivial.
Pro: The current trick of trying to encode has O(n) complexity and incurs the overhead of raising and catching an exception.
Contra: In most cases, after determining the character range, we still need to encode the string with the appropriate encoding. New methods will complicate the already overloaded string class.
Objections?
-1. It doesn't make sense to special-case them instead of adding a single, simpler canencode() method. It could save memory, but I don't see it saving time.
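To make the comparison concrete, here is a rough sketch of what a canencode() could look like today when written on top of the encode trick. It is only an illustration: canencode() is a hypothetical helper written as a plain function, not an existing str method, and the name and signature are assumptions.

```python
def canencode(text, encoding):
    """Return True if `text` can be encoded with `encoding`.

    Hypothetical helper: this is the "try to encode" trick wrapped in a
    function, not an existing str method.
    """
    try:
        # The encoded bytes are built and then thrown away.
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


print(canencode("spam", "ascii"))
print(canencode("caf\u00e9", "ascii"))
print(canencode("caf\u00e9", "latin-1"))
```

Note that this version still allocates the encoded bytes just to discard them, which is the memory cost a built-in check could avoid; the scan over the string is O(n) either way.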
[1] http://bugs.python.org/issue15016
_______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
-- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy