[Python-ideas] isascii()/islatin1()/isbmp()
Serhiy Storchaka
storchaka at gmail.com
Sat Jun 30 18:03:10 CEST 2012
As shown in issue #15016 [1], there are use cases where it is useful to
determine whether a string can be encoded in ASCII or Latin-1. When working
with Tk or Windows console applications, it can be useful to determine
whether a string can be encoded in UCS-2. The C API provides an interface
for this, but it is not available at the Python level.
I propose adding new methods to the str class: isascii(), islatin1() and
isbmp() (alongside existing methods such as isalpha() and isdigit()). The
implementation would be trivial.
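To make the three ranges concrete, here is a minimal pure-Python sketch of what the proposed checks would report. The helper names (max_code_point, is_ascii, is_latin1, is_bmp) are hypothetical and chosen only for illustration. This version scans the whole string, so it shows the intended semantics rather than how a built-in method would be implemented.

```python
def max_code_point(s):
    """Return the largest code point in s, or -1 for an empty string."""
    return max(map(ord, s), default=-1)


def is_ascii(s):
    # ASCII covers code points U+0000..U+007F.
    return max_code_point(s) < 0x80


def is_latin1(s):
    # Latin-1 covers code points U+0000..U+00FF.
    return max_code_point(s) < 0x100


def is_bmp(s):
    # The Basic Multilingual Plane covers code points U+0000..U+FFFF.
    return max_code_point(s) < 0x10000
```

An empty string passes all three checks in this sketch, since it contains no character outside any of the ranges.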
Pro: The current trick of trying to encode the string has O(n) complexity
and incurs the overhead of raising and catching an exception.
Contra: In most cases, after determining the character range we still need
to encode the string with the appropriate encoding. The new methods would
further complicate the already overloaded str class.
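For reference, the current trick mentioned under Pro looks roughly like the sketch below. The helper name can_encode is hypothetical; the sketch covers only the ASCII and Latin-1 cases, using the standard codecs of those names.

```python
def can_encode(s, encoding):
    """Return True if s can be encoded with the given codec."""
    try:
        # The encoded bytes are built and then discarded; only success
        # or failure matters here.
        s.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


print(can_encode("spam", "ascii"))
print(can_encode("caf\u00e9", "ascii"))
print(can_encode("caf\u00e9", "latin-1"))
```

Note that the encoded result is thrown away, which ties into the Contra point: the caller usually has to encode the string again afterwards.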
Objections?
[1] http://bugs.python.org/issue15016