[Python-ideas] Adding str.isascii() ?

Steven D'Aprano steve at pearwood.info
Tue Jan 30 03:08:53 EST 2018


On Mon, Jan 29, 2018 at 12:54:41PM -0800, Chris Barker wrote:

> I'm confused -- isn't the way to do this to encode your text into the
> encoding the other application accepts ?

It's more about warning the user of *my* application that the data 
they're exporting could generate mojibake, or even fail, in the other 
application.
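
As a rough sketch of that kind of warning, assuming the other application
accepts only a known encoding (ASCII by default here), one could list the
characters that would not survive the export. The helper name
`export_warnings` and its parameters are hypothetical, not part of any
library, and this only checks encodability; it cannot tell whether the other
application will actually interpret the bytes correctly.

```python
def export_warnings(text, target_encoding="ascii"):
    """Return (index, char) pairs that cannot be encoded in target_encoding.

    Hypothetical helper: target_encoding is an assumption about what the
    receiving application accepts.
    """
    problems = []
    for i, ch in enumerate(text):
        try:
            ch.encode(target_encoding)
        except UnicodeEncodeError:
            # This character would be lost or mangled on export.
            problems.append((i, ch))
    return problems


if __name__ == "__main__":
    for index, ch in export_warnings("naïve"):
        print("warning: %r at position %d is not ASCII" % (ch, index))
```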


> if you really want to know in advance, is it so hard to run it through an
> encode/decode sandwich?

See Nick's answer.


> Wait -- I can't find UCS-2 in the built-in encodings -- am I dense or is it
> not there? Shouldn't it be? If only for this reason?

Strictly speaking, UCS-2 is an obsolete standard more or less equivalent 
to UTF-16, except it doesn't support "astral characters" (supplementary 
code points), which UTF-16 encodes as a surrogate pair.

However, in practice, some languages' nominal UTF-16 handling is less 
than 100% conformant, in that they treat a surrogate pair as two 
undefined characters of one code unit each, instead of a single defined 
character of two code units.
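
To make the surrogate-pair point concrete, here is a small illustration,
assuming Python 3.3 or later (where a str is a sequence of code points). The
sample character U+1F600 is chosen arbitrarily; any code point above U+FFFF
behaves the same way.

```python
import struct

ch = "\U0001F600"  # one astral character, outside the Basic Multilingual Plane

# Python sees a single code point.
code_points = len(ch)

# UTF-16 needs two 16-bit code units (a surrogate pair) for it.
data = ch.encode("utf-16-le")
units = struct.unpack("<%dH" % (len(data) // 2), data)

print(code_points)                 # number of code points
print([hex(u) for u in units])     # the high and low surrogates
```

Software that counts 16-bit units rather than code points would report this
one character as two.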

So I guess I'm using UCS-2 in an informal sense of "like UTF-16, without 
the astral characters". I'm not asking for an explicit UCS-2 codec.
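
In that informal sense, a check for "safe for UCS-2" could be sketched as
below. The name `fits_in_ucs2` is hypothetical, and the test is only an
approximation: it assumes that "no code point above U+FFFF" is all the other
application needs.

```python
def fits_in_ucs2(text):
    """Return True if every code point is in the BMP (U+0000..U+FFFF).

    Hypothetical helper. Note that lone surrogates (U+D800..U+DFFF) are
    inside this range, so they are not rejected here.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    for sample in ("naïve café", "smile \U0001F600"):
        print(repr(sample), fits_in_ucs2(sample))
```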


-- 
Steve

