[Python-ideas] isascii()/islatin1()/isbmp()

Mon Jul 2 10:38:19 CEST 2012

On Sun, Jul 01, 2012 at 01:48:14AM -0400, Terry Reedy wrote:

> >As far as expressibility goes, it is not much of an advantage. But:
> >
> >- if there are optimizations that apply to some encodings but not others,
> >   the encodable method can take advantage of them without it being a
> >   promise of the language;
> 
> It would be an optimization limited to a couple of encodings with 
> CPython. Using it for cross-version code would be something like the 
> trap of depending on the CPython optimization of repeated string 
> concatenation.

I'd hardly call it a trap. It's not like string concatenation, which is 
expected to be O(N) on CPython but occasionally falls back to 
O(N**2). It would be O(N) expected on all platforms, but occasionally 
does better.

Perhaps an anti-trap -- sometimes it does better than expected, rather 
than worse.

> >- it only adds a single string method (and presumably a single bytes
> >   method, decodable) rather than a plethora of methods;
> 
> Decodable would always require a scan of the bytes. Might as well just 
> decode and look for UnicodeDecodeError.

*shrug* Perhaps so. bytes.decodable() would only be an LBYL ("look 
before you leap") convenience method.
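
To make that concrete, here is a rough sketch of what such a convenience 
might amount to. The name decodable and its signature are hypothetical 
(there is no such bytes method); it just wraps the decode-and-catch idiom 
Terry describes, so it always scans the bytes.

```python
def decodable(b, encoding="utf-8"):
    """Return True if the bytes object b decodes cleanly with encoding.

    Hypothetical helper, not an existing bytes method. It performs the
    full decode and throws the result away, so it is always O(N).
    """
    try:
        b.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True
```

If the caller wants the decoded text anyway, calling b.decode() directly 
and catching UnicodeDecodeError avoids doing the work twice.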

> >So, I don't care much either way for a LBYL test, but if there is a good
> >use case for such a test,
> 
> My claim is that there is only a good use case if it is O(1), which 
> would only be a few cases on CPython.

*shrug* Again, I'm not exactly championing this proposal. I can see that 
an encodable method would be useful, but not that much more useful than 
trying to encode and catching the exception. A naive O(N) version of 
encodable() is trivial to implement.
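
For illustration, a naive version might look something like the sketch 
below. It is written as a plain function because str has no encodable 
method; the name and the default encoding are my assumptions. It does no 
better than O(N), since it performs the whole encode and discards the 
result.

```python
def encodable(s, encoding="ascii"):
    """Return True if the str s can be encoded with the given codec.

    Hypothetical helper, not an existing str method. Naive O(N): it
    simply attempts the encode and reports whether it succeeded.
    """
    try:
        s.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True
```

With this, encodable("spam") is true, while encodable("caf\xe9") is false 
for ASCII but true when called with "latin-1".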

-- 
Steven