[Python-Dev] PEP 414 - Unicode Literals for Python 3

Vinay Sajip vinay_sajip at yahoo.co.uk
Tue Feb 28 14:30:48 CET 2012


 <martin <at> v.loewis.de> writes:

> 
> > A couple of people have said that 'native string' is spelt 'str', but I'm
> > not sure that's the right answer. For example, 2.x's cStringIO.StringIO
> > expects native strings, not Unicode:
> 
> Your counter-example is non-ASCII characters/bytes. I doubt that this is a
> valid use case; in a "native" string, these shouldn't occur (i.e. native
> strings should always be ASCII), since the semantics of non-ASCII changes
> drastically between 2.x and 3.x. So whoever defines some API to take
> "native" strings can't have defined a valid use of non-ASCII in that
> interface.

It might not be a valid usage, but the 2.x ecosystem has numerous occurrences of
invalid usages, which tend to crop up when porting because of 3.x's increased
strictness.

In the example I gave, cStringIO.StringIO should be able to cope with text
strings, but doesn't. Of course there are StringIO.StringIO and io.StringIO in
2.6, but when porting a project, you can't be sure which of these you might run
into.
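
To make the porting concern concrete, here is a minimal sketch of one way to
cope, assuming the project can standardise on io.StringIO (available from
2.6). The helper name make_text_buffer is hypothetical and not part of any
library; it just normalises input to text up front, so that a non-ASCII
"native" string fails at the boundary rather than somewhere inside the API:

```python
import io


def make_text_buffer(value, encoding="ascii"):
    """Hypothetical porting helper: always return an io.StringIO.

    Byte strings are decoded first (ASCII by default, an assumption that
    matches the "native strings should be ASCII" view), so non-ASCII
    bytes raise UnicodeDecodeError here instead of later on.
    """
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return io.StringIO(value)


# ASCII bytes are accepted and come back as text.
buf = make_text_buffer(b"abc")
text = buf.getvalue()

# Non-ASCII bytes are rejected at the boundary.
try:
    make_text_buffer(b"caf\xc3\xa9")
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

Whether strict ASCII is the right default depends on the API being ported;
passing a different encoding is the obvious escape hatch.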

> Indeed it should. If there is a known application of non-ASCII native strings,
> I surely would like to know what that is.

I can't think of a specific instance off-hand, but I seem to recall having
problems with some of the cookie APIs insisting on native strings (rather than
text, which is validated against ASCII where appropriate).

Regards,

Vinay Sajip


