On 01/17/2014 05:27 PM, Steven D'Aprano wrote:
On Fri, Jan 17, 2014 at 08:49:21AM -0800, Ethan Furman wrote:
Overriding Principles
=====================
In order to avoid the problems of auto-conversion and Unicode exceptions that could plague Py2 code, all object checking will be done by duck-typing, not by values contained in a Unicode representation [3]_.
I don't understand this paragraph. What does "values contained in a Unicode representation" mean?
Yeah, that is clunky. I'm trying to convey the idea that we don't want errors based on content, i.e. which characters happen to be in a str.
[...]
%s is restricted in what it will accept::
- input type supports Py_buffer? use it to collect the necessary bytes
Can you give some examples of what types support Py_buffer? Presumably bytes. Anything else?
Anybody? Otherwise I'll go spelunking in the code.
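One quick way to probe this from Python, offered as a sketch rather than an authoritative list: memoryview() only accepts objects that export the buffer interface, so a TypeError from it means the type has no Py_buffer support. The helper name supports_buffer is made up for illustration.

```python
import array


def supports_buffer(obj):
    """Return True if obj exports the buffer interface (Py_buffer).

    Hypothetical helper: it relies on memoryview() raising TypeError
    for objects that are not buffer exporters.
    """
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True


candidates = (
    b"abc",
    bytearray(b"abc"),
    memoryview(b"abc"),
    array.array("b", [1, 2, 3]),
    "abc",
    42,
)
for candidate in candidates:
    print(type(candidate).__name__, supports_buffer(candidate))
```

On this check, bytes, bytearray, memoryview, and array.array qualify, while str and int do not.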
- input type is something else? use its __bytes__ method; if there isn't one, raise a TypeError
I think you should explicitly state that this is a new special method, and state which built-in types will grow a __bytes__ method (if any).
It's not new. I know bytes, str, and numbers /do not/ have __bytes__.
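For illustration, here is a minimal sketch of the __bytes__ fallback. It assumes an interpreter where the bytes interpolation proposed here is available (Python 3.5 or later); the Header class and its fields are hypothetical.

```python
class Header:
    """Hypothetical type with no buffer support, only __bytes__."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __bytes__(self):
        # Called by bytes(obj), and by %s interpolation into bytes.
        return self.name + b": " + self.value


header = Header(b"Host", b"example.org")
print(bytes(header))
print(b"%s\r\n" % header)

# str has neither buffer support nor __bytes__, so it is rejected.
try:
    b"%s" % "text"
except TypeError as exc:
    print("TypeError:", exc)
```

The str case fails on type alone, regardless of which characters the string holds, which is the duck-typing principle stated at the top.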
Numeric Format Codes
--------------------
To properly handle int and float subclasses, int(), index(), and float() will be called on the objects intended for (d, i, u), (b, o, x, X), and (e, E, f, F, g, G).
-1 on this idea.
This is a rather large violation of the principle of least surprise, and radically different from the behaviour of Python 3 str. In Python 3, '%d' interpolation calls the __str__ method, so if you subclass, you can get the behaviour you want:
Did you read the bug reports I linked to? This behavior (which is a bug) has already been fixed for Python 3.4. As a quick thought experiment, why does "%d" % True return "1"?
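A small sketch of the behaviour under discussion, assuming Python 3.5 or later (str formatting with the 3.4 fix, plus bytes interpolation). The Loud class is a made-up int subclass that overrides __str__.

```python
class Loud(int):
    """Hypothetical int subclass with a custom __str__."""

    def __str__(self):
        return "LOUD"


value = Loud(3)

# %s asks for the string form, so the override is used.
print("%s" % value)

# %d is a numeric code, so the integer value is used instead.
print("%d" % value)

# bool is an int subclass too, which is why this gives "1" and not "True".
print("%d" % True)

# The same numeric rule applied to bytes interpolation.
print(b"%d" % value)
```

If %d deferred to __str__, "%d" % True would have to produce "True", which nobody expects from a numeric format code.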
Unsupported codes
-----------------
%r (which calls __repr__), and %a (which calls ascii() on __repr__) are not supported.
+1 on not supporting b'%r' (i.e. I agree with the PEP).
Why not support b'%a'? That seems to be a strange thing to prohibit.
I'll admit to being somewhat on the fence about %a.

It seems there are two possibilities with %a:

  1) have it be ascii(repr(obj))

  2) have it be str(obj).encode('ascii', 'strict')

(1) seems only useful for debugging, but even then not very: if you switch from %s to %a you'll no longer see the bytes output (although you would get the name of the object, which could be handy).

(2) is (slightly) blurring the lines between text and encoded-ascii; I would rather see "%s" % text.encode('ascii', 'strict').

So we have two possibilities, both can be useful, and I don't know which is most useful or even most logical. So I guess I'm still open to arguments. :)
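To make the two candidates concrete, here is a rough sketch of each as a plain function. Both function names are hypothetical, and option (1) is read here as ascii(obj), i.e. repr() with non-ASCII characters escaped, then encoded so the result is bytes; that reading is an assumption, not something settled in this thread.

```python
def percent_a_as_repr(obj):
    """Option (1), as read above: escaped repr, encoded to bytes."""
    return ascii(obj).encode("ascii")


def percent_a_as_strict_text(obj):
    """Option (2): the str() form, which must already be pure ASCII."""
    return str(obj).encode("ascii", "strict")


text = "caf\xe9"

# Option (1) never fails on content; non-ASCII comes out escaped.
print(percent_a_as_repr(text))

# Option (2) works for ASCII-only text...
print(percent_a_as_strict_text("cafe"))

# ...but raises depending on which characters are in the str.
try:
    percent_a_as_strict_text(text)
except UnicodeEncodeError as exc:
    print("UnicodeEncodeError:", exc.reason)
```

Option (2) fails or succeeds based on content, which is the kind of error the Overriding Principles section is trying to rule out.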
Everything else, well done and thank you.
You're welcome! Thank you to everyone who participated.

--
~Ethan~