Eric Smith wrote:
(I'm posting to python-dev, because this isn't strictly 3.0 related. Hopefully most people read it in addition to python-3000).
I'm working on backporting the changes I made for PEP 3101 (Advanced String Formatting) to the trunk, in order to meet the pre-PyCon release date for 2.6a1.
I have a few questions about how I should handle str/unicode. 3.0 was pretty easy, because everything was unicode.
1: How should the builtin format() work? It takes 2 parameters, an object o and a string s, and returns o.__format__(s). If s is None, it returns o.__format__(empty_string). In 3.0, the empty string is of course unicode. For 2.6, should I use u'' or ''?
I just re-read PEP 3101, and it doesn't mention this behavior with None. The way the code actually works is that the specifier is optional, and if it isn't present then it defaults to an empty string. This behavior isn't mentioned in the PEP, either.
This feature came from a request from Talin. We should either add this to the PEP (and docs), or remove it. If we document it, it should mention the 2.x behavior (as other places in the PEP do). Removing it would eliminate the one place in the backport that's not just hard, but ambiguous. I'd just as soon see the feature go away, myself.
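For reference, the optional-specifier behavior under discussion can be sketched in pure Python roughly as follows. This is only an illustration, not the actual C implementation; the name format_sketch is made up for this example, and the '' default is exactly the part that is ambiguous in 2.x (it could just as well be u'').

```python
def format_sketch(obj, spec=""):
    # Hypothetical stand-in for the builtin format(); the name is invented here.
    # The specifier is optional and defaults to an empty string. In 2.x the
    # open question is whether that default should be '' or u''.
    return obj.__format__(spec)


print(format_sketch(42))           # no specifier given: empty string is used
print(format_sketch(3.5, ".2f"))   # explicit specifier is passed through
```

Dropping the feature would amount to making spec a required argument, which removes the need to pick a default string type at all.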
3: Every overridden __format__() method is going to have to check for string or unicode, just like object.__format__() does, and return either a string or unicode object, appropriately. I don't see any way around this, but I'd like to hear any thoughts. I guess there aren't all that many __format__ methods that will be implemented, so this might not be a big burden. I'll of course implement the built-in ones.
The PEP actually mentions that this is how 2.x will have to work. So I'll go ahead and implement it that way, on the assumption that getting string support into 2.6 is desirable.
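To make the burden in point 3 concrete, here is a minimal sketch of what a user-defined __format__() might have to do under this scheme. The Celsius class and the text_type alias are hypothetical and exist only for this example; the ".1f" fallback for an empty specifier is likewise an arbitrary choice, not something the PEP requires.

```python
try:
    text_type = unicode  # 2.x: unicode is a distinct type from str
except NameError:
    text_type = str      # 3.0: every string is already unicode


class Celsius(object):
    """Hypothetical example class, not part of the PEP or the backport."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # Accept either string type, as object.__format__() has to.
        if not isinstance(format_spec, (str, text_type)):
            raise TypeError("format_spec must be str or unicode")
        # Delegate the numeric part to float formatting.
        result = format(self.degrees, format_spec or ".1f") + " C"
        # Return the same string type that the caller passed in.
        if isinstance(format_spec, text_type):
            return text_type(result)
        return str(result)


print(format(Celsius(21.5), ".2f"))
```

Under 3.0 the two branches collapse into one, which is why the check is only a 2.x concern.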