
Eric Smith wrote:
Guido van Rossum wrote:
For data types whose output uses only ASCII, would it be acceptable if they always returned an 8-bit string and left it up to the caller to convert it to Unicode? This would apply to all numeric types. (The date/time types have a strftime() style API which means the user must be able to specify Unicode.)
I'm finally getting around to finishing this up. The approach I've taken for int, long, and float is that they accept either unicode or str format specifiers and always return str results. The builtin format() then converts the str result to unicode if the format specifier was originally unicode. This all works great. It lets me easily implement both ''.format and u''.format taking int, long, and float parameters.
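To make the division of labour concrete, here is a rough sketch of the idea, not the actual patch. It is written as a Python 3 model in which bytes stands in for the 2.x str type and str stands in for 2.x unicode; the names format_number_8bit and format_dispatch are made up for illustration.

```python
def format_number_8bit(value, spec):
    # Hypothetical type-level formatter: accepts either kind of spec,
    # but always returns the 8-bit (bytes) result.
    if isinstance(spec, bytes):
        spec = spec.decode('ascii')
    return format(value, spec).encode('ascii')


def format_dispatch(value, spec):
    # Hypothetical stand-in for the builtin format(): widen the result
    # only when the caller's spec was the wide (unicode) type.
    result = format_number_8bit(value, spec)
    if isinstance(spec, str):
        return result.decode('ascii')
    return result


print(format_dispatch(42, b'05d'))   # 8-bit spec in, 8-bit result out
print(format_dispatch(3.5, '.2f'))   # wide spec in, wide result out
```

The point of the split is that the numeric types never need to know about the wide string type; only the single dispatching function does.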
I'm now working on datetime. The __format__ method is really just a wrapper around strftime. I was assuming (or rather hoping) that strftime does the right thing with unicode and str (unicode in = unicode out, str in = str out). But it turns out strftime doesn't accept unicode:
$ ./python
Python 2.6a0 (trunk:60845M, Feb 15 2008, 21:09:57)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> datetime.date.today().strftime('%y')
'08'
>>> datetime.date.today().strftime(u'%y')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: strftime() argument 1 must be str, not unicode
As part of this task, I'm really not up to the job of changing strftime to support both str and unicode inputs. So I think I'll put all of the __format__ code in place to support it if and when strftime supports unicode. In the meantime, it won't be possible for u''.format to work with datetime objects.
>>> 'year: {0:%y}'.format(datetime.date.today())
'year: 08'
>>> u'year: {0:%y}'.format(datetime.date.today())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: strftime() argument 1 must be str, not unicode
The bad error message, which names strftime even though the caller only used format, is a result of __format__ passing the unicode specifier straight on to strftime.
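For illustration, a __format__ that is just a thin wrapper around strftime might look roughly like the sketch below. The class name Stamp is hypothetical and this is not the real datetime code; it only shows why anything strftime rejects surfaces with strftime's own error message.

```python
import datetime


class Stamp(object):
    """Hypothetical wrapper class; not the real datetime implementation."""

    def __init__(self, date):
        self.date = date

    def __format__(self, spec):
        # An empty spec falls back to the plain str() form.
        if not spec:
            return str(self.date)
        # The spec is handed to strftime unchanged, so any exception
        # raised there propagates to the caller of format() as-is.
        return self.date.strftime(spec)


print('year: {0:%y}'.format(Stamp(datetime.date(2008, 2, 15))))  # year: 08
```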
There are, of course, various ugly ways to work around this involving nested format calls.
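One such workaround, as I read "nested format calls", would be something like the following. This is only a sketch under that assumption: the inner call uses a str template so strftime sees a str spec, and the outer unicode template carries no format spec for its argument.

```python
import datetime

d = datetime.date(2008, 2, 15)

# Inner call: str template, so the '%y' spec reaches strftime as str.
inner = '{0:%y}'.format(d)

# Outer call: unicode template with no spec for the argument, so
# nothing unicode is ever passed down to strftime.
outer = u'year: {0}'.format(inner)

print(outer)
```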
Maybe I'll extend strftime to unicode for the PyCon sprint.
Eric.