Sort out formatting differences in decimal and float
Hi, I'm in the process of implementing formatting in my C-decimal module. Since testing is quite time consuming, I would appreciate a binding decision on how the following should be formatted:
>>> from decimal import *
>>> format(float(1234), '020,g')
'0,000,000,000,001,234'
>>> format(float(1234), '0=20,g')
'0,000,000,000,001,234'

>>> format(Decimal(1234), '020,g')
'0,000,000,000,001,234'
>>> format(Decimal(1234), '0=20,g')
'0000000000000001,234'

>>> format(Decimal('nan'), '020,g')
'                 NaN'
>>> format(Decimal('nan'), '0=20,g')
'00000000000000000NaN'
You can see that float literally follows PEP-3101:

"If the width field is preceded by a zero ('0') character, this enables zero-padding. This is equivalent to an alignment type of '=' and a fill character of '0'."

The advantage of decimal is that the user has the option to suppress commas. The behaviour of float is slightly easier to implement in C.

So the options are:

a) 020 is always equivalent to 0=20
b) 020 is not always equivalent to 0=20

Stefan Krah
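To see which of the two options a given interpreter implements, the two spellings can be compared side by side. This is only a minimal sketch: the helper name compare_zero_padding is made up for illustration, and the results depend on the Python version, so no expected output is shown.

```python
from decimal import Decimal


def compare_zero_padding(value, width=20, tail=",g"):
    """Format value with the '0' flag and with an explicit '0=' fill.

    Hypothetical helper, not part of any library.
    """
    flag_spec = f"0{width}{tail}"       # e.g. '020,g'
    explicit_spec = f"0={width}{tail}"  # e.g. '0=20,g'
    return format(value, flag_spec), format(value, explicit_spec)


for value in (float(1234), Decimal(1234)):
    with_flag, with_fill = compare_zero_padding(value)
    # The last column says whether this type treats the two specs alike.
    print(type(value).__name__, repr(with_flag), repr(with_fill),
          with_flag == with_fill)
```

If the last column is True for both types, the interpreter implements option a) for this input.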
On Sat, Dec 5, 2009 at 11:18 AM, Stefan Krah wrote:
> Hi,
>
> I'm in the process of implementing formatting in my C-decimal module. Since testing is quite time consuming, I would appreciate a binding decision on how the following should be formatted:
>
> >>> from decimal import *
> >>> format(float(1234), '020,g')
> '0,000,000,000,001,234'
> >>> format(float(1234), '0=20,g')
> '0,000,000,000,001,234'
>
> >>> format(Decimal(1234), '020,g')
> '0,000,000,000,001,234'
> >>> format(Decimal(1234), '0=20,g')
> '0000000000000001,234'
I like the Decimal behaviour better, but then I might be a little biased. :-)

I find it odd that, for float formatting, the choice of fill character affects whether commas are inserted:

Python 3.2a0 (py3k:76671, Dec 4 2009, 18:55:54)
[GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> format(float(1234), '0=20,g')
'0,000,000,000,001,234'
>>> format(float(1234), '1=20,g')
'1111111111111111,234'
>>> format(float(1234), 'X=20,g')
'XXXXXXXXXXXXXXX1,234'
Mark
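The observation above can be checked mechanically for any fill character. The following is a rough sketch that assumes a positive value, so that no sign sits in front of the padding; padding_has_separators is a name invented here, not an existing function.

```python
def padding_has_separators(value, fill, width=20):
    """Return the padded string and whether its padding contains commas.

    Hypothetical helper; assumes value is positive.
    """
    result = format(value, f"{fill}={width},g")
    digits = format(value, ",g")                 # unpadded, e.g. '1,234'
    padding = result[: len(result) - len(digits)]
    return result, "," in padding


for fill in "01X":
    result, grouped = padding_has_separators(float(1234), fill)
    print(repr(fill), repr(result), grouped)
```

A fill character for which the second value is True is being treated as part of the number rather than as plain padding.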
Mark Dickinson wrote:
> I find it odd that, for float formatting, the choice of fill character affects whether commas are inserted:
>
> Python 3.2a0 (py3k:76671, Dec 4 2009, 18:55:54)
> [GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> format(float(1234), '0=20,g')
> '0,000,000,000,001,234'
> >>> format(float(1234), '1=20,g')
> '1111111111111111,234'
> >>> format(float(1234), 'X=20,g')
> 'XXXXXXXXXXXXXXX1,234'
I haven't spent a lot of time looking at it, but this might be related to http://bugs.python.org/issue6902. I'll move this up in my queue.

Eric.
Sorry for being a curmudgeon, however...

    >>> format(Decimal(1234), '020,g')
    '0,000,000,000,001,234'
    >>> format(Decimal(1234), '0=20,g')
    '0000000000000001,234'

Why in the world would you ever want to insert commas as separators and not use them consistently?

    >>> format(Decimal('nan'), '020,g')
    '                 NaN'
    >>> format(Decimal('nan'), '0=20,g')
    '00000000000000000NaN'

Why in the world would you ever want to zero pad NaN (or Inf, for that matter)?

    Stefan> The advantage of decimal is that the user has the option to
    Stefan> suppress commas. The behaviour of float is slightly easier to
    Stefan> implement in C.

Why? If the user asked for them why would you want to suppress (some of) them?

Skip
> Sorry for being a curmudgeon, however...
>
> >>> format(Decimal(1234), '020,g')
> '0,000,000,000,001,234'
> >>> format(Decimal(1234), '0=20,g')
> '0000000000000001,234'
>
> Why in the world would you ever want to insert commas as separators and not use them consistently?
>
> >>> format(Decimal('nan'), '020,g')
> '                 NaN'
> >>> format(Decimal('nan'), '0=20,g')
> '00000000000000000NaN'
>
> Why in the world would you ever want to zero pad NaN (or Inf, for that matter)?

Because you didn't know in advance that the number ending up in your format call was a nan (or inf)?

Cheers,
Daniel

> Stefan> The advantage of decimal is that the user has the option to
> Stefan> suppress commas. The behaviour of float is slightly easier to
> Stefan> implement in C.
>
> Why? If the user asked for them why would you want to suppress (some of) them?
-- Psss, psss, put it down! - http://www.cafepress.com/putitdown
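Daniel's point is that the caller often cannot know whether a NaN or infinity will reach the format call. One way a caller could sidestep the question, sketched here under the assumption that space padding is what is wanted for special values, is to choose the spec after inspecting the value. The function name format_grouped is hypothetical.

```python
import math
from decimal import Decimal


def format_grouped(value, width=20):
    """Zero-pad and group finite numbers; right-align NaN and infinities.

    Hypothetical helper; the choice of space padding for special
    values is an assumption, not something the thread decided.
    """
    if isinstance(value, Decimal):
        finite = value.is_finite()
    else:
        finite = math.isfinite(value)
    spec = f"0{width},g" if finite else f">{width}"
    return format(value, spec)


for value in (Decimal(1234), Decimal("nan"), float("inf")):
    print(repr(format_grouped(value)))
```

This keeps the decision in user code and does not depend on how the type itself treats zero padding of special values.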
On Sat, Dec 5, 2009 at 1:18 PM,
> >>> format(Decimal(1234), '020,g')
> '0,000,000,000,001,234'
> >>> format(Decimal(1234), '0=20,g')
> '0000000000000001,234'
>
> Why in the world would you ever want to insert commas as separators and not use them consistently?

So should commas be inserted for any fill character at all? Or should only '0' be special-cased? What about other digits used as fill characters? What about other non-ASCII zeros? Should those be special-cased too?

I'm reluctant to add extra special cases and complication to what's already quite a complicated specification; it seems better to keep it simple (well, as simple as possible) and orthogonal.

There's already a good way to ask for zero padding, by using the leading zero, as in '020,g'. Why would you use '0=20,g' instead? I'm not sure that the 'X=...' notation was intended to be used for zero padding.

Mark
    Mark> So should commas be inserted for any fill character at all?

While I'm at it, it makes no sense to me to pad numbers with anything other than whitespace or zero. (BTW: I have never liked the new format() stuff, so I will be sticking with %-formatting as long as it exists in Python. My apologies if I don't understand some amazing generality about format(), but if you do dumb stuff like ask for comma separation of a number then ask to pad it with '*' you get what you deserve.)

    Mark> There's already a good way to ask for zero padding, by using the
    Mark> leading zero, as in '020,g'. Why would you use '0=20,g' instead?

Note to the implementers: '0=20,g' has no mnemonic significance as far as I can tell. I thought it was my mail program failing to properly decode a bit of quoted printable text.

Skip
skip@pobox.com wrote:
> My apologies if I don't understand some amazing generality about format()

>>> format("Header", '=^20s')
'=======Header======='

>>> from datetime import datetime
>>> "Format a single object {0!r}, multiple times in a single string. Year: {0.year}; Month: {0.month}; Day: {0.day}; Formatted: {0:%Y-%m-%d}".format(datetime.now())
'Format a single object datetime.datetime(2009, 12, 6, 9, 16, 0, 875018), multiple times in a single string. Year: 2009; Month: 12; Day: 6; Formatted: 2009-12-06'

>>> "Use keyword arguments easily: {x}, {y}, {z}".format(x=1, y=2, z=3)
'Use keyword arguments easily: 1, 2, 3'

For the things that mod formatting already allows you to do, our aim is to get format() functionality at least on par with what mod formatting supports (it should be most of the way there with the number formatting cleanups for 2.7/3.2). For the rest of the features (explicit position references, centre alignment, arbitrary fill characters, attribute and subscript references, type defined formatting control), mod formatting isn't even in the game.

Getting rid of the magic behaviour associated with the use of tuples on the right hand side is also valuable:

>>> "%s" % (1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not all arguments converted during string formatting
>>> "{}".format((1, 2))
'(1, 2)'

Developers that are already used to the limitations of mod formatting are expected to take some time to decide if they care about the extra features offered by the format method, but for new developers it should be an easy choice.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
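Two of the features listed above, subscript references and type defined formatting control, have no example in the message. A small sketch of both follows; the Celsius class and its 'F' suffix convention are invented purely for illustration.

```python
class Celsius:
    """Toy temperature type with its own format spec handling (hypothetical)."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        # A trailing 'F' in the spec means "show as Fahrenheit"; the rest
        # of the spec is passed on to the ordinary float formatter.
        if spec.endswith("F"):
            fahrenheit = self.degrees * 9 / 5 + 32
            return format(fahrenheit, spec[:-1] + "f") + " F"
        return format(self.degrees, spec or "g") + " C"


point = {"x": 1, "y": 2}
print("{0[x]},{0[y]}".format(point))    # subscript references
print("{:.1F}".format(Celsius(100)))    # type defined formatting
print("{}".format(Celsius(100)))        # empty spec falls back to 'g'
```

The hook used here is __format__, which str.format() and format() call with whatever follows the colon in the replacement field.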
Nick Coghlan wrote:
> skip@pobox.com wrote:
>> My apologies if I don't understand some amazing generality about format()
>
> >>> format("Header", '=^20s')
> '=======Header======='
>
> >>> "Format a single object {0!r}, multiple times in a single string. Year: {0.year}; Month: {0.month}; Day: {0.day}; Formatted: {0:%Y-%m-%d}".format(datetime.now())
> 'Format a single object datetime.datetime(2009, 12, 6, 9, 16, 0, 875018), multiple times in a single string. Year: 2009; Month: 12; Day: 6; Formatted: 2009-12-06'
>
> >>> "Use keyword arguments easily: {x}, {y}, {z}".format(x=1, y=2, z=3)
> 'Use keyword arguments easily: 1, 2, 3'
>
> For the things that mod formatting already allows you to do, our aim is to get format() functionality at least on par with what mod formatting supports (it should be most of the way there with the number formatting cleanups for 2.7/3.2). For the rest of the features (explicit position references, centre alignment, arbitrary fill characters, attribute and subscript references, type defined formatting control), mod formatting isn't even in the game.
>
> Getting rid of the magic behaviour associated with the use of tuples on the right hand side is also valuable:
>
> >>> "%s" % (1, 2)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: not all arguments converted during string formatting
> >>> "{}".format((1, 2))
> '(1, 2)'
A nice demonstration of what an excellent piece of work the new .format is or is becoming. I would still like it to be a goal for 3.2 that all stdlib modules that work with formats accept the new formats and not just % formats.

Mark Summerfield only covered .format in his book on Python 3 programming and I hated to tell him that there was at least one module in the stdlib that currently (3.1) requires the old style.

Terry Jan Reedy
Terry Reedy wrote:
> A nice demonstration of what an excellent piece of work the new .format is or is becoming. I would still like it to be a goal for 3.2 that all stdlib modules that work with formats accept the new formats and not just % formats.
>
> Mark Summerfield only covered .format in his book on Python 3 programming and I hated to tell him that there was at least one module in the stdlib that currently (3.1) requires the old style.

Yes, we do need to do that. It would be nice if we could come up with a cleaner solution than a proliferation of parallel APIs everywhere though :(

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
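The thread does not name the module Terry has in mind. As one illustration of accepting both styles without a parallel API, logging.Formatter takes a style argument (available from Python 3.2 onwards) that selects how the format string is interpreted. Whether logging is the module meant above is an assumption; the logger name, path and message below are placeholders.

```python
import logging

# style="{" selects str.format-style fields; the default style="%"
# keeps the old mod-formatting behaviour for existing callers.
formatter = logging.Formatter("{levelname}: {message}", style="{")

# Build a record by hand so the example needs no handler setup.
record = logging.LogRecord(
    name="demo", level=logging.WARNING, pathname="example.py",
    lineno=1, msg="disk almost full", args=None, exc_info=None,
)
formatted = formatter.format(record)
print(formatted)
```

A single keyword argument of this kind lets old and new format strings share one entry point.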
participants (7)
- Daniel Fetchinson
- Eric Smith
- Mark Dickinson
- Nick Coghlan
- skip@pobox.com
- Stefan Krah
- Terry Reedy