Different bases format specification

On Sat, Dec 3, 2011 at 10:12 AM, T.B. <bauertomer@gmail.com> wrote:
I suggest using the precision field in the format specification for integers for that.
Supporting arbitrary bases for string formatting has been discussed and rejected in the past (both in the context of PEP 3101's introduction of new string formatting and on other occasions). Nobody has ever produced convincing use cases for natively supporting formatting with bases other than binary, octal, decimal and hexadecimal. Accordingly, those 4 are supported explicitly via the 'b', 'o', 'd' and 'x'/'X' formatting codes, while other formats still require an explicit conversion function.
As for "Why Not?"
1. 'd' stands for decimal. If support for arbitrary bases were added, it would need to be as a separate format code (e.g. 'i' for integer)
2. The explicit 'b', 'o' and 'x' codes are related to integer literal notation (0b10, 0o777, 0x1F), not to the second argument to int()
3. The use cases just aren't that strong. When you start dealing with base36 and base64, you're not talking about formatting numbers for human readers any more, you're talking about encoding numbers as short pieces of text. Better to let people decide exactly the behaviour they want by coding it themselves.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
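To make "an explicit conversion function" concrete, here is a minimal sketch of one. The name to_base and the lower-case digit alphabet are assumptions made for this illustration; nothing like it is defined in the thread or built into str.format.

```python
import string

# Assumed digit alphabet: '0'-'9' followed by 'a'-'z', so bases 2 to 36.
DIGITS = string.digits + string.ascii_lowercase


def to_base(value, base):
    """Return the digits of an int in the given base (hypothetical helper)."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    value = abs(value)
    out = []
    while value:
        # Peel off the least significant digit each time round the loop.
        value, rem = divmod(value, base)
        out.append(DIGITS[rem])
    return sign + "".join(reversed(out))


print(to_base(5, 3))       # ternary digits of 5
print(to_base(-255, 16))   # sign is kept, as with format(-255, 'x')
```

Because the alphabet matches what int() accepts, int(to_base(n, b), b) should round-trip for any base from 2 to 36.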

On 2011-12-03 03:31, Nick Coghlan wrote:
For weird math scenarios I know there are already many modules and packages. But what about ternary? en.wikipedia.org/wiki/Ternary_numeral_system mentions, among other points, that bases 9 and 27 are used [no citation].
That's one reason I wrote: "It might be a nice mnemonic using 'b' instead, standing for 'base'. Then the default base will be 2." Anyway, I think there should be 'B' and 'O' presentation types, that will be used for outputting '0B' and '0O' prefixes. Thanks for your reply, TB

On Sat, Dec 3, 2011 at 1:16 PM, T.B. <bauertomer@gmail.com> wrote:
Weird math scenarios are no justification for changing the behaviour of a builtin type. b/o/d/x/X cover all the common use cases, everything else can be handled by libraries (including providing custom string.Formatter subclasses). Code and functionality are not free - we need solid gains in easier (or otherwise improved) coding and maintenance for real world problems before we add more of either.
And the symmetry with the integer literal codes will still be lost.
Anyway, I think there should be 'B' and 'O' presentation types, that will be used for outputting '0B' and '0O' prefixes.
If you really want that (Why would you?) and so long as the numbers aren't negative:
    "0B{:b}".format(number)
    "0O{:o}".format(number)
The only reason 'X' is provided for hexadecimal formatting is to capitalize the letters that appear within the number itself.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
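The "so long as the numbers aren't negative" caveat can be lifted by formatting the sign separately. The helper below is only a sketch; its name and its restriction to the 'b', 'o' and 'X' codes are assumptions made for this example.

```python
def format_with_upper_prefix(number, code):
    """Format an int with an upper-case '0B', '0O' or '0X' prefix.

    Hypothetical helper: the sign goes in front of the prefix, as it
    does in a Python literal such as -0O755.
    """
    if code not in ("b", "o", "X"):
        raise ValueError("code must be 'b', 'o' or 'X'")
    sign = "-" if number < 0 else ""
    return "{}0{}{}".format(sign, code.upper(), format(abs(number), code))


print(format_with_upper_prefix(7, "b"))
print(format_with_upper_prefix(-493, "o"))

# For comparison, the built-in '#' flag adds a prefix as well: lower case
# for 'b', 'o' and 'x', upper case only for 'X'.
print(format(7, "#b"), format(255, "#X"))
```

The results can be read back with int(text, 0), which parses its argument like an integer literal.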

On 2011-12-03 06:31, Nick Coghlan wrote:
All is not lost. "{:b}".format(num) would still print numbers in binary notation. The output would change only when the optional field is used. One far-reaching solution for the symmetry break is allowing ALL integer literal codes to have that optional field, with each code having its own default base: b->2, o->8, d->10, x->16.
Anyway, I think there should be 'B' and 'O' presentation types, that will be used for outputting '0B' and '0O' prefixes.
Horrors such as an uppercase-only file/shell/whatever still exist. They are rare, and you'll usually call upper() before sending a message or writing a file to them, but it might be worth adding 'B'.
0B111 and -0O755 are *current* valid python tokens. There is no "elegant" way of outputting those tokens. I suggested 'B' and 'O' as a side-effect, because bases > 10 have some digits as letters. regards, TB

On Sat, Dec 3, 2011 at 3:06 PM, T.B. <bauertomer@gmail.com> wrote:
You're setting your bar for "hey, let's change the way a builtin type works" *way* too low. It's *OK* if obscure corner cases like bases outside the main four (2, 8, 10, 16), or "upper case only" environments require extra code. "Make easy things easy and hard things possible" is the goal, not "build in complex features to handle special cases that only arise on rare occasions and can already be dealt with using the vast array of general purpose programming tools Python provides". It's not that supporting arbitrary bases is a terrible idea - it's that it doesn't come up often enough as a general programming problem to be worth going to the effort of making the change. There's no such thing as a "trivial" change to a Python builtin - they *all* have significant repercussions, as the update ripples out through the Python ecosystem over the course of several years (see http://www.boredomandlaziness.org/2011/02/status-quo-wins-stalemate.html). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sat, 3 Dec 2011 14:31:51 +1000 Nick Coghlan <ncoghlan@gmail.com> wrote:
By the way, any reason why hex output represents negative numbers with a negative sign (instead of the more usual 2s-complement representation)? It's not too difficult to normalize by hand (e.g. add 2**32 if you know the number is a 32-bit one) but it always irks me that Python doesn't do it by default. I cannot think of a situation where the "sign" is relevant when printing a hex number: hex is about the raw binary representation of the number. Regards Antoine.

On Sat, Dec 3, 2011 at 8:45 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
This is because Python's integers are not limited to 32 bits or 64 bits. If you read PEP 237, you'll see that this was one of the hardest differences between ints and longs to be resolved. You'd have to include an infinite number of leading 'F' characters to format a negative long this way... -- --Guido van Rossum (python.org/~guido)
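The "infinite number of leading 'F' characters" point can be seen by picking ever wider bit widths for the same negative value. This is a small sketch; hex_at_width is a name invented for the illustration.

```python
def hex_at_width(value, bits):
    """Hex digits of value reduced modulo 2**bits (hypothetical helper)."""
    return format(value & (2**bits - 1), "X")


# Each wider width only adds more leading F digits for a negative value,
# so no single width is "the" two's complement form of an unbounded int.
for bits in (8, 16, 32, 64):
    print(bits, hex_at_width(-31, bits))

# With no width to work from, Python prints a sign instead.
print(format(-31, "X"))
```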

On Sun, Dec 4, 2011 at 3:07 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Random thought... could we use the integer precision field to fix *that*, by having it indicate the intended number of bytes in the integer? That is, currently:
What if instead that produced:
Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sat, Dec 3, 2011 at 5:13 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Usually that field is measured in characters/digits, so this should probably produce FFE1; you'd need {:.8X} to produce FFFFFFE1. This would then logically extend to binary and octal, in each case measuring characters/digits in the indicated base. -- --Guido van Rossum (python.org/~guido)

On Sat, Dec 3, 2011 at 5:32 PM, MRAB <python@mrabarnett.plus.com> wrote:
OTOH I'm not sure what should happen if the number (negative or positive!) doesn't fit in the precision. How common is this use case? Personally I'm fine with writing x & (2**N - 1) where N is e.g. 32 or 64. -- --Guido van Rossum (python.org/~guido)
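For reference, the sketch below shows how this plays out in Python 3 today: a precision on an integer code is rejected, while the x & (2**N - 1) idiom works but silently drops high bits when the number does not fit. The variable names are chosen for this example only.

```python
# A precision on an integer presentation type currently raises ValueError.
precision_error = None
try:
    "{:.4X}".format(-31)
except ValueError as exc:
    precision_error = str(exc)

N = 16                      # intended bit width
masked = -31 & (2**N - 1)   # the masking idiom from the message above
# One hex digit covers 4 bits, so pad to N // 4 digits.
text = "{:0{width}X}".format(masked, width=N // 4)

# A value that does not fit in N bits is truncated without any error.
truncated = 0x12345 & (2**N - 1)

print(precision_error)
print(text, hex(truncated))
```

Whether silent truncation is acceptable is exactly the open question raised here; masking alone cannot report it.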

On 04/12/2011 01:51, Guido van Rossum wrote:
Well, the width is treated as the minimum width, so perhaps the precision should be the minimum precision in this case.
How common is this use case? Personally I'm fine with writing x & (2**N - 1) where N is e.g. 32 or 64.

On Sun, Dec 4, 2011 at 11:17 AM, Guido van Rossum <guido@python.org> wrote:
True, I guess it's just a matter of dividing the bit width by 1 for binary, 3 for octal (rounding up) and 4 for hexadecimal. (My brain was locked into bytes mode for some reason, so converting to character counts seemed overly complicated - of course, if you go directly from bits to characters, it's no more complicated than converting to a bytes count). ".4d" would still raise an exception, though - I don't know of any obvious way to make two's complement notation meaningful in base 10. For numbers that didn't fit in the specified precision, I'd also suggest raising ValueError. This would be tinkering with the behaviour of builtin, so I guess it would need a PEP? (I already have too many of those in train... although I did just tidy that up a bit by officially deferring consideration of the ImportEngine PEP until 3.4 at the earliest) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
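The bits-to-digits arithmetic is a ceiling division by the number of bits each digit carries. A minimal sketch, with the table and function names assumed for this example:

```python
# Bits carried by one digit for each power-of-two presentation code.
BITS_PER_DIGIT = {"b": 1, "o": 3, "x": 4, "X": 4}


def digits_for_bits(bits, code):
    """Digits needed to display `bits` bits (hypothetical helper)."""
    per_digit = BITS_PER_DIGIT[code]
    return -(-bits // per_digit)  # ceiling division without floats


for code in ("b", "o", "x"):
    print(code, digits_for_bits(32, code))
```

Octal is the odd one out: 32 is not a multiple of 3, so the leading octal digit of a 32-bit value only covers 2 bits.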

On 04/12/2011 02:06, Nick Coghlan wrote:
It wouldn't be two's complement in that case, it would be ten's complement. (Ever used BCD arithmetic in assembly language? :-)) Actually, would it really be two's complement in octal or hexadecimal either? Wouldn't it be eight's complement and sixteen's complement respectively?

On Sun, Dec 4, 2011 at 12:14 PM, MRAB <python@mrabarnett.plus.com> wrote:
For hexadecimal, it doesn't make much difference, since the number of bits per digit is a power of 2:
    2**32 == (2**4)**8
For octal, you're right, since it would be
    2**33 == (2**3)**11
I quite like Terry's definition, which does extend cleanly to 'd' (as well as giving the expected answer for 'b', 'o', 'x' and 'X'):
For integers, the precision field would be used to specify the expected maximum number of digits in the answer, and to switch the representation of negative values to the appropriate 'complement' form (e.g. two's complement for binary numbers). When a precision ".prec" is specified for an integer formatting code (b, o, d, x or X), the value to be displayed would be calculated as follows:
    _BASES = dict(b=2, o=8, d=10, x=16, X=16)
    _BASE_NAMES = dict(b='binary', o='octal', d='decimal', x='hexadecimal', X='hexadecimal')
    _base = _BASES[format_code]
    _prec_bound = _base ** prec
    _max_value = _prec_bound // 2  # integer division keeps the bound exact
    if value < -_max_value or value >= _max_value:
        _code = _BASE_NAMES[format_code]
        raise ValueError("Integer {} too large for {} precision of {}".format(value, _code, prec))
    # Only negative values switch to the complement form
    _value = _prec_bound + value if value < 0 else value
However, I'm not sure that qualifies as *useful* behaviour - while the bounds checking aspect could be useful for decimal, the complement form of negative numbers is almost never going to be what anyone wants. If we decide to improve things in terms of Python-level handling of two's complement arithmetic, perhaps it would make more sense to just provide a method on int objects that calculates the two's complement of a number for a given bit length? Something like:
    def as_twos_complement(self, bits):
        # Also rejects -2**(bits-1), because bit_length() ignores the sign
        if self.bit_length() >= bits:
            raise ValueError("int {} too large for {}-bit signed precision".format(self, bits))
        if self >= 0:
            return self
        return 2**bits + self # self is known to be negative at this point
As for whether this is worth doing or not... I think so.
While Python integers may be limited in size solely by available memory, it's going to be a fact of computing life for quite some time that there are going to be fixed size signed and unsigned integers under the hood *somewhere*. We already provide a mechanism to find out how many bits a given integer needs, this would be about providing a standard, efficient, mechanism to convert negative integers to their two's complement positive equivalents for a given number of bits. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 12/3/2011 9:06 PM, Nick Coghlan wrote:
This would be tinkering with the behaviour of builtin,
It is a backwards compatible augmentation: (as I understand the proposal) .n for ints means print negatives with n digits in baseNs-complement notation instead of raising an exception. -- Terry Jan Reedy

04.12.11 04:06, Nick Coghlan написав(ла):
".4d" would still raise an exception, though - I don't know of any obvious way to make two's complement notation meaningful in base 10.
Obviously, '{0:.4d}'.format(-31) == '{0:4d}'.format((-31)%10**4) == '9969'.
This would be tinkering with the behaviour of builtin, so I guess it would need a PEP?
No, no! Please, do not obfuscate Python formatting.
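The ten's complement arithmetic quoted in this message is a modulo reduction, and the same reduction gives two's complement digits for a power-of-two base. A brief sketch, with radix_complement as an assumed name:

```python
def radix_complement(value, base, digits):
    """Reduce value modulo base**digits (hypothetical helper).

    Python's % always returns a non-negative result for a positive
    modulus, so negative inputs land on their complement form.
    """
    return value % base**digits


print(radix_complement(-31, 10, 4))               # ten's complement, 4 digits
print(format(radix_complement(-31, 16, 4), "X"))  # same idea in hexadecimal
```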

On Tue, Dec 6, 2011 at 3:15 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
I agree the formatting approach is way too obscure, but did you see my later suggestion of an "as_twos_complement(bit_length)" conversion method on int objects? (we could actually provide a more generic version on numbers.Integer as well) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Dec 6, 2011 at 9:10 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
Hmm, in the sense that the answer we're getting is the same answer you would get with a cast to an unsigned type at the C level? I think that's a little misleading - conceptually, the number is still signed, we're just representing it differently (i.e. explicitly using the two's complement form, rather than the normal sign bit). I'd be OK with dropping the explicit 'twos' qualifier, though - then the method name could just be "to_complement()". I guess we'd also want a "to_signed()" to reverse the process:
    def to_complement(self, bits):
        "Convert this integer to its unsigned two's complement equivalent for the given bit length"
        if self.bit_length() >= bits:
            raise ValueError("{} is too large for {}-bit two's complement precision".format(self, bits))
        if self >= 0:
            return self
        return 2**bits + self # self is known to be negative at this point
    def to_signed(self, bits):
        "Convert an integer in two's complement format to its signed equivalent for the given bit length"
        if self < 0:
            raise ValueError("{} is already signed".format(self))
        if self.bit_length() > bits:
            raise ValueError("{} is too large for {}-bit two's complement precision".format(self, bits))
        upper_bound = 2**bits
        if self < (upper_bound // 2):
            return self
        return self - upper_bound # values with the top bit set map back to negatives
Regards, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
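Since methods cannot be added to int from pure Python, the same pair can be tried out as plain functions. This is a sketch under stated assumptions rather than the proposed implementation: it uses an explicit range check instead of bit_length(), which means it also accepts the most negative value, -2**(bits-1).

```python
def to_unsigned(value, bits):
    """Two's complement form of a signed int for the given bit length."""
    if not -(2 ** (bits - 1)) <= value < 2 ** (bits - 1):
        raise ValueError("{} does not fit in {} signed bits".format(value, bits))
    return value % 2**bits  # negatives wrap around to 2**bits + value


def to_signed(value, bits):
    """Signed int represented by an unsigned two's complement value."""
    if not 0 <= value < 2**bits:
        raise ValueError("{} does not fit in {} unsigned bits".format(value, bits))
    # A set top bit marks a negative number.
    return value - 2**bits if value >= 2 ** (bits - 1) else value


print(hex(to_unsigned(-31, 32)))
print(to_signed(0xFFFFFFE1, 32))
```

The two functions are inverses over the whole signed range for a given width, which makes a round-trip check an easy sanity test.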

On Mon, Dec 5, 2011 at 4:14 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I don't think there's a better term available. As long as the return value of to_unsigned() is never negative I think it's a fine name.
Sure.
-- --Guido van Rossum (python.org/~guido)

On Tue, Dec 6, 2011 at 10:16 AM, Guido van Rossum <guido@python.org> wrote:
Recorded the RFE here (using "to_signed()/to_unsigned()"): http://bugs.python.org/issue13535 I was sold on the name when I read my own docstring: "Convert this integer to its unsigned two's complement equivalent for the given bit length" Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Mon, Dec 5, 2011 at 4:14 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I'd be OK with dropping the explicit 'twos' qualifier, though - then the method name could just be "to_complement()".
That name's ambiguous since there's also one's complement as well as ten's and nine's complements. Also, perhaps from_twos_complement might be a clearer name than to_signed. --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com

On Fri, Dec 2, 2011 at 4:12 PM, T.B. <bauertomer@gmail.com> wrote:
I don't think this belongs in format.
P.S. Bonus question: What "{:.-909d}".format(42) would print?
Any proposal which includes an inscrutable example doesn't bode well for the usability of the feature. :-) Sure, negative bases are mathematically meaningful but are they useful in Python? And why not complex bases then? Or did you have something else strange in mind?
If there's enough need for encoding in different bases, including a standard version of format_integer_in_base makes a lot more sense. We could write format_integer_in_base(15, 16) to get "F" and format_integer_in_base(64, "A23456789TJQK") to get "4K". But note that standard base 64 encoding is not at all the same -- this function encodes starting at LSB while base 64 encodes at word boundaries.
Finally, note that if you really want to mangle format strings you can do it without changing the library. Just write it this way
    "{:.16d}".format(arbitrarybase(31))
where you have defined
    class arbitrarybase:
        def __format__(self, format_spec):
            return format_integer_in_base(parse format spec etc.)
--- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com
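A runnable take on the two ideas in this message might look like the following. Everything here is an assumption made for illustration: the thread never defines format_integer_in_base, so the signature (an int base or a string of digit characters), the zero-indexed alphabet, and the way the wrapper reads ".16d" as "base 16" are guesses. With a zero-indexed alphabet the card-rank example yields "5K" for 64, not the "4K" quoted in the message.

```python
import string


def format_integer_in_base(value, base):
    """Digits of a non-negative int (hypothetical function).

    `base` is either an int from 2 to 36 or a string whose characters
    are the digits, lowest value first.
    """
    if isinstance(base, str):
        digits = base
    else:
        if not 2 <= base <= 36:
            raise ValueError("integer base must be between 2 and 36")
        digits = (string.digits + string.ascii_uppercase)[:base]
    if len(digits) < 2:
        raise ValueError("need at least two digit characters")
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    if value == 0:
        return digits[0]
    out = []
    while value:
        value, rem = divmod(value, len(digits))
        out.append(digits[rem])
    return "".join(reversed(out))


class arbitrarybase:
    """Wrapper whose __format__ reads a spec such as '.16d' as 'base 16'."""

    def __init__(self, value):
        self.value = value

    def __format__(self, format_spec):
        if not (format_spec.startswith(".") and format_spec.endswith("d")):
            raise ValueError("expected a spec of the form '.<base>d'")
        return format_integer_in_base(self.value, int(format_spec[1:-1]))


print(format_integer_in_base(15, 16))
print(format_integer_in_base(64, "A23456789TJQK"))
print("{:.16d}".format(arbitrarybase(31)))
```

This works without touching the library because str.format hands the text after the colon to the object's own __format__ method.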

Tip for self: No more HTML e-mails. On 2011-12-03 04:19, Bruce Leban wrote:
My intention will be clear after reading http://bugs.python.org/issue2844. It will also ruin the surprise of figuring it out alone. Regards, TB

participants (8)
- Antoine Pitrou
- Bruce Leban
- Guido van Rossum
- MRAB
- Nick Coghlan
- Serhiy Storchaka
- T.B.
- Terry Reedy