data:image/s3,"s3://crabby-images/49442/49442df26fc2edc4837e8e92c8b690fcd540130e" alt=""
While on the subject of RDBMS systems, a common need is to be able to work with fixed-decimal data. I think a standard Python fixed-decimal type would help to make Python database interfaces alot more robust. I even wonder if the Python long type might be hijacked for this purpose by adding a "scale" that indicates the number of digits to the right of the decimal point. For example, an expression like: 1000000000.2500L would create a fixed decimal number with a scale of 4. People have built Python classes for fixed-decimal types, but when working with RDBMS data, one often deals with lots of data and efficiency matters. I also suspect that adding scale to longs wouldn't be that hard and would be a fairly natural extension. In any case, a "standard" (being in the standard library would be sufficient) fixed-decimal type would probably lead to better database interfaces that (at least more) properly handled fixed-decimal data. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
data:image/s3,"s3://crabby-images/2d79d/2d79d8662a2954d7c233449da5e16c43b6b627c1" alt=""
What would be scale of the product of two fixed-decimal numbers? E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are arguments for either. Same question for division (harder, I think). I like the idea of using the dd.ddL notation for this. I have no time to implement it but would not be unwilling to accept patches. They would have to be accompanied with a wet signature, see http://www.python.org/1.5/wetsign.html. --Guido van Rossum (home page: http://www.python.org/~guido/)
data:image/s3,"s3://crabby-images/49442/49442df26fc2edc4837e8e92c8b690fcd540130e" alt=""
Guido van Rossum wrote:
I'd be inclined to start by doing some research to see if some standard (SQL?) defines this somewhere. It would be nice if someone has already done the requirements work for us. :)
I like the idea of using the dd.ddL notation for this.
I have no time to implement
Me neither.
it but would not be unwilling to accept patches.
Cool. If no one else volunteers, then I'll try to find a way to get this done (not necessarily by me). I think it is pretty important.
They would have to be accompanied with a wet signature, see http://www.python.org/1.5/wetsign.html.
Yup. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
data:image/s3,"s3://crabby-images/49442/49442df26fc2edc4837e8e92c8b690fcd540130e" alt=""
Jim Fulton wrote:
Here is what the book "SQL-99 Complete, Really" says that the SQL standard says: - for addition and subtraction of two "exact" (fixed-decimal) numbers, the result has the maximum of the scales. - for multiplication of two "exact" (fixed-decimal) numbers, the result has the sum of the scales. - punts on division - for addition, subtraction, multiplication or division between "exact" (fixed point) and "approximate" (floating point) yields an approximate result. This means that fixed-decimal coerces to float. I'm curious to see who else chips in with examples from other systems. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
data:image/s3,"s3://crabby-images/2d79d/2d79d8662a2954d7c233449da5e16c43b6b627c1" alt=""
Jim Fulton wrote:
- for addition and subtraction of two "exact" (fixed-decimal) numbers, the result has the maximum of the scales.
One could argue that this is incorrect: if "3.1" means that I know the value to one decimal of precision, and "2.01" means that I know that value to two decimals of precision, stating the result of their sum as "5.11" suggests that I know the result to two decimals of precision, which is of course false: because I only knew one decimal of precision for one of the operands, I only know (at most!) one decimal of precision for the result. Not arguing for this interpretation, just indicating that doing fixed precision arithmetic right is hard. I'm waiting for Tim Peters' contribution, but he's on vacation so it may be a while. --Guido van Rossum (home page: http://www.python.org/~guido/)
data:image/s3,"s3://crabby-images/4c299/4c299dfcd8671c0ce1f071dce620a40b4a7be3e3" alt=""
[Guido]
It's not so much hard as it is arbitrary. The floating-point world is standardized now, but the fixed-point world remains a mish-mash of incompatible legacy schemes carried across generations of products for no reason other than product-specific compatibility. So despite that fixed-point has a specialty audience, whatever rules Python chooses will leave it incompatible with much of that audience's (mixed!) expectations. If fixed-point is needed, and my FixedPoint.py isn't good enough (all other fixed point pkgs I've seen for Python were braindead), then it should be implemented such that developers can control both rounding and precision propagation. I'll attach suitable kernels; they haven't been tested but any bugs discovered will be trivial to fix (there are no difficulties here, but typos are likely); the kernels supply the bulk of what's required, whether implemented in Python or C; various packages can wrap them to supply whatever policies they like; see FixedPoint.py for exact string<->FixedPoint and exact float->FixedPoint conversions; and that's the end of my involvement in fixed-point <wink>. Python should certainly *not* add a "scale factor" to its current long implementation; fixed-point should be a distinct type, as scale-factor fiddling is clumsy and pervasive (long arithmetic is challenging enough to get correct and quick without this obfuscating distraction; and by leaving scale factors out of it, it's much easier to plug in alternative bigint implementations (like GMP)). One other point: some people are going to want BCD (binary-coded decimal), which suffers the same mish-mash of legacy policies, but with a different data representation. The point is that many commercial applications spend much more time doing I/O conversions than arithmetic, and BCD accepts slow arithmetic (in the absence of special HW support) in return for fast scaling & I/O conversion. Forgetting the database-heads for a moment, decimal *floating*-point is what calculators do, so that's what "real people" are most comfortable with. The IEEE-854 std (IEEE-754's younger and friendlier brother) specifies that completely. Add a means to boost "global" precision (a la REXX), and it's a powerful tool even for experts (benefits approximating those of unbounded rational arithmetic but with bounded & user-controllable expense). can-never-have-too-many-numeric-types-but-always-have- too-few-literal-notations-ly y'rs - tim # Kernels for fixed-point decimal arithmetic. # _add, _sub, _mul, _div all have arglist # n1, p1, n2, p2, p, round=DEFAULT_ROUND # n1 and n2 are longs; p1, p2 and p ints >= 0. # The inputs are exactly n1/10**p1 and n2/10**p2. # # The return value is the integer n such that n/10**p is the best # approximation to the infinite-precision result. In other words, p1 # and p2 are the input precisions and p is the desired output # precision, where precision is the # of digits *after* the decimal # point. # # What "best approximation" means is determined by the round function. # In many cases rounding isn't required, but when it is # round(top, bot) # is returned. top and bot are longs, with bot > 0 guaranteed. The # infinite-precision result is top/bot. round must return an integer # (long) approximation to top/bot, using whichever rounding discipline # you want. By default, IEEE round-to-nearest/even is used; see the # _roundXXX functions for examples of suitable rounding functions. # # Note: The only code here that knows we're working in decimal is # function _tento; simply change the "10L" in that to do fixed-point # arithmetic in some other base. # # Example: # # >>> r7 = _div(1L, 0, 7L, 0, 20) # 1/7 # >>> r7 # 14285714285714285714L # >>> r5 = _div(1L, 0, 5L, 0, 20) # 1/5 # >>> r5 # 20000000000000000000L # >>> sum = _add(r7, 20, r5, 20, 20) # 1/7 + 1/5 = 12/35 # >>> sum # 34285714285714285714L # >>> _mul(sum, 20, 35L, 0, 20) # 1199999999999999999990L # >>> _mul(sum, 20, 35L, 0, 18) # 12000000000000000000L # >>> _mul(sum, 20, 35L, 0, 0) # 12L # >>> ################################################################### # Sample rounding functions. ################################################################### # Round to minus infinity. def _roundminf(top, bot): assert bot > 0 return top / bot # Round to plus infinity. def _roundpinf(top, bot): assert bot > 0 q, r = divmod(top, bot) # answer is exactly q + r/bot; and 0 <= r < bot if r: q = q + 1 return q # IEEE nearest/even rounding (closest integer; in case of tie closest # even integer). def _roundne(top, bot): assert bot > 0 q, r = divmod(top, bot) # answer is exactly q + r/bot; and 0 <= r < bot c = cmp(r << 1, bot) # c < 0 <-> r < bot/2, etc if c > 0 or (c == 0 and (q & 1) == 1): q = q + 1 return q # "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder # >= half away from 0). def _roundhalf(top, bot): assert bot > 0 q, r = divmod(top, bot) # answer is exactly q + r/bot; and 0 <= r < bot c = cmp(r << 1, bot) # c < 0 <-> r < bot/2, etc if c > 0 or (c == 0 and q >= 0): q = q + 1 return q # Round toward 0 (throw away remainder). def _roundchop(top, bot): assert bot > 0 q, r = divmod(top, bot) # answer is exactly q + r/bot; and 0 <= r < bot if r and q < 0: q = q + 1 return q ################################################################### # Kernels for + - * /. ################################################################### DEFAULT_ROUND = _roundne def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND): assert p1 >= 0 assert p2 >= 0 assert p >= 0 # (n1/10**p1 + n2/10**p2) * 10**p == # (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p max = p1 # until proven otherwise if p1 < p2: n1 = n1 * _tento(p2 - p1) max = p2 elif p2 < p1: n2 = n2 * _tento(p1 - p2) n3 = n1 + n2 p3 = p - max if p3 > 0: n3 = n3 * _tento(p3) elif p3 < 0: n3 = round(n3, _tento(-p3)) return n3 def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND): assert p1 >= 0 assert p2 >= 0 assert p >= 0 return _add(n1, p1, -n2, p2, p, round) def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND): assert p1 >= 0 assert p2 >= 0 assert p >= 0 # (n1/10**p1 * n2/10**p2) * 10**p == # (n1*n2)/10**(p1+p2) * 10**p n3 = n1 * n2 p3 = p - p1 - p2 if p3 > 0: n3 = n3 * _tento(p3) elif p3 < 0: n3 = round(n3, _tento(-p3)) return n3 def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND): assert p1 >= 0 assert p2 >= 0 assert p >= 0 if n2 == 0: raise ZeroDivisionError("scaled integer") # (n1/10**p1 / n2/10**p2) * 10**p == # (n1/n2) * 10**(p2-p1+p) p3 = p2 - p1 + p if p3 > 0: n1 = n1 * _tento(p3) elif p3 < 0: n2 = n2 * _tento(-p3) if n2 < 0: n1 = -n1 n2 = -n2 return round(n1, n2) def _tento(i, _cache={}): assert i >= 0 try: return _cache[i] except KeyError: answer = _cache[i] = 10L ** i return answer
data:image/s3,"s3://crabby-images/2d79d/2d79d8662a2954d7c233449da5e16c43b6b627c1" alt=""
What would be scale of the product of two fixed-decimal numbers? E.g. is 2.00L * 2.00L equal to 4.0000L, or equal to 4.00L? There are arguments for either. Same question for division (harder, I think). I like the idea of using the dd.ddL notation for this. I have no time to implement it but would not be unwilling to accept patches. They would have to be accompanied with a wet signature, see http://www.python.org/1.5/wetsign.html. --Guido van Rossum (home page: http://www.python.org/~guido/)
data:image/s3,"s3://crabby-images/49442/49442df26fc2edc4837e8e92c8b690fcd540130e" alt=""
Guido van Rossum wrote:
I'd be inclined to start by doing some research to see if some standard (SQL?) defines this somewhere. It would be nice if someone has already done the requirements work for us. :)
I like the idea of using the dd.ddL notation for this.
I have no time to implement
Me neither.
it but would not be unwilling to accept patches.
Cool. If no one else volunteers, then I'll try to find a way to get this done (not necessarily by me). I think it is pretty important.
They would have to be accompanied with a wet signature, see http://www.python.org/1.5/wetsign.html.
Yup. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
data:image/s3,"s3://crabby-images/49442/49442df26fc2edc4837e8e92c8b690fcd540130e" alt=""
Jim Fulton wrote:
Here is what the book "SQL-99 Complete, Really" says that the SQL standard says: - for addition and subtraction of two "exact" (fixed-decimal) numbers, the result has the maximum of the scales. - for multiplication of two "exact" (fixed-decimal) numbers, the result has the sum of the scales. - punts on division - for addition, subtraction, multiplication or division between "exact" (fixed point) and "approximate" (floating point) yields an approximate result. This means that fixed-decimal coerces to float. I'm curious to see who else chips in with examples from other systems. Jim -- Jim Fulton mailto:jim@digicool.com Python Powered! Technical Director (888) 344-4332 http://www.python.org Digital Creations http://www.digicool.com http://www.zope.org Under US Code Title 47, Sec.227(b)(1)(C), Sec.227(a)(2)(B) This email address may not be added to any commercial mail list with out my permission. Violation of my privacy with advertising or SPAM will result in a suit for a MINIMUM of $500 damages/incident, $1500 for repeats.
data:image/s3,"s3://crabby-images/2d79d/2d79d8662a2954d7c233449da5e16c43b6b627c1" alt=""
Jim Fulton wrote:
- for addition and subtraction of two "exact" (fixed-decimal) numbers, the result has the maximum of the scales.
One could argue that this is incorrect: if "3.1" means that I know the value to one decimal of precision, and "2.01" means that I know that value to two decimals of precision, stating the result of their sum as "5.11" suggests that I know the result to two decimals of precision, which is of course false: because I only knew one decimal of precision for one of the operands, I only know (at most!) one decimal of precision for the result. Not arguing for this interpretation, just indicating that doing fixed precision arithmetic right is hard. I'm waiting for Tim Peters' contribution, but he's on vacation so it may be a while. --Guido van Rossum (home page: http://www.python.org/~guido/)
data:image/s3,"s3://crabby-images/4c299/4c299dfcd8671c0ce1f071dce620a40b4a7be3e3" alt=""
[Guido]
It's not so much hard as it is arbitrary. The floating-point world is standardized now, but the fixed-point world remains a mish-mash of incompatible legacy schemes carried across generations of products for no reason other than product-specific compatibility. So despite that fixed-point has a specialty audience, whatever rules Python chooses will leave it incompatible with much of that audience's (mixed!) expectations. If fixed-point is needed, and my FixedPoint.py isn't good enough (all other fixed point pkgs I've seen for Python were braindead), then it should be implemented such that developers can control both rounding and precision propagation. I'll attach suitable kernels; they haven't been tested but any bugs discovered will be trivial to fix (there are no difficulties here, but typos are likely); the kernels supply the bulk of what's required, whether implemented in Python or C; various packages can wrap them to supply whatever policies they like; see FixedPoint.py for exact string<->FixedPoint and exact float->FixedPoint conversions; and that's the end of my involvement in fixed-point <wink>. Python should certainly *not* add a "scale factor" to its current long implementation; fixed-point should be a distinct type, as scale-factor fiddling is clumsy and pervasive (long arithmetic is challenging enough to get correct and quick without this obfuscating distraction; and by leaving scale factors out of it, it's much easier to plug in alternative bigint implementations (like GMP)). One other point: some people are going to want BCD (binary-coded decimal), which suffers the same mish-mash of legacy policies, but with a different data representation. The point is that many commercial applications spend much more time doing I/O conversions than arithmetic, and BCD accepts slow arithmetic (in the absence of special HW support) in return for fast scaling & I/O conversion. Forgetting the database-heads for a moment, decimal *floating*-point is what calculators do, so that's what "real people" are most comfortable with. The IEEE-854 std (IEEE-754's younger and friendlier brother) specifies that completely. Add a means to boost "global" precision (a la REXX), and it's a powerful tool even for experts (benefits approximating those of unbounded rational arithmetic but with bounded & user-controllable expense). can-never-have-too-many-numeric-types-but-always-have- too-few-literal-notations-ly y'rs - tim # Kernels for fixed-point decimal arithmetic. # _add, _sub, _mul, _div all have arglist # n1, p1, n2, p2, p, round=DEFAULT_ROUND # n1 and n2 are longs; p1, p2 and p ints >= 0. # The inputs are exactly n1/10**p1 and n2/10**p2. # # The return value is the integer n such that n/10**p is the best # approximation to the infinite-precision result. In other words, p1 # and p2 are the input precisions and p is the desired output # precision, where precision is the # of digits *after* the decimal # point. # # What "best approximation" means is determined by the round function. # In many cases rounding isn't required, but when it is # round(top, bot) # is returned. top and bot are longs, with bot > 0 guaranteed. The # infinite-precision result is top/bot. round must return an integer # (long) approximation to top/bot, using whichever rounding discipline # you want. By default, IEEE round-to-nearest/even is used; see the # _roundXXX functions for examples of suitable rounding functions. # # Note: The only code here that knows we're working in decimal is # function _tento; simply change the "10L" in that to do fixed-point # arithmetic in some other base. # # Example: # # >>> r7 = _div(1L, 0, 7L, 0, 20) # 1/7 # >>> r7 # 14285714285714285714L # >>> r5 = _div(1L, 0, 5L, 0, 20) # 1/5 # >>> r5 # 20000000000000000000L # >>> sum = _add(r7, 20, r5, 20, 20) # 1/7 + 1/5 = 12/35 # >>> sum # 34285714285714285714L # >>> _mul(sum, 20, 35L, 0, 20) # 1199999999999999999990L # >>> _mul(sum, 20, 35L, 0, 18) # 12000000000000000000L # >>> _mul(sum, 20, 35L, 0, 0) # 12L # >>> ################################################################### # Sample rounding functions. ################################################################### # Round to minus infinity. def _roundminf(top, bot): assert bot > 0 return top / bot # Round to plus infinity. def _roundpinf(top, bot): assert bot > 0 q, r = divmod(top, bot) # answer is exactly q + r/bot; and 0 <= r < bot if r: q = q + 1 return q # IEEE nearest/even rounding (closest integer; in case of tie closest # even integer). def _roundne(top, bot): assert bot > 0 q, r = divmod(top, bot) # answer is exactly q + r/bot; and 0 <= r < bot c = cmp(r << 1, bot) # c < 0 <-> r < bot/2, etc if c > 0 or (c == 0 and (q & 1) == 1): q = q + 1 return q # "Add a half and chop" rounding (remainder < 1/2 toward 0; remainder # >= half away from 0). def _roundhalf(top, bot): assert bot > 0 q, r = divmod(top, bot) # answer is exactly q + r/bot; and 0 <= r < bot c = cmp(r << 1, bot) # c < 0 <-> r < bot/2, etc if c > 0 or (c == 0 and q >= 0): q = q + 1 return q # Round toward 0 (throw away remainder). def _roundchop(top, bot): assert bot > 0 q, r = divmod(top, bot) # answer is exactly q + r/bot; and 0 <= r < bot if r and q < 0: q = q + 1 return q ################################################################### # Kernels for + - * /. ################################################################### DEFAULT_ROUND = _roundne def _add(n1, p1, n2, p2, p, round=DEFAULT_ROUND): assert p1 >= 0 assert p2 >= 0 assert p >= 0 # (n1/10**p1 + n2/10**p2) * 10**p == # (n1*10**(max-p1) + n2*10**(max-p2))/10**max * 10**p max = p1 # until proven otherwise if p1 < p2: n1 = n1 * _tento(p2 - p1) max = p2 elif p2 < p1: n2 = n2 * _tento(p1 - p2) n3 = n1 + n2 p3 = p - max if p3 > 0: n3 = n3 * _tento(p3) elif p3 < 0: n3 = round(n3, _tento(-p3)) return n3 def _sub(n1, p1, n2, p2, p, round=DEFAULT_ROUND): assert p1 >= 0 assert p2 >= 0 assert p >= 0 return _add(n1, p1, -n2, p2, p, round) def _mul(n1, p1, n2, p2, p, round=DEFAULT_ROUND): assert p1 >= 0 assert p2 >= 0 assert p >= 0 # (n1/10**p1 * n2/10**p2) * 10**p == # (n1*n2)/10**(p1+p2) * 10**p n3 = n1 * n2 p3 = p - p1 - p2 if p3 > 0: n3 = n3 * _tento(p3) elif p3 < 0: n3 = round(n3, _tento(-p3)) return n3 def _div(n1, p1, n2, p2, p, round=DEFAULT_ROUND): assert p1 >= 0 assert p2 >= 0 assert p >= 0 if n2 == 0: raise ZeroDivisionError("scaled integer") # (n1/10**p1 / n2/10**p2) * 10**p == # (n1/n2) * 10**(p2-p1+p) p3 = p2 - p1 + p if p3 > 0: n1 = n1 * _tento(p3) elif p3 < 0: n2 = n2 * _tento(-p3) if n2 < 0: n1 = -n1 n2 = -n2 return round(n1, n2) def _tento(i, _cache={}): assert i >= 0 try: return _cache[i] except KeyError: answer = _cache[i] = 10L ** i return answer
participants (3)
-
Guido van Rossum
-
Jim Fulton
-
Tim Peters