[Python-ideas] Python-ideas Digest, Vol 103, Issue 15

Tue Jun 2 07:22:07 CEST 2015

On Jun 1, 2015, at 20:41, u8y7541 The Awesome Person <surya.subbarao1 at gmail.com> wrote:
> 
> I think you're right. I was also considering ... "editing" my Python distribution. If they didn't implement my suggestion for correcting floats, at least they can fix this, instead of making people hack Python for good results!

If you're going to reply to digests, please learn how to reply inline instead of top-posting (and how to trim out all the irrelevant stuff). It's next to impossible to tell which part of which of the messages you're replying to even in simple cases like this one, with only 4 messages in the digest.

>> On Mon, Jun 1, 2015 at 8:10 PM, <python-ideas-request at python.org> wrote:
>> Today's Topics:
>> 
>>    1. Re: Python Float Update (Steven D'Aprano)
>>    2. Re: Python Float Update (Andrew Barnert)
>>    3. Re: Python Float Update (Steven D'Aprano)
>>    4. Re: Python Float Update (Andrew Barnert)
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Tue, 2 Jun 2015 11:37:48 +1000
>> From: Steven D'Aprano <steve at pearwood.info>
>> To: python-ideas at python.org
>> Subject: Re: [Python-ideas] Python Float Update
>> Message-ID: <20150602013748.GD932 at ando.pearwood.info>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> On Mon, Jun 01, 2015 at 05:52:35PM +0300, Joonas Liik wrote:
>> 
>> > Having some sort of decimal literal would have some advantages of its own,
>> > for one it could help against this silliness:
>> >
>> > >>> Decimal(1.3)
>> > Decimal('1.3000000000000000444089209850062616169452667236328125')
>> 
>> Why is that silly? That's the actual value of the binary float 1.3
>> converted into base 10. If you want 1.3 exactly, you can do this:
>> 
>> > >>> Decimal('1.3')
>> > Decimal('1.3')
>> 
>> Is that really so hard for people to learn?
>> 
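For anyone who wants to check this at the prompt, here is a minimal sketch using only the standard library. The variable names are mine, chosen for illustration; the point is just that Decimal(1.3) is the exact value of the binary float, while the string form gives the decimal you typed.

```python
from decimal import Decimal
from fractions import Fraction

x = 1.3

# The float 1.3 is really this exact ratio of two integers.
exact_ratio = Fraction(*x.as_integer_ratio())

from_float = Decimal(x)        # exact value of the binary float
from_text = Decimal('1.3')     # exactly one and three tenths
from_repr = Decimal(repr(x))   # goes through the shortest repr, '1.3'

print(Fraction(from_float) == exact_ratio)  # True: nothing was lost
print(from_float == from_text)              # False: different values
print(from_repr == from_text)               # True
```
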
>> 
>> > I'm not saying that the actual data type needs to be a decimal (
>> > might well be a float but say shove the string repr next to it so it can be
>> > accessed when needed)
>> 
>> You want Decimals to *lie* about what value they have?
>> 
>> I think that's a terrible idea, one which would lead to a whole set of
>> new and exciting surprises when using Decimal. Let me try to predict a
>> few of the questions on Stackoverflow which would follow this change...
>> 
>>   Why is equality so inaccurate in Python?
>> 
>>   py> x = Decimal(1.3)
>>   py> y = Decimal('1.3')
>>   py> x, y
>>   (Decimal('1.3'), Decimal('1.3'))
>>   py> x == y
>>   False
>> 
>>   Why does Python insert extra digits into numbers when I multiply?
>> 
>>   py> x = Decimal(1.3)
>>   py> x
>>   Decimal('1.3')
>>   py> y = 10000000000000000*x
>>   py> y - 13000000000000000
>>   Decimal('0.444089209850062616169452667236328125')
>> 
>> 
>> > ..but this is one really common pitfall for new users, I know it's easy to
>> > fix the code above,
>> > but this behavior is very unintuitive.. you essentially get a really
>> > expensive float when you do the obvious thing.
>> 
>> Then don't do the obvious thing.
>> 
>> Sometimes there really is no good alternative to actually knowing what
>> you are doing. Floating point maths is inherently hard, but that's not
>> the problem. There are all sorts of things in programming which are
>> hard, and people learn how to deal with them. The problem is that people
>> *imagine* that floating point is simple, when it is not and can never
>> be. We don't do them any favours by enabling that delusion.
>> 
>> If your needs are light, then you can ignore the complexities of
>> floating point. You really can go a very long way by just rounding the
>> results of your calculations when displaying them. But for anything more
>> than that, we cannot just paper over the floating point complexities
>> without creating new complexities that will burn people.
>> 
>> You don't have to become a floating point guru, but it really isn't
>> onerous to expect people who are programming to learn a few basic
>> programming skills, and that includes a few basic coping strategies for
>> floating point.
>> 
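As a rough illustration of the "light needs" case, the sketch below shows two of the usual coping strategies: round when displaying, and compare with a tolerance instead of ==. The variable names are only for illustration.

```python
import math

total = 0.1 + 0.2

# Strategy 1: round only when displaying the result.
shown = f"{total:.2f}"
print(shown)                       # 0.30

# Strategy 2: compare with a tolerance rather than with ==.
print(total == 0.3)                # False
print(math.isclose(total, 0.3))    # True
```
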
>> 
>> 
>> --
>> Steve
>> 
>> 
>> ------------------------------
>> 
>> Message: 2
>> Date: Mon, 1 Jun 2015 19:21:47 -0700
>> From: Andrew Barnert <abarnert at yahoo.com>
>> To: Andrew Barnert <abarnert at yahoo.com>
>> Cc: Nick Coghlan <ncoghlan at gmail.com>, python-ideas
>>         <python-ideas at python.org>
>> Subject: Re: [Python-ideas] Python Float Update
>> Message-ID: <5E8271BF-183E-496D-A556-81C407977FFE at yahoo.com>
>> Content-Type: text/plain;       charset=us-ascii
>> 
>> On Jun 1, 2015, at 19:00, Andrew Barnert <abarnert at yahoo.com> wrote:
>> >
>> >> On Jun 1, 2015, at 18:27, Andrew Barnert via Python-ideas <python-ideas at python.org> wrote:
>> >>
>> >>> On Jun 1, 2015, at 17:08, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> >>>
>> >>> On 2 Jun 2015 08:44, "Andrew Barnert via Python-ideas"
>> >>> <python-ideas at python.org> wrote:
>> >>>> But the basic idea can be extracted out and Pythonified:
>> >>>>
>> >>>> The literal 1.23 no longer gives you a float, but a FloatLiteral, which is either a subclass of float, or an unrelated class that has a __float__ method. Doing any calculation on it gives you a float. But as long as you leave it alone as a FloatLiteral, it has its literal characters available for any function that wants to distinguish FloatLiteral from float, like the Decimal constructor.
>> >>>>
>> >>>> The problem that Python faces that Swift doesn't is that Python doesn't use static typing and implicit compile-time conversions. So in Python, you'd be passing around these larger values and doing the slow conversions at runtime. That may or may not be unacceptable; without actually building it and testing some realistic programs it's pretty hard to guess.
>> >>>
>> >>> Joonas's suggestion of storing the original text representation passed
>> >>> to the float constructor is at least a novel one - it's only the idea
>> >>> of actual decimal literals that was ruled out in the past.
>> >>
>> >> I actually built about half an implementation of something like Swift's LiteralConvertible protocol back when I was teaching myself Swift. But I think I have a simpler version that I could implement much more easily.
>> >>
>> >> Basically, FloatLiteral is just a subclass of float whose __new__ stores its constructor argument. Then decimal.Decimal checks for that stored string and uses it instead of the float value if present. Then there's an import hook that replaces every Num with a call to FloatLiteral.
>> >>
>> >> This design doesn't actually fix everything; in effect, 1.3 actually compiles to FloatLiteral(str(float('1.3'))) (because by the time you get to the AST it's too late to avoid that first conversion). Which does actually solve the problem with 1.3, but doesn't solve everything in general (e.g., just feed in a number that has more precision than a double can hold but less than your current decimal context can...).
>> >>
>> >> But it just lets you test whether the implementation makes sense and what the performance effects are, and it's only an hour of work,
>> >
>> > Make that 15 minutes.
>> >
>> > https://github.com/abarnert/floatliteralhack
>> 
>> And as it turns out, hacking the tokens is no harder than hacking the AST (in fact, it's a little easier; I'd just never done it before), so now it does that, meaning you really get the actual literal string from the source, not the repr of the float of that string literal.
>> 
>> Turning this into a real implementation would obviously be more than half an hour's work, but not more than a day or two. Again, I don't think anyone would actually want this, but now people who think they do have an implementation to play with to prove me wrong.
>> 
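To make the idea concrete, here is a minimal sketch of the design described above. It is not the code in the floatliteralhack repository, just an illustration under stated assumptions: to_decimal, is_float_literal and wrap_float_literals are hypothetical helper names, and to_decimal stands in for a Decimal constructor that has been taught about FloatLiteral. The token rewriting follows the same pattern as the substitution example in the tokenize documentation.

```python
import io
import tokenize
from decimal import Decimal


class FloatLiteral(float):
    """Hypothetical float subclass that remembers its source text."""

    def __new__(cls, text):
        self = super().__new__(cls, text)
        self.literal = str(text)  # keep the characters as written
        return self


def to_decimal(value):
    """Stand-in for a Decimal constructor that knows about FloatLiteral."""
    if isinstance(value, FloatLiteral):
        return Decimal(value.literal)  # use the stored text
    return Decimal(value)              # ordinary exact conversion


def is_float_literal(text):
    """Crude check: has a decimal point or exponent, not hex or complex."""
    lowered = text.lower()
    if lowered.startswith(("0x", "0b", "0o")) or lowered.endswith("j"):
        return False
    return "." in lowered or "e" in lowered


def wrap_float_literals(source):
    """Rewrite every float literal token as FloatLiteral('<text>')."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NUMBER and is_float_literal(tok.string):
            out.extend([
                (tokenize.NAME, "FloatLiteral"),
                (tokenize.OP, "("),
                (tokenize.STRING, repr(tok.string)),
                (tokenize.OP, ")"),
            ])
        else:
            out.append((tok.type, tok.string))
    return tokenize.untokenize(out)


namespace = {"FloatLiteral": FloatLiteral}
exec(wrap_float_literals("x = 1.20\ny = x * 2\n"), namespace)
print(to_decimal(namespace["x"]))      # built from the literal text "1.20"
print(type(namespace["y"]) is float)   # any arithmetic gives a plain float
```

Note that to_decimal(13/10) still goes through the ordinary float conversion, so this sketch reproduces the Decimal(1.3) versus Decimal(13/10) discrepancy discussed in this thread rather than hiding it.
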
>> >> and doesn't require anyone to patch their interpreter to play with it. If it seems promising, then hacking the compiler so 2.3 compiles to FloatLiteral('2.3') may be worth doing for a test of the actual functionality.
>> >>
>> >> I'll be glad to hack it up when I get a chance tonight. But personally, I think decimal literals are a better way to go here. Decimal(1.20) magically doing what you want still has all the same downsides as 1.20d (or implicit decimal literals), plus it's more complex, adds performance costs, and doesn't provide nearly as much benefit. (Yes, Decimal(1.20) is a little nicer than Decimal('1.20'), but only a little--and nowhere near as nice as 1.20d).
>> >>
>> >>> Aside from the practical implementation question, the main concern I
>> >>> have with it is that we'd be trading the status quo for a situation
>> >>> where "Decimal(1.3)" and "Decimal(13/10)" gave different answers.
>> >>
>> >> Yes, to solve that you really need Decimal(13)/Decimal(10)... Which implies that maybe the simplification in Decimal(1.3) is more misleading than helpful. (Notice that this problem also doesn't arise for decimal literals--13/10d is int vs. Decimal division, which is correct out of the box. Or, if you want prefixes, d13/10 is Decimal vs. int division.)
>> >>
>> >>> It seems to me that a potentially better option might be to adjust the
>> >>> implicit float->Decimal conversion in the Decimal constructor to use
>> >>> the same algorithm as we now use for float.__repr__ [1], where we look
>> >>> for the shortest decimal representation that gives the same answer
>> >>> when rendered as a float. At the moment you have to indirect through
>> >>> str() or repr() to get that behaviour:
>> >>>
>> >>> >>> from decimal import Decimal as D
>> >>> >>> 1.3
>> >>> 1.3
>> >>> >>> D('1.3')
>> >>> Decimal('1.3')
>> >>> >>> D(1.3)
>> >>> Decimal('1.3000000000000000444089209850062616169452667236328125')
>> >>> >>> D(str(1.3))
>> >>> Decimal('1.3')
>> >>>
>> >>> Cheers,
>> >>> Nick.
>> >>>
>> >>> [1] http://bugs.python.org/issue1580
>> >> _______________________________________________
>> >> Python-ideas mailing list
>> >> Python-ideas at python.org
>> >> https://mail.python.org/mailman/listinfo/python-ideas
>> >> Code of Conduct: http://python.org/psf/codeofconduct/
>> 
>> 
>> ------------------------------
>> 
>> Message: 3
>> Date: Tue, 2 Jun 2015 13:00:40 +1000
>> From: Steven D'Aprano <steve at pearwood.info>
>> To: python-ideas at python.org
>> Subject: Re: [Python-ideas] Python Float Update
>> Message-ID: <20150602030040.GF932 at ando.pearwood.info>
>> Content-Type: text/plain; charset=utf-8
>> 
>> Nicholas,
>> 
>> Your email client appears to not be quoting text you quote. It is
>> conventional to use a leading > for quoting; perhaps you could configure
>> your mail program to do so? The good ones even have a "Paste As Quote"
>> command.
>> 
>> On with the substance of your post...
>> 
>> On Mon, Jun 01, 2015 at 01:24:32PM -0400, Nicholas Chammas wrote:
>> 
>> > I guess it's a non-trivial tradeoff. But I would lean towards considering
>> > people likely to be affected by the performance hit as doing something "not
>> > common". Like, if they are doing that many calculations that it matters,
>> > perhaps it makes sense to ask them to explicitly ask for floats vs.
>> > decimals, in exchange for giving the majority who wouldn't notice a
>> > performance difference a better user experience.
>> 
>> Changing from binary floats to decimal floats by default is a big,
>> backwards incompatible change. Even if it's a good idea, we're
>> constrained by backwards compatibility: I would imagine we wouldn't want
>> to even introduce this feature until the majority of people are using
>> Python 3 rather than Python 2, and then we'd probably want to introduce
>> it using a "from __future__ import decimal_floats" directive.
>> 
>> So I would guess this couldn't happen until probably 2020 or so.
>> 
>> But we could introduce a decimal literal, say 1.1d for Decimal("1.1").
>> The first prerequisite is that we have a fast Decimal implementation,
>> which we now have. Next we would have to decide how the decimal literals
>> would interact with the decimal module. Do we include full support of
>> the entire range of decimal features, including globally configurable
>> precision and other modes? Or just a subset? How will these decimals
>> interact with other numeric types, like float and Fraction? At the
>> moment, Decimal isn't even part of the numeric tower.
>> 
>> There's a lot of ground to cover, it's not a trivial change, and will
>> definitely need a PEP.
>> 
>> 
>> > How many of your examples are inherent limitations of decimals vs. problems
>> > that can be improved upon?
>> 
>> In one sense, they are inherent limitations of floating point numbers
>> regardless of base. Whether binary, decimal, hexadecimal as used in some
>> IBM computers, or something else, you're going to see the same problems.
>> Only the specific details will vary, e.g. 1/3 cannot be represented
>> exactly in base 2 or base 10, but if you constructed a base 3 float, it
>> would be exact.
>> 
>> In another sense, Decimal has a big advantage that it is much more
>> configurable than Python's floats. Decimal lets you configure the
>> precision, rounding mode, error handling and more. That's not inherent
>> to base 10 calculations, you can do exactly the same thing for binary
>> floats too, but Python doesn't offer that feature for floats, only for
>> Decimals.
>> 
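A short sketch of what that configurability looks like in practice, using the decimal module's context API (localcontext, prec, rounding and traps are all part of the standard library; the variable names are just for illustration):

```python
from decimal import (Decimal, Inexact, ROUND_DOWN, ROUND_HALF_EVEN,
                     localcontext)

with localcontext() as ctx:
    # Precision: five significant digits instead of the default 28.
    ctx.prec = 5
    ctx.rounding = ROUND_HALF_EVEN
    third = Decimal(1) / Decimal(3)             # 0.33333
    two_thirds_even = Decimal(2) / Decimal(3)   # 0.66667

    # Rounding mode: truncate instead of rounding to nearest.
    ctx.rounding = ROUND_DOWN
    two_thirds_down = Decimal(2) / Decimal(3)   # 0.66666

    # Error handling: turn inexact results into exceptions.
    ctx.traps[Inexact] = True
    try:
        Decimal(1) / Decimal(3)
        trapped = False
    except Inexact:
        trapped = True

print(third, two_thirds_even, two_thirds_down, trapped)
```
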
>> But no matter how you configure Decimal, all you can do is shift the
>> gotchas around. The issue really is inherent to the nature of the
>> problem, and you cannot defeat the universe. Regardless of what
>> base you use, binary or decimal or something else, or how many digits
>> precision, you're still trying to simulate an uncountably infinite
>> continuous, infinitely divisible number line using a finite,
>> discontinuous set of possible values. Something has to give.
>> 
>> (For the record, when I say "uncountably infinite", I don't just mean
>> "too many to count", it's a technical term. To oversimplify horribly, it
>> means "larger than infinity" in some sense. It's off-topic for here,
>> but if anyone is interested in learning more, you can email me off-list,
>> or google for "countable vs uncountable infinity".)
>> 
>> Basically, you're trying to squeeze an infinite number of real numbers
>> into a finite amount of memory. It can't be done. Consequently, there
>> will *always* be some calculations where the true value simply cannot be
>> calculated and the answer you get is slightly too big or slightly too
>> small. All the other floating point gotchas follow from that simple
>> fact.
>> 
>> 
>> > Admittedly, the only place where I've played with decimals extensively is
>> > on Microsoft's SQL Server (where they are the default literal
>> > <https://msdn.microsoft.com/en-us/library/ms179899.aspx>). I've stumbled in
>> > the past on my own decimal gotchas
>> > <http://dba.stackexchange.com/q/18997/2660>, but looking at your examples
>> > and trying them on SQL Server I suspect that most of the problems you show
>> > are problems of precision and scale.
>> 
>> No. Change the precision and scale, and some *specific* problems go
>> away, but they reappear with other numbers.
>> 
>> Besides, at the point that you're talking about setting the precision,
>> we're really not talking about making things easy for beginners any
>> more.
>> 
>> And not all floating point issues are related to precision and scale in
>> decimal. You cannot divide a cake into exactly three equal pieces in
>> Decimal any more than you can divide a cake into exactly three equal
>> pieces in binary. All you can hope for is to choose a precision where the
>> rounding errors in one part of your calculation will be cancelled by the
>> rounding errors in another part of your calculation. And that precision
>> will be different for any two arbitrary calculations.
>> 
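The cake example is easy to try. In the sketch below (thirds_round_trip is a name made up for illustration), dividing by three and multiplying back never returns exactly 1 in Decimal at any precision, while the binary float happens to round-trip only because its two rounding errors cancel. An exact rational type has no such problem.

```python
from decimal import Decimal, localcontext
from fractions import Fraction


def thirds_round_trip(prec):
    """Compute 1/3*3 in Decimal with the given number of digits."""
    with localcontext() as ctx:
        ctx.prec = prec
        return Decimal(1) / Decimal(3) * 3


print(thirds_round_trip(5))      # 0.99999
print(thirds_round_trip(28))     # 28 nines, still not 1
print(1 / 3 * 3 == 1.0)          # True, by cancellation of rounding errors
print(Fraction(1, 3) * 3 == 1)   # True, exact rational arithmetic
```
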
>> 
>> 
>> --
>> Steve
>> 
>> 
>> ------------------------------
>> 
>> Message: 4
>> Date: Mon, 1 Jun 2015 20:10:29 -0700
>> From: Andrew Barnert <abarnert at yahoo.com>
>> To: Steven D'Aprano <steve at pearwood.info>
>> Cc: "python-ideas at python.org" <python-ideas at python.org>
>> Subject: Re: [Python-ideas] Python Float Update
>> Message-ID: <79C16144-8BF7-4260-A356-DD4E8D97BAAD at yahoo.com>
>> Content-Type: text/plain;       charset=us-ascii
>> 
>> On Jun 1, 2015, at 18:58, Steven D'Aprano <steve at pearwood.info> wrote:
>> >
>> >> On Tue, Jun 02, 2015 at 10:08:37AM +1000, Nick Coghlan wrote:
>> >>
>> >> It seems to me that a potentially better option might be to adjust the
>> >> implicit float->Decimal conversion in the Decimal constructor to use
>> >> the same algorithm as we now use for float.__repr__ [1], where we look
>> >> for the shortest decimal representation that gives the same answer
>> >> when rendered as a float. At the moment you have to indirect through
>> >> str() or repr() to get that behaviour:
>> >
>> > Apart from the questions of whether such a change would be allowed by
>> > the Decimal specification,
>> 
>> As far as I know, GDAS (the General Decimal Arithmetic Specification) doesn't specify anything about implicit conversion from floats, as long as the required explicit conversion function (which I think is from_float?) exists and does the required thing.
>> 
>> As a side note, has anyone considered whether it's worth switching to IEEE-754-2008 as the controlling specification? There may be a good reason not to do so; I'm just curious whether someone has thought it through and made the case.
>> 
>> > and the breaking of backwards compatibility,
>> > I would really hate that change for another reason.
>> >
>> > At the moment, a good, cheap way to find out what a binary float "really
>> > is" (in some sense) is to convert it to Decimal and see what you get:
>> >
>> > Decimal(1.3)
>> > -> Decimal('1.3000000000000000444089209850062616169452667236328125')
>> >
>> > If you want conversion from repr, then you can be explicit about it:
>> >
>> > Decimal(repr(1.3))
>> > -> Decimal('1.3')
>> >
>> > ("Explicit is better than implicit", as they say...)
>> >
>> > Although in fairness I suppose that if this change happens, we could
>> > keep the old behaviour in the from_float method:
>> >
>> > # hypothetical future behaviour
>> > Decimal(1.3)
>> > -> Decimal('1.3')
>> > Decimal.from_float(1.3)
>> > -> Decimal('1.3000000000000000444089209850062616169452667236328125')
>> >
>> > But all things considered, I don't think we're doing people any favours
>> > by changing the behaviour of float->Decimal conversions to implicitly
>> > use the repr() instead of being exact. I expect this strategy is like
>> > trying to flatten a bubble under wallpaper: all you can do is push the
>> > gotchas and surprises to somewhere else.
>> >
>> > Oh, another thought... Decimals could gain yet another conversion
>> > method, one which implicitly uses the float repr, but signals if it was
>> > an inexact conversion or not. Explicitly calling repr can never signal,
>> > since the conversion occurs outside of the Decimal constructor and
>> > Decimal sees only the string:
>> >
>> > Decimal(repr(1.3)) cannot signal Inexact.
>> >
>> > But:
>> >
>> > Decimal.from_nearest_float(1.5)  # exact
>> > Decimal.from_nearest_float(1.3)  # signals Inexact
>> >
>> > That might be useful, but probably not to beginners.
>> 
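Here is one way such a method could behave, written as a plain function so nobody has to patch decimal to try it. This is only a sketch under my own assumptions: from_nearest_float does not exist in the decimal module, and signalling by setting the context flag and honouring the Inexact trap by hand is my guess at the intended semantics.

```python
import math
from decimal import Decimal, Inexact, getcontext, localcontext


def from_nearest_float(x):
    """Hypothetical: convert via repr(x), signalling Inexact if that
    differs from the float's exact binary value."""
    shortest = Decimal(repr(x))
    # NaN and infinities have no "nearest decimal" question to answer.
    if math.isfinite(x) and shortest != Decimal(x):
        ctx = getcontext()
        ctx.flags[Inexact] = True          # record the signal
        if ctx.traps[Inexact]:             # raise only if it is trapped
            raise Inexact(f"{x!r} is not exactly {shortest}")
    return shortest


with localcontext() as ctx:
    ctx.clear_flags()
    print(from_nearest_float(1.5), ctx.flags[Inexact])   # 1.5 False
    print(from_nearest_float(1.3), ctx.flags[Inexact])   # 1.3 True
```
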
>> I think this might be worth having whether the default constructor is changed or not.
>> 
>> I can't think of too many programs where I'm pretty sure I have an exactly-representable decimal as a float but want to check to be sure... but for interactive use in IPython (especially when I'm specifically trying to explain to someone why just using Decimal instead of float will/will not solve their problem) I could see using it.
>> 
>> ------------------------------
>> 
>> End of Python-ideas Digest, Vol 103, Issue 15
>> *********************************************
> 
> 
> 
> -- 
> -Surya Subbarao