[Python-ideas] Python-ideas Digest, Vol 103, Issue 15
u8y7541 The Awesome Person
surya.subbarao1 at gmail.com
Wed Jun 3 00:28:33 CEST 2015
What do you mean by replying inline?
On Mon, Jun 1, 2015 at 10:22 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
> On Jun 1, 2015, at 20:41, u8y7541 The Awesome Person
> <surya.subbarao1 at gmail.com> wrote:
>
> I think you're right. I was also considering ... "editing" my Python
> distribution. If they didn't implement my suggestion for correcting floats,
> at least they can fix this, instead of making people hack Python for good
> results!
>
>
> If you're going to reply to digests, please learn how to reply inline
> instead of top-posting (and how to trim out all the irrelevant stuff). It's
> next to impossible to tell which part of which of the messages you're
> replying to even in simple cases like this one, with only 4 messages in the
> digest.
>
> On Mon, Jun 1, 2015 at 8:10 PM, <python-ideas-request at python.org> wrote:
>>
>> Send Python-ideas mailing list submissions to
>> python-ideas at python.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> https://mail.python.org/mailman/listinfo/python-ideas
>> or, via email, send a message with subject or body 'help' to
>> python-ideas-request at python.org
>>
>> You can reach the person managing the list at
>> python-ideas-owner at python.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Python-ideas digest..."
>>
>>
>> Today's Topics:
>>
>> 1. Re: Python Float Update (Steven D'Aprano)
>> 2. Re: Python Float Update (Andrew Barnert)
>> 3. Re: Python Float Update (Steven D'Aprano)
>> 4. Re: Python Float Update (Andrew Barnert)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Tue, 2 Jun 2015 11:37:48 +1000
>> From: Steven D'Aprano <steve at pearwood.info>
>> To: python-ideas at python.org
>> Subject: Re: [Python-ideas] Python Float Update
>> Message-ID: <20150602013748.GD932 at ando.pearwood.info>
>> Content-Type: text/plain; charset=us-ascii
>>
>> On Mon, Jun 01, 2015 at 05:52:35PM +0300, Joonas Liik wrote:
>>
>> > Having some sort of decimal literal would have some advantages of its
>> > own,
>> > for one it could help against this silliness:
>> >
>> > >>> Decimal(1.3)
>> > Decimal('1.3000000000000000444089209850062616169452667236328125')
>>
>> Why is that silly? That's the actual value of the binary float 1.3
>> converted into base 10. If you want 1.3 exactly, you can do this:
>>
>> > >>> Decimal('1.3')
>> > Decimal('1.3')
>>
>> Is that really so hard for people to learn?
>>
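The difference between the two constructor calls above can be checked directly. This is a minimal sketch using only the standard decimal module; the variable names are illustrative and not from the thread.

```python
from decimal import Decimal

# Passing a float converts the stored binary value exactly, digit for digit.
from_float = Decimal(1.3)

# Passing a string parses the decimal digits as written.
from_string = Decimal('1.3')

print(from_float)                 # 1.3000000000000000444089209850062616169452667236328125
print(from_string)                # 1.3
print(from_float == from_string)  # False
```

Neither result is wrong: the two calls are given different inputs, a binary float in one case and a decimal string in the other.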
>>
>> > I'm not saying that the actual data type needs to be a decimal (
>> > might well be a float but say shove the string repr next to it so it can
>> > be
>> > accessed when needed)
>>
>> You want Decimals to *lie* about what value they have?
>>
>> I think that's a terrible idea, one which would lead to a whole set of
>> new and exciting surprises when using Decimal. Let me try to predict a
>> few of the questions on Stackoverflow which would follow this change...
>>
>> Why is equality so inaccurate in Python?
>>
>> py> x = Decimal(1.3)
>> py> y = Decimal('1.3')
>> py> x, y
>> (Decimal('1.3'), Decimal('1.3'))
>> py> x == y
>> False
>>
>> Why does Python insert extra digits into numbers when I multiply?
>>
>> py> x = Decimal(1.3)
>> py> x
>> Decimal('1.3')
>> py> y = 10000000000000000*x
>> py> y - 13000000000000000
>> Decimal('0.444089209850062616169452667236328125')
>>
>>
>> > ..but this is one really common pitfall for new users. I know it's easy
>> > to
>> > fix the code above,
>> > but this behavior is very unintuitive: you essentially get a really
>> > expensive float when you do the obvious thing.
>>
>> Then don't do the obvious thing.
>>
>> Sometimes there really is no good alternative to actually knowing what
>> you are doing. Floating point maths is inherently hard, but that's not
>> the problem. There are all sorts of things in programming which are
>> hard, and people learn how to deal with them. The problem is that people
>> *imagine* that floating point is simple, when it is not and can never
>> be. We don't do them any favours by enabling that delusion.
>>
>> If your needs are light, then you can ignore the complexities of
>> floating point. You really can go a very long way by just rounding the
>> results of your calculations when displaying them. But for anything more
>> than that, we cannot just paper over the floating point complexities
>> without creating new complexities that will burn people.
>>
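As an illustration of the "just round when displaying" strategy for light needs, here is a small sketch. The values 0.1 and 0.2 are a stock example chosen for illustration, not numbers from the thread.

```python
# The sum is computed in binary floating point and is not exactly 0.3.
total = 0.1 + 0.2

print(total == 0.3)            # False
print(repr(total))             # 0.30000000000000004

# Rounding only at display time hides the noise without changing the maths.
print(f"{total:.2f}")          # 0.30
print(round(total, 2) == 0.3)  # True
```

This only changes what is shown or compared after rounding; the stored value of total still carries the binary rounding error.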
>> You don't have to become a floating point guru, but it really isn't
>> onerous to expect people who are programming to learn a few basic
>> programming skills, and that includes a few basic coping strategies for
>> floating point.
>>
>>
>>
>> --
>> Steve
>>
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Mon, 1 Jun 2015 19:21:47 -0700
>> From: Andrew Barnert <abarnert at yahoo.com>
>> To: Andrew Barnert <abarnert at yahoo.com>
>> Cc: Nick Coghlan <ncoghlan at gmail.com>, python-ideas
>> <python-ideas at python.org>
>> Subject: Re: [Python-ideas] Python Float Update
>> Message-ID: <5E8271BF-183E-496D-A556-81C407977FFE at yahoo.com>
>> Content-Type: text/plain; charset=us-ascii
>>
>> On Jun 1, 2015, at 19:00, Andrew Barnert <abarnert at yahoo.com> wrote:
>> >
>> >> On Jun 1, 2015, at 18:27, Andrew Barnert via Python-ideas
>> >> <python-ideas at python.org> wrote:
>> >>
>> >>> On Jun 1, 2015, at 17:08, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> >>>
>> >>> On 2 Jun 2015 08:44, "Andrew Barnert via Python-ideas"
>> >>> <python-ideas at python.org> wrote:
>> >>>> But the basic idea can be extracted out and Pythonified:
>> >>>>
>> >>>> The literal 1.23 no longer gives you a float, but a FloatLiteral,
>> >>>> which is either a subclass of float, or an unrelated class that has a
>> >>>> __float__ method. Doing any calculation on it gives you a float. But as long
>> >>>> as you leave it alone as a FloatLiteral, it has its literal characters
>> >>>> available for any function that wants to distinguish FloatLiteral from
>> >>>> float, like the Decimal constructor.
>> >>>>
>> >>>> The problem that Python faces that Swift doesn't is that Python
>> >>>> doesn't use static typing and implicit compile-time conversions. So in
>> >>>> Python, you'd be passing around these larger values and doing the slow
>> >>>> conversions at runtime. That may or may not be unacceptable; without
>> >>>> actually building it and testing some realistic programs it's pretty hard to
>> >>>> guess.
>> >>>
>> >>> Joonas's suggestion of storing the original text representation passed
>> >>> to the float constructor is at least a novel one - it's only the idea
>> >>> of actual decimal literals that was ruled out in the past.
>> >>
>> >> I actually built about half an implementation of something like Swift's
>> >> LiteralConvertible protocol back when I was teaching myself Swift. But I
>> >> think I have a simpler version that I could implement much more easily.
>> >>
>> >> Basically, FloatLiteral is just a subclass of float whose __new__
>> >> stores its constructor argument. Then decimal.Decimal checks for that stored
>> >> string and uses it instead of the float value if present. Then there's an
>> >> import hook that replaces every Num with a call to FloatLiteral.
>> >>
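The design described above could look roughly like the following. This is only a sketch under stated assumptions: it is not the code in the linked repository, the attribute name literal and the helper to_decimal are hypothetical, and a helper function stands in for the change to decimal.Decimal itself.

```python
from decimal import Decimal


class FloatLiteral(float):
    """A float that remembers the text it was constructed from."""

    def __new__(cls, text):
        self = super().__new__(cls, text)
        # Hypothetical attribute name: keep the original characters around.
        self.literal = str(text)
        return self


def to_decimal(value):
    """Stand-in for a Decimal constructor that prefers the stored text."""
    literal = getattr(value, "literal", None)
    if literal is not None:
        return Decimal(literal)
    return Decimal(value)


x = FloatLiteral('1.3')
print(to_decimal(x))      # 1.3
print(to_decimal(1.3))    # 1.3000000000000000444089209850062616169452667236328125
print(type(x * 2))        # <class 'float'>
```

Any arithmetic on a FloatLiteral falls back to an ordinary float, so the stored text is only available while the value is left untouched.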
>> >> This design doesn't actually fix everything; in effect, 1.3 actually
>> >> compiles to FloatLiteral(str(float('1.3'))) (because by the time you get to
>> >> the AST it's too late to avoid that first conversion). Which does actually
>> >> solve the problem with 1.3, but doesn't solve everything in general (e.g.,
>> >> just feed in a number that has more precision than a double can hold but
>> >> less than your current decimal context can...).
>> >>
>> >> But it just lets you test whether the implementation makes sense and
>> >> what the performance effects are, and it's only an hour of work,
>> >
>> > Make that 15 minutes.
>> >
>> > https://github.com/abarnert/floatliteralhack
>>
>> And as it turns out, hacking the tokens is no harder than hacking the AST
>> (in fact, it's a little easier; I'd just never done it before), so now it
>> does that, meaning you really get the actual literal string from the source,
>> not the repr of the float of that string literal.
>>
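For readers curious what "hacking the tokens" might involve, here is a sketch built on the standard tokenize module. It is an assumption about the general approach, not the code in the linked repository; rewrite_float_literals, looks_like_float and the stub FloatLiteral class are hypothetical names.

```python
import io
import tokenize


class FloatLiteral(float):
    """Stub: a float that keeps the source text it came from."""

    def __new__(cls, text):
        self = super().__new__(cls, text)
        self.literal = text
        return self


def looks_like_float(text):
    """Rough test for a float literal (not int, hex, octal, binary or complex)."""
    lowered = text.lower()
    if lowered.startswith(("0x", "0b", "0o")) or lowered.endswith("j"):
        return False
    return "." in lowered or "e" in lowered


def rewrite_float_literals(source):
    """Replace each float literal token with a FloatLiteral('...') call."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NUMBER and looks_like_float(tok.string):
            result.extend([
                (tokenize.NAME, "FloatLiteral"),
                (tokenize.OP, "("),
                (tokenize.STRING, repr(tok.string)),
                (tokenize.OP, ")"),
            ])
        else:
            result.append((tok.type, tok.string))
    return tokenize.untokenize(result)


namespace = {"FloatLiteral": FloatLiteral}
exec(rewrite_float_literals("x = 1.30\ny = 2\n"), namespace)
print(namespace["x"].literal)   # 1.30
```

Because the rewrite works on the token text, the trailing zero in 1.30 survives, which it would not if the literal had already been converted to a float. A real version would still need an import hook to apply the rewrite to modules as they load.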
>> Turning this into a real implementation would obviously be more than half
>> an hour's work, but not more than a day or two. Again, I don't think anyone
>> would actually want this, but now people who think they do have an
>> implementation to play with to prove me wrong.
>>
>> >> and doesn't require anyone to patch their interpreter to play with it.
>> >> If it seems promising, then hacking the compiler so 2.3 compiles to
>> >> FloatLiteral('2.3') may be worth doing for a test of the actual
>> >> functionality.
>> >>
>> >> I'll be glad to hack it up when I get a chance tonight. But personally,
>> >> I think decimal literals are a better way to go here. Decimal(1.20)
>> >> magically doing what you want still has all the same downsides as 1.20d (or
>> >> implicit decimal literals), plus it's more complex, adds performance costs,
>> >> and doesn't provide nearly as much benefit. (Yes, Decimal(1.20) is a little
>> >> nicer than Decimal('1.20'), but only a little--and nowhere near as nice as
>> >> 1.20d).
>> >>
>> >>> Aside from the practical implementation question, the main concern I
>> >>> have with it is that we'd be trading the status quo for a situation
>> >>> where "Decimal(1.3)" and "Decimal(13/10)" gave different answers.
>> >>
>> >> Yes, to solve that you really need Decimal(13)/Decimal(10)... Which
>> >> implies that maybe the simplification in Decimal(1.3) is more misleading
>> >> than helpful. (Notice that this problem also doesn't arise for decimal
>> >> literals--13/10d is int vs. Decimal division, which is correct out of the
>> >> box. Or, if you want prefixes, d13/10 is Decimal vs. int division.)
>> >>
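The point about 13/10 can be checked with today's decimal module. A minimal sketch follows; the variable names are illustrative.

```python
from decimal import Decimal

# 13 / 10 is evaluated as a binary float before Decimal ever sees it.
via_float_division = Decimal(13 / 10)

# Here the division itself is carried out in decimal arithmetic.
via_decimal_division = Decimal(13) / Decimal(10)

# Mixed int / Decimal division also stays in decimal.
via_mixed_division = 13 / Decimal(10)

print(via_float_division == Decimal('1.3'))    # False
print(via_decimal_division == Decimal('1.3'))  # True
print(via_mixed_division == Decimal('1.3'))    # True
```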
>> >>> It seems to me that a potentially better option might be to adjust the
>> >>> implicit float->Decimal conversion in the Decimal constructor to use
>> >>> the same algorithm as we now use for float.__repr__ [1], where we look
>> >>> for the shortest decimal representation that gives the same answer
>> >>> when rendered as a float. At the moment you have to indirect through
>> >>> str() or repr() to get that behaviour:
>> >>>
>> >>>>>> from decimal import Decimal as D
>> >>>>>> 1.3
>> >>> 1.3
>> >>>>>> D('1.3')
>> >>> Decimal('1.3')
>> >>>>>> D(1.3)
>> >>> Decimal('1.3000000000000000444089209850062616169452667236328125')
>> >>>>>> D(str(1.3))
>> >>> Decimal('1.3')
>> >>>
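The shortest-repr property mentioned above can be demonstrated as follows. This sketch assumes a Python version where float repr uses the shortest round-tripping string (3.1 or later); the variable names are illustrative.

```python
from decimal import Decimal

x = 1.3
shortest = repr(x)

print(shortest)                         # 1.3
# The short string converts back to exactly the same float.
print(float(shortest) == x)             # True
# Going through repr gives the "nice" Decimal, not the exact binary value.
print(Decimal(shortest))                # 1.3
print(Decimal(x) == Decimal(shortest))  # False
```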
>> >>> Cheers,
>> >>> Nick.
>> >>>
>> >>> [1] http://bugs.python.org/issue1580
>> >> _______________________________________________
>> >> Python-ideas mailing list
>> >> Python-ideas at python.org
>> >> https://mail.python.org/mailman/listinfo/python-ideas
>> >> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Tue, 2 Jun 2015 13:00:40 +1000
>> From: Steven D'Aprano <steve at pearwood.info>
>> To: python-ideas at python.org
>> Subject: Re: [Python-ideas] Python Float Update
>> Message-ID: <20150602030040.GF932 at ando.pearwood.info>
>> Content-Type: text/plain; charset=utf-8
>>
>> Nicholas,
>>
>> Your email client appears to not be quoting text you quote. It is
>> conventional to use a leading > for quoting; perhaps you could configure
>> your mail program to do so? The good ones even have a "Paste As Quote"
>> command.
>>
>> On with the substance of your post...
>>
>> On Mon, Jun 01, 2015 at 01:24:32PM -0400, Nicholas Chammas wrote:
>>
>> > I guess it's a non-trivial tradeoff. But I would lean towards
>> > considering
>> > people likely to be affected by the performance hit as doing something
>> > "not
>> > common". Like, if they are doing that many calculations that it matters,
>> > perhaps it makes sense to ask them to explicitly ask for floats vs.
>> > decimals, in exchange for giving the majority who wouldn't notice a
>> > performance difference a better user experience.
>>
>> Changing from binary floats to decimal floats by default is a big,
>> backwards incompatible change. Even if it's a good idea, we're
>> constrained by backwards compatibility: I would imagine we wouldn't want
>> to even introduce this feature until the majority of people are using
>> Python 3 rather than Python 2, and then we'd probably want to introduce
>> it using a "from __future__ import decimal_floats" directive.
>>
>> So I would guess this couldn't happen until probably 2020 or so.
>>
>> But we could introduce a decimal literal, say 1.1d for Decimal("1.1").
>> The first prerequisite is that we have a fast Decimal implementation,
>> which we now have. Next we would have to decide how the decimal literals
>> would interact with the decimal module. Do we include full support of
>> the entire range of decimal features, including globally configurable
>> precision and other modes? Or just a subset? How will these decimals
>> interact with other numeric types, like float and Fraction? At the
>> moment, Decimal isn't even part of the numeric tower.
>>
>> There's a lot of ground to cover, it's not a trivial change, and will
>> definitely need a PEP.
>>
>>
>> > How many of your examples are inherent limitations of decimals vs.
>> > problems
>> > that can be improved upon?
>>
>> In one sense, they are inherent limitations of floating point numbers
>> regardless of base. Whether binary, decimal, hexadecimal as used in some
>> IBM computers, or something else, you're going to see the same problems.
>> Only the specific details will vary, e.g. 1/3 cannot be represented
>> exactly in base 2 or base 10, but if you constructed a base 3 float, it
>> would be exact.
>>
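A quick way to see the 1/3 example in decimal arithmetic is sketched below. Python has no base 3 float, so fractions.Fraction is used here as a stand-in for an exact representation; that substitution is an assumption of this sketch, not something from the thread.

```python
from decimal import Decimal, localcontext
from fractions import Fraction

with localcontext() as ctx:
    ctx.prec = 28                  # the decimal module's usual default precision
    third = Decimal(1) / Decimal(3)
    back = third * 3

print(third)       # 0.3333333333333333333333333333
print(back == 1)   # False: the rounding error in 1/3 is still there

# An exact rational type has no such problem with thirds.
print(Fraction(1, 3) * 3 == 1)   # True
```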
>> In another sense, Decimal has a big advantage that it is much more
>> configurable than Python's floats. Decimal lets you configure the
>> precision, rounding mode, error handling and more. That's not inherent
>> to base 10 calculations, you can do exactly the same thing for binary
>> floats too, but Python doesn't offer that feature for floats, only for
>> Decimals.
>>
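The configurability described above (precision, rounding mode, error handling) can be sketched with decimal.localcontext. The choice of 1/7 and a precision of 5 is purely illustrative.

```python
from decimal import Decimal, Inexact, ROUND_DOWN, ROUND_HALF_EVEN, localcontext

with localcontext() as ctx:
    # Precision: keep only 5 significant digits.
    ctx.prec = 5
    ctx.rounding = ROUND_HALF_EVEN
    half_even = Decimal(1) / Decimal(7)

    # Rounding mode: truncate instead of rounding to nearest.
    ctx.rounding = ROUND_DOWN
    truncated = Decimal(1) / Decimal(7)

    # Error handling: turn inexact results into exceptions.
    ctx.traps[Inexact] = True
    try:
        Decimal(1) / Decimal(7)
        trapped = False
    except Inexact:
        trapped = True

print(half_even)   # 0.14286
print(truncated)   # 0.14285
print(trapped)     # True
```

Python's built-in float offers no equivalent of these context settings.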
>> But no matter how you configure Decimal, all you can do is shift the
>> gotchas around. The issue really is inherent to the nature of the
>> problem, and you cannot defeat the universe. Regardless of what
>> base you use, binary or decimal or something else, or how many digits
>> precision, you're still trying to simulate an uncountably infinite
>> continuous, infinitely divisible number line using a finite,
>> discontinuous set of possible values. Something has to give.
>>
>> (For the record, when I say "uncountably infinite", I don't just mean
>> "too many to count", it's a technical term. To oversimplify horribly, it
>> means "larger than infinity" in some sense. It's off-topic for here,
>> but if anyone is interested in learning more, you can email me off-list,
>> or google for "countable vs uncountable infinity".)
>>
>> Basically, you're trying to squeeze an infinite number of real numbers
>> into a finite amount of memory. It can't be done. Consequently, there
>> will *always* be some calculations where the true value simply cannot be
>> calculated and the answer you get is slightly too big or slightly too
>> small. All the other floating point gotchas follow from that simple
>> fact.
>>
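The "finite set of possible values" can be made concrete by looking at the gap between neighbouring floats. This sketch assumes IEEE 754 double precision floats and Python 3.9 or later, where math.ulp and math.nextafter are available.

```python
import math

# Distance from 1.0 to the next representable float above it.
gap = math.ulp(1.0)

print(gap == 2.0 ** -52)                      # True
print(math.nextafter(1.0, 2.0) == 1.0 + gap)  # True

# Anything that lands well inside the gap is rounded back to a neighbour.
print(1.0 + gap / 4 == 1.0)                   # True
```

Every real number strictly between 1.0 and 1.0 + gap has to be stored as one of those two neighbours, which is where the "slightly too big or slightly too small" answers come from.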
>>
>> > Admittedly, the only place where I've played with decimals extensively
>> > is
>> > on Microsoft's SQL Server (where they are the default literal
>> > <https://msdn.microsoft.com/en-us/library/ms179899.aspx>). I've stumbled
>> > in
>> > the past on my own decimal gotchas
>> > <http://dba.stackexchange.com/q/18997/2660>, but looking at your
>> > examples
>> > and trying them on SQL Server I suspect that most of the problems you
>> > show
>> > are problems of precision and scale.
>>
>> No. Change the precision and scale, and some *specific* problems go
>> away, but they reappear with other numbers.
>>
>> Besides, at the point that you're talking about setting the precision,
>> we're really not talking about making things easy for beginners any
>> more.
>>
>> And not all floating point issues are related to precision and scale in
>> decimal. You cannot divide a cake into exactly three equal pieces in
>> Decimal any more than you can divide a cake into exactly three equal
>> pieces in binary. All you can hope for is to choose a precision where the
>> rounding errors in one part of your calculation will be cancelled by the
>> rounding errors in another part of your calculation. And that precision
>> will be different for any two arbitrary calculations.
>>
>>
>>
>> --
>> Steve
>>
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Mon, 1 Jun 2015 20:10:29 -0700
>> From: Andrew Barnert <abarnert at yahoo.com>
>> To: Steven D'Aprano <steve at pearwood.info>
>> Cc: "python-ideas at python.org" <python-ideas at python.org>
>> Subject: Re: [Python-ideas] Python Float Update
>> Message-ID: <79C16144-8BF7-4260-A356-DD4E8D97BAAD at yahoo.com>
>> Content-Type: text/plain; charset=us-ascii
>>
>> On Jun 1, 2015, at 18:58, Steven D'Aprano <steve at pearwood.info> wrote:
>> >
>> >> On Tue, Jun 02, 2015 at 10:08:37AM +1000, Nick Coghlan wrote:
>> >>
>> >> It seems to me that a potentially better option might be to adjust the
>> >> implicit float->Decimal conversion in the Decimal constructor to use
>> >> the same algorithm as we now use for float.__repr__ [1], where we look
>> >> for the shortest decimal representation that gives the same answer
>> >> when rendered as a float. At the moment you have to indirect through
>> >> str() or repr() to get that behaviour:
>> >
>> > Apart from the questions of whether such a change would be allowed by
>> > the Decimal specification,
>>
>> As far as I know, GDAS doesn't specify anything about implicit conversion
>> from floats, as long as the required explicit conversion function (which I
>> think is from_float?) exists and does the required thing.
>>
>> As a side note, has anyone considered whether it's worth switching to
>> IEEE-754-2008 as the controlling specification? There may be a good reason
>> not to do so; I'm just curious whether someone has thought it through and
>> made the case.
>>
>> > and the breaking of backwards compatibility,
>> > I would really hate that change for another reason.
>> >
>> > At the moment, a good, cheap way to find out what a binary float "really
>> > is" (in some sense) is to convert it to Decimal and see what you get:
>> >
>> > Decimal(1.3)
>> > -> Decimal('1.3000000000000000444089209850062616169452667236328125')
>> >
>> > If you want conversion from repr, then you can be explicit about it:
>> >
>> > Decimal(repr(1.3))
>> > -> Decimal('1.3')
>> >
>> > ("Explicit is better than implicit", as they say...)
>> >
>> > Although in fairness I suppose that if this change happens, we could
>> > keep the old behaviour in the from_float method:
>> >
>> > # hypothetical future behaviour
>> > Decimal(1.3)
>> > -> Decimal('1.3')
>> > Decimal.from_float(1.3)
>> > -> Decimal('1.3000000000000000444089209850062616169452667236328125')
>> >
>> > But all things considered, I don't think we're doing people any favours
>> > by changing the behaviour of float->Decimal conversions to implicitly
>> > use the repr() instead of being exact. I expect this strategy is like
>> > trying to flatten a bubble under wallpaper: all you can do is push the
>> > gotchas and surprises to somewhere else.
>> >
>> > Oh, another thought... Decimals could gain yet another conversion
>> > method, one which implicitly uses the float repr, but signals if it was
>> > an inexact conversion or not. Explicitly calling repr can never signal,
>> > since the conversion occurs outside of the Decimal constructor and
>> > Decimal sees only the string:
>> >
>> > Decimal(repr(1.3)) cannot signal Inexact.
>> >
>> > But:
>> >
>> > Decimal.from_nearest_float(1.5) # exact
>> > Decimal.from_nearest_float(1.3) # signals Inexact
>> >
>> > That might be useful, but probably not to beginners.
>>
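The from_nearest_float idea does not exist in the decimal module; the following is only a sketch of how it could behave. The function is hypothetical, it assumes a finite float, and it returns an exactness flag rather than raising the Inexact signal as proposed above. Decimal.from_float, by contrast, is a real method.

```python
from decimal import Decimal


def from_nearest_float(value):
    """Hypothetical helper: convert via repr and report whether that was exact.

    Assumes value is a finite float.
    """
    nearest = Decimal(repr(value))
    # Decimal.from_float gives the exact binary value for comparison.
    exact = nearest == Decimal.from_float(value)
    return nearest, exact


print(from_nearest_float(1.5))   # (Decimal('1.5'), True)
print(from_nearest_float(1.3))   # (Decimal('1.3'), False)
```

Returning a flag keeps the sketch independent of the current context's trap settings; a real method would more likely set or trap the Inexact signal.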
>> I think this might be worth having whether the default constructor is
>> changed or not.
>>
>> I can't think of too many programs where I'm pretty sure I have an
>> exactly-representable decimal as a float but want to check to be sure... but
>> for interactive use in IPython (especially when I'm specifically trying to
>> explain to someone why just using Decimal instead of float will/will not
>> solve their problem) I could see using it.
>>
>> ------------------------------
>>
>> End of Python-ideas Digest, Vol 103, Issue 15
>> *********************************************
>
>
>
>
> --
> -Surya Subbarao
--
-Surya Subbarao