Mailman 3 stupid floating point question... - Python-Dev


stupid floating point question...


Martin von Loewis

28 Sep 2000

3:37 p.m.

A *good* compiler won't collapse *any* fp expressions at compile-time, because doing so can change the 754 semantics at runtime (for example, the evaluation of 1./6 triggers the 754 "inexact" signal, and the compiler has no way to know whether the user is expecting that to happen at runtime, so a good compiler will leave it alone).

Of course, that doesn't say anything about what *most* compilers do. For example, gcc, on i586-pc-linux-gnu, compiles

    double foo(){ return (double)1/6; }

into

    .LC0:
            .long 0x55555555,0x3fc55555
    .text
            .align 4
    .globl foo
            .type foo,@function
    foo:
            fldl .LC0
            ret

when compiling with -fomit-frame-pointer -O2. That still doesn't say anything about what most compilers do - if there is interest, we could perform a comparative study on the subject :-)

The "would break 754" argument is pretty weak, IMO - gcc, for example, doesn't claim to comply to that standard.

Regards,
Martin


Tim Peters

28 Sep

7:59 p.m.

[Tim]

...

A *good* compiler won't collapse *any* fp expressions at compile-time ...

[Martin von Loewis]

...

Of course, that doesn't say anything about what *most* compilers do.

Doesn't matter in this case; I told /F not to worry about it having taken that all into account. Almost all C compilers do a piss-poor job of taking floating-point seriously, but it doesn't really matter for the purpose /F has in mind.

[an example of gcc precomputing the best possible result]

return (double)1/6; ... .long 0x55555555,0x3fc55555

No problem. If you set the HW rounding mode to +infinity during compilation, the first chunk there would end with a 6 instead. Would affect the tail end of the repr(), but not the str().
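The two .long words in the gcc output are the little-endian halves of the IEEE 754 double nearest to 1/6. A minimal Python 3 sketch to check that, and to look at the next double up (which is what rounding toward +infinity would produce here), follows. The helper names are made up for illustration, and the %.12g and %.17g formats are only stand-ins for the str() and repr() precision of the Python of that time:

```python
import struct

def double_words(x):
    """Return the (low, high) 32-bit words of x as a little-endian IEEE 754 double."""
    return struct.unpack("<II", struct.pack("<d", x))

def next_double_up(x):
    """Return the next representable double above a positive finite x."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return struct.unpack("<d", struct.pack("<Q", bits + 1))[0]

nearest = 1.0 / 6
rounded_up = next_double_up(nearest)

print([hex(w) for w in double_words(nearest)])     # low word ends in ...5
print([hex(w) for w in double_words(rounded_up)])  # low word ends in ...6

# 12 significant digits hide the difference; 17 expose it.
print("%.12g" % nearest, "%.12g" % rounded_up)
print("%.17g" % nearest, "%.17g" % rounded_up)
```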

...

... when compiling with -fomit-frame-pointer -O2. That still doesn't say anything about what most compilers do - if there is interest, we could perform a comparative study on the subject :-)

No need.

...

The "would break 754" argument is pretty weak, IMO - gcc, for example, doesn't claim to comply to that standard.

/F's question was about fp. 754 is the only hope he has for any x-platform consistency (C89 alone gives no hope at all, and no basis for answering his question). To the extent that a C compiler ignores 754, it makes x-platform fp consistency impossible (which, btw, Python inherits from C: we can't even manage to get string<->float working consistently across 100% 754-conforming platforms!). Whether that's a weak argument or not depends entirely on how important x-platform consistency is to a given app. In /F's specific case, a sloppy compiler is "good enough".

i'm-the-only-compiler-writer-i-ever-met-who-understood-fp<0.5-wink>-ly y'rs - tim

Fredrik Lundh

8:40 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

tim wrote:

...

...
Of course, that doesn't say anything about what *most* compilers do.

Doesn't matter in this case; I told /F not to worry about it having taken that all into account. Almost all C compilers do a piss-poor job of taking floating-point seriously, but it doesn't really matter for the purpose /F has in mind.

to make it clear for everyone: I'm planning to get rid of the last remaining switch statement in unicodectype.c ("numerical value"), and replace the doubles in there with rationals.

the problem here is that MAL's new test suite uses "str" on the return value from that function, and it would be a bit annoying if we ended up with a Unicode test that might fail on platforms with lousy floating point support...

:::

on the other hand, I'm not sure I think it's a really good idea to have "numeric" return a floating point value. consider this:

    >>> import unicodedata
    >>> unicodedata.numeric(u"\N{VULGAR FRACTION ONE THIRD}")
    0.33333333333333331

(the glyph looks like "1/3", and that's also what the numeric property field in the Unicode database says)

:::

if I had access to the time machine, I'd change it to:

    >>> unicodedata.numeric(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

...but maybe we can add an alternate API that returns the *exact* fraction (as a numerator/denominator tuple)?

    >>> unicodedata.numeric2(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

(hopefully, someone will come up with a better name)

</F>

The Ping of Death

8:35 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

On Thu, 28 Sep 2000, Fredrik Lundh wrote:

...

if I had access to the time machine, I'd change it to:

    >>> unicodedata.numeric(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

...but maybe we can add an alternate API that returns the *exact* fraction (as a numerator/denominator tuple)?

    >>> unicodedata.numeric2(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

(hopefully, someone will come up with a better name)

unicodedata.rational might be an obvious choice.

    >>> unicodedata.rational(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

-- ?!ng
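For illustration only: the standard unicodedata module has no rational function. A rough Python 3 sketch of what such a helper could return is to recover the fraction from the float that unicodedata.numeric gives back. The name rational and the max_denominator bound are assumptions made here, not a real API:

```python
import unicodedata
from fractions import Fraction

def rational(ch, max_denominator=1000):
    """Hypothetical helper: return (numerator, denominator) for a numeric character."""
    value = unicodedata.numeric(ch)  # raises ValueError for non-numeric characters
    # Assumption: the true fraction has a small denominator, so the closest
    # fraction under the bound is the one the database intended.
    frac = Fraction(value).limit_denominator(max_denominator)
    return frac.numerator, frac.denominator

print(rational("\N{VULGAR FRACTION ONE THIRD}"))  # (1, 3)
print(rational("7"))                              # (7, 1)
```

Going through the float is only a workaround; an implementation inside the module could presumably read the exact fraction from the database instead.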

Tim Peters

8:52 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

[/F]

...

...but maybe we can add an alternate API that returns the *exact* fraction (as a numerator/denominator tuple)?

    >>> unicodedata.numeric2(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

(hopefully, someone will come up with a better name)

[The Ping of Death] LOL! Great name, Ping.

...

unicodedata.rational might be an obvious choice.

    >>> unicodedata.rational(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

Perfect -- another great name. Beats all heck out of unicodedata.vulgar() too.

leaving-it-up-to-/f-to-decide-what-.rational()-should-return-for-pi-ly y'rs - the timmy of death

Fredrik Lundh

9:14 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

tim wrote:

...

leaving-it-up-to-/f-to-decide-what-.rational()-should-return-for-pi-ly y'rs - the timmy of death

oh, the unicode folks have figured that one out:

    >>> unicodedata.numeric(u"\N{GREEK PI SYMBOL}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character

</F>
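In current Python 3, unicodedata.numeric also accepts an optional second argument that is returned instead of raising ValueError, which is convenient when probing characters like the one above. A small sketch (the wrapper name numeric_or_none is made up here):

```python
import unicodedata

def numeric_or_none(ch):
    """Return the numeric value of ch, or None if the character has none."""
    return unicodedata.numeric(ch, None)

print(numeric_or_none("\N{VULGAR FRACTION ONE HALF}"))  # 0.5
print(numeric_or_none("\N{GREEK PI SYMBOL}"))           # None
```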

Tim Peters

10:12 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

[tim]

...

leaving-it-up-to-/f-to-decide-what-.rational()-should-return-for-pi-ly y'rs - the timmy of death

[/F]

...

oh, the unicode folks have figured that one out:

    >>> unicodedata.numeric(u"\N{GREEK PI SYMBOL}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character

Ya, except I'm starting to suspect they're not floating-point experts either:

    >>> unicodedata.numeric(u"\N{PLANCK CONSTANT OVER TWO PI}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character
    >>> unicodedata.numeric(u"\N{EULER CONSTANT}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character
    >>> unicodedata.numeric(u"\N{AIRSPEED OF AFRICAN SWALLOW}")
    UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

M.-A. Lemburg

10:33 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

Tim Peters wrote:

...

[tim]

...
leaving-it-up-to-/f-to-decide-what-.rational()-should-return-for-pi-ly y'rs - the timmy of death

[/F]

...
oh, the unicode folks have figured that one out:

    >>> unicodedata.numeric(u"\N{GREEK PI SYMBOL}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character

Ya, except I'm starting to suspect they're not floating-point experts either:

    >>> unicodedata.numeric(u"\N{PLANCK CONSTANT OVER TWO PI}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character
    >>> unicodedata.numeric(u"\N{EULER CONSTANT}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character
    >>> unicodedata.numeric(u"\N{AIRSPEED OF AFRICAN SWALLOW}")
    UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

Perhaps you should submit these for Unicode 4.0 ;-)

But really, I don't suspect that anyone is going to do serious character to number conversion on these esoteric characters. Plain old digits will do just as they always have (or does anyone know of ways to represent irrational numbers on PCs by other means than an algorithm which spits out new digits every now and then ?).

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

Tim Peters

10:48 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

[Tim]

    >>> unicodedata.numeric(u"\N{PLANCK CONSTANT OVER TWO PI}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character
    >>> unicodedata.numeric(u"\N{EULER CONSTANT}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character
    >>> unicodedata.numeric(u"\N{AIRSPEED OF AFRICAN SWALLOW}")
    UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

[MAL]

...

Perhaps you should submit these for Unicode 4.0 ;-)

Note that the first two are already there; they just don't have an associated numerical value. The last one was a hint that I was trying to write a frivolous msg while giving my "<wink>" key a break <wink>.

...

But really, I don't suspect that anyone is going to do serious character to number conversion on these esoteric characters. Plain old digits will do just as they always have ...

Which is why I have to wonder whether there's *any* value in exposing the numeric-value property beyond regular old digits.

Neil Hodgson

29 Sep

3:58 a.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

[Tim]

...

Which is why I have to wonder whether there's *any* value in exposing the numeric-value property beyond regular old digits.

Running (in IDLE or PythonWin with a font that covers most of Unicode like Tahoma):

    import unicodedata
    for c in range(0x10000):
        x=unichr(c)
        try:
            b = unicodedata.numeric(x)
            #print "numeric:", repr(x)
            try:
                a = unicodedata.digit(x)
                if a != b:
                    print "bad" , repr(x)
            except:
                print "Numeric but not digit", hex(c), x.encode("utf8"), "numeric ->", b
        except:
            pass

Finds about 130 characters. The only ones I feel are worth worrying about are the half, quarters and eighths (0xbc, 0xbd, 0xbe, 0x215b, 0x215c, 0x215d, 0x215e) which are commonly used for expressing the prices of stocks and commodities in the US. This may be rarely used but it is better to have it available than to have people coding up their own translation tables.

The 0x302* 'Hangzhou' numerals look like they should be classified as digits.

Neil
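The scan above is Python 2 code (unichr, print statement). A rough Python 3 port, as a sketch only, could use the optional default argument of numeric and digit in place of the nested try/except blocks. The function name numeric_but_not_digit is made up here, the "bad" consistency check is left out, and the number of characters found depends on the Unicode version bundled with the interpreter, so it will not necessarily be 130:

```python
import unicodedata

def numeric_but_not_digit(limit=0x10000):
    """Map code point -> numeric value for characters that have a numeric
    value but no digit value."""
    found = {}
    for cp in range(limit):
        ch = chr(cp)
        value = unicodedata.numeric(ch, None)
        if value is None:
            continue  # not a numeric character at all
        if unicodedata.digit(ch, None) is None:
            found[cp] = value
    return found

found = numeric_but_not_digit()
print(len(found), "characters")
for cp in (0xBC, 0xBD, 0xBE, 0x215B):
    print(hex(cp), chr(cp), "numeric ->", found[cp])
```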

M.-A. Lemburg

8:15 a.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

Neil Hodgson wrote:

...

The 0x302* 'Hangzhou' numerals look like they should be classified as digits.

Can't change the Unicode 3.0 database... so even though this might be useful in some contexts let's stick to the standard.

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

Jeremy Hylton

2:09 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

>>>>> "NH" == Neil Hodgson writes:

  NH> Finds about 130 characters. The only ones I feel are worth
  NH> worrying about are the half, quarters and eighths (0xbc, 0xbd,
  NH> 0xbe, 0x215b, 0x215c, 0x215d, 0x215e) which are commonly used
  NH> for expressing the prices of stocks and commodities in the US.
  NH> This may be rarely used but it is better to have it available
  NH> than to have people coding up their own translation tables.

The US no longer uses fractions to report stock prices. Example:

http://business.nytimes.com/market_summary.asp

LEADERS                           Last   Range          Change
AMERICAN INDL PPTYS REIT (IND)    14.06  13.56 - 14.06  0.25 / 1.81%
R G S ENERGY GROUP INC (RGS)      28.19  27.50 - 28.19  0.50 / 1.81%
DRESDNER RCM GLBL STRT INC (DSF)   6.63   6.63 -  6.63  0.06 / 0.95%
FALCON PRODS INC (FCP)             9.63   9.63 -  9.88  0.06 / 0.65%
GENERAL ELEC CO (GE)              59.00  58.63 - 59.75  0.19 / 0.32%

Jeremy

Fredrik Lundh

7:01 a.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

tim wrote:

...

...
But really, I don't suspect that anyone is going to do serious character to number conversion on these esoteric characters. Plain old digits will do just as they always have ...

Which is why I have to wonder whether there's *any* value in exposing the numeric-value property beyond regular old digits.

the unicode database has three fields dealing with the numeric value: decimal digit value (integer), digit value (integer), and numeric value (integer *or* rational):

    "This is a numeric field. If the character has the numeric property, as specified in Chapter 4 of the Unicode Standard, the value of that character is represented with an integer or rational number in this field."

here's today's proposal: let's claim that it's a bug to return a float from "numeric", and change it to return a string instead.

(this will match "decomposition", which is also "broken" -- it really should return a tag followed by a sequence of unicode characters).

</F>
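The three fields map onto three functions in today's unicodedata module: decimal, digit and numeric. A short Python 3 sketch that shows how they differ for a few characters follows; the helper name numeric_fields is made up, and None stands for "field not set":

```python
import unicodedata

def numeric_fields(ch):
    """Return (decimal, digit, numeric) for ch, with None for missing fields."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

for ch in ("7", "\N{SUPERSCRIPT TWO}", "\N{VULGAR FRACTION ONE HALF}", "x"):
    print(repr(ch), numeric_fields(ch))
```

An ordinary digit has all three fields, a superscript digit has a digit and numeric value but no decimal value, and a vulgar fraction only has a numeric value.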

M.-A. Lemburg

8:13 a.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

Fredrik Lundh wrote:

...

tim wrote:

...
...
But really, I don't suspect that anyone is going to do serious character to number conversion on these esoteric characters. Plain old digits will do just as they always have ...

Which is why I have to wonder whether there's *any* value in exposing the numeric-value property beyond regular old digits.

the unicode database has three fields dealing with the numeric value: decimal digit value (integer), digit value (integer), and numeric value (integer *or* rational):

"This is a numeric field. If the character has the numeric property, as specified in Chapter 4 of the Unicode Standard, the value of that character is represented with an integer or rational number in this field."

here's today's proposal: let's claim that it's a bug to return a float from "numeric", and change it to return a string instead.

Hmm, how about making the return format an option ?

    unicodedata.numeric(char, format=('float' (default), 'string', 'fraction'))

...

(this will match "decomposition", which is also "broken" -- it really should return a tag followed by a sequence of unicode characters).

Same here:

    unicodedata.decomposition(char, format=('string' (default), 'tuple'))

I'd opt for making the API more customizable rather than trying to find the one and only true return format ;-)

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
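The format keyword is only a proposal in this thread; the real unicodedata functions take no such argument. Purely as a sketch of how the idea might behave, here are two hypothetical Python 3 wrappers (numeric_as and decomposition_as_tuple are invented names, and the denominator bound of 1000 is an assumption):

```python
import unicodedata
from fractions import Fraction

def numeric_as(ch, format="float"):
    """Hypothetical wrapper around unicodedata.numeric with a selectable result type."""
    value = unicodedata.numeric(ch)  # ValueError for non-numeric characters
    if format == "float":
        return value
    frac = Fraction(value).limit_denominator(1000)  # assumed bound
    if format == "fraction":
        return frac.numerator, frac.denominator
    if format == "string":
        return str(frac)  # "1/3", or "7" for whole numbers
    raise ValueError("unknown format: %r" % (format,))

def decomposition_as_tuple(ch):
    """Hypothetical: split the decomposition string into its space-separated fields."""
    return tuple(unicodedata.decomposition(ch).split())

print(numeric_as("\N{VULGAR FRACTION ONE THIRD}", "string"))
print(decomposition_as_tuple("\N{VULGAR FRACTION ONE HALF}"))
```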

M.-A. Lemburg

7:54 a.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

Tim Peters wrote:

...

[Tim]

    >>> unicodedata.numeric(u"\N{PLANCK CONSTANT OVER TWO PI}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character
    >>> unicodedata.numeric(u"\N{EULER CONSTANT}")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: not a numeric character
    >>> unicodedata.numeric(u"\N{AIRSPEED OF AFRICAN SWALLOW}")
    UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

[MAL]

...
Perhaps you should submit these for Unicode 4.0 ;-)

Note that the first two are already there; they just don't have an associated numerical value. The last one was a hint that I was trying to write a frivolous msg while giving my "<wink>" key a break <wink>.

That's what I meant: you should submit the numeric values for the first two and opt for addition of the last.

...

...
But really, I don't suspect that anyone is going to do serious character to number conversion on these esoteric characters. Plain old digits will do just as they always have ...

Which is why I have to wonder whether there's *any* value in exposing the numeric-value property beyond regular old digits.

It is needed for Unicode 3.0 standard compliance and for whoever wants to use this data. Since the Unicode database explicitly contains fractions, I think adding the .rational() API would make sense to provide a different access method to this data.

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/

Fredrik Lundh

28 Sep

9:49 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

tim wrote:

...

...
unicodedata.rational might be an obvious choice.

    >>> unicodedata.rational(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

Perfect -- another great name. Beats all heck out of unicodedata.vulgar() too.

should I interpret this as a +1, or should I write a PEP on this topic? ;-)

</F>

Tim Peters

10:32 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

[The Ping of Death suggests unicodedata.rational]

...

    >>> unicodedata.rational(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

[Timmy replies]

...

Perfect -- another great name. Beats all heck out of unicodedata.vulgar() too.

[/F inquires]

...

should I interpret this as a +1, or should I write a PEP on this topic? ;-)

I'm on vacation (but too ill to do much besides alternate sleep & email <snarl>), and I'm not sure we have clear rules about how votes from commercial Python developers count when made on their own time. Perhaps a meta-PEP first to resolve that issue?

Oh, all right, just speaking for myself, I'm +1 on The Ping of Death's name suggestion provided this function is needed at all. But not being a Unicode Guy by nature, I have no opinion on whether the function *is* needed (I understand how digits work in American English, and ord(ch)-ord('0') is the limit of my experience; can't say whether even the current .numeric() is useful for Klingons or Lawyers or whoever it is who expects to get a numeric value out of a character for 1/2 or 1/3).

M.-A. Lemburg

10:38 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

Tim Peters wrote:

...

[The Ping of Death suggests unicodedata.rational]

...
    >>> unicodedata.rational(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

[Timmy replies]

...
Perfect -- another great name. Beats all heck out of unicodedata.vulgar() too.

[/F inquires]

...
should I interpret this as a +1, or should I write a PEP on this topic? ;-)

I'm on vacation (but too ill to do much besides alternate sleep & email <snarl>), and I'm not sure we have clear rules about how votes from commercial Python developers count when made on their own time. Perhaps a meta-PEP first to resolve that issue?

Oh, all right, just speaking for myself, I'm +1 on The Ping of Death's name suggestion provided this function is needed at all. But not being a Unicode Guy by nature, I have no opinion on whether the function *is* needed (I understand how digits work in American English, and ord(ch)-ord('0') is the limit of my experience; can't say whether even the current .numeric() is useful for Klingons or Lawyers or whoever it is who expects to get a numeric value out of a character for 1/2 or 1/3).

The reason for "numeric" being available at all is that the UnicodeData.txt file format specifies such a field. I don't believe anyone will make serious use of it though... e.g. 2² would parse as 22 and not evaluate to 4.

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
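To make the 2² remark concrete, here is a deliberately naive Python 3 sketch that builds a number from per-character digit values, which is roughly what "parse as 22" means. The function digits_to_int is invented for this example and is not how int() behaves:

```python
import unicodedata

def digits_to_int(text):
    """Naive parser: treat every character's digit value as a base-10 digit."""
    total = 0
    for ch in text:
        # unicodedata.digit raises ValueError if ch has no digit value.
        total = total * 10 + unicodedata.digit(ch)
    return total

print(digits_to_int("2\N{SUPERSCRIPT TWO}"))  # 22, not 4
```

The digit property says nothing about the superscript being an exponent, so a character-by-character reading cannot recover the mathematical meaning.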

M.-A. Lemburg

10:30 p.m.

New subject: unicodedata.numeric (was RE: stupid floating point question...)

Fredrik Lundh wrote:

...

tim wrote:

...
...
unicodedata.rational might be an obvious choice.

    >>> unicodedata.rational(u"\N{VULGAR FRACTION ONE THIRD}")
    (1, 3)

Perfect -- another great name. Beats all heck out of unicodedata.vulgar() too.

should I interpret this as a +1, or should I write a PEP on this topic? ;-)

+1 from here. I really only chose floats to get all possibilities (digit, decimal and fractions) into one type... Python should support rational numbers some day.

--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/


participants (7)

Fredrik Lundh
Jeremy Hylton
M.-A. Lemburg
Martin von Loewis
Neil Hodgson
The Ping of Death
Tim Peters
