
If anyone has an app known or suspected to be sensitive to dict timing, please try the patch here. Best I've been able to tell, it's a win. But it's a radical change in approach, so I don't want to rush it. This gets rid of the polynomial machinery entirely, along with the branches associated with updating the polys, and the dictobject struct member holding the table's poly. Instead it relies on the fact that i = (5*i + 1) % n is a full-period sequence whenever n is a power of 2 (that's what guarantees it will visit every slot), but perturbs that by adding in a few bits from the full hash code, shifted right each time (that's what guarantees every bit of the hash code eventually influences the probe sequence, avoiding simple quadratic-time degenerate cases).
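A sketch of that probe sequence in Python (the shift amount of 5 and the name `probe_indices` are illustrative assumptions, not lifted from the patch):

```python
def probe_indices(h, n):
    """Yield the slots probed for hash code h in a table of size n,
    where n is a power of 2.  Start from the hash's low bits, then
    follow i = (5*i + 1) % n, perturbed by higher hash bits shifted
    in a few at a time.  Once the perturb term reaches 0, the pure
    recurrence is full-period, so every slot is eventually visited."""
    assert n > 0 and n & (n - 1) == 0, "table size must be a power of 2"
    mask = n - 1
    i = h & mask
    perturb = h
    while True:
        yield i
        perturb >>= 5                     # fold in more hash bits
        i = (5 * i + 1 + perturb) & mask
```

Because the perturbation eventually dies out, the sequence degenerates into the pure full-period recurrence, which is what guarantees termination of an open-addressing probe.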

Tim Peters wrote:
Cool idea... rips out all that algebra garble and replaces it with random beauty :-) In any case, this will save us the trouble of having to check those poly numbers every time Intel decides to bump the register width by another factor of two ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/

M.-A. Lemburg <mal@lemburg.com>:
This seems unlikely. 2^64 = 18446744073709551616, which is roughly 2 * 10 ^ 19. Let's assume a memory density of, say, 2^20 machine words or roughly 8 megabytes per cubic centimeter (much, *much* better than we'll be able to do for the foreseeable future -- remember power distribution and heat dissipation). Then, approximating the cubic relation between a sphere's volume and radius by lopping off a power of four, we see that 2^64 64-bit words of memory would occupy a sphere of roughly 2^(64 - 20 - 2) cm radius, or about 17 million kilometers. This is roughly twice the diameter of the Sun. 64-bit computers aren't going to run out of address space any time soon. 64-bit clocks counting seconds will turn over in approximately 600 billion years, long after the expansion of the Universe will have dropped its energy density low enough to make computation...well, let's just say "difficult" and leave it at that. Nobody needs 128 bits of integer or floating-point precision, either. There's basically no source of data to compute with that's got anywhere near 19 significant digits of accuracy -- 48 bits is about the most people in scientific computing ever use. -- <a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a> [President Clinton] boasts about 186,000 people denied firearms under the Brady Law rules. The Brady Law has been in force for three years. In that time, they have prosecuted seven people and put three of them in prison. You know, the President has entertained more felons than that at fundraising coffees in the White House, for Pete's sake." -- Charlton Heston, FOX News Sunday, 18 May 1997

"Eric S. Raymond" wrote:
Where did you get those numbers from ? There are memory sticks with 128 MB around and these measure about 2.5 cm^2 * 1 mm.
Just you wait... someday marketing people will probably invent the world memory facility and start assigning a few hundred Terabytes for everyone on this planet to use for his/her data storage -- store once, use everywhere ;-) Let's assume we have 12e9 people on this planet by that time, then we'll need 12e9*100e12 = 1.2e24 bytes of central storage... or roughly 2^80 bytes per civilization. Of course, they will want to run Python in order to manage that data and so will all those Palm users hooking up to the facility... ;-) -- Marc-Andre Lemburg CEO eGenix.com Software GmbH ______________________________________________________________________ Company & Consulting: http://www.egenix.com/ Python Software: http://www.lemburg.com/python/

M.-A. Lemburg <mal@lemburg.com>:
Remember power distribution and heat dissipation. You can't just figure volume of the memory ICs, you have to include power and cooling and structural support too. I eyeballed some DRAM modules I had lying around. In any case, my figures aren't that sensitive to memory density. If I'm off by a factor of 64 the diameter of the memory sphere only drops by a factor of four (it's that cube-root relationship between volume and radius). So it's only half the radius of the Sun. That's still way, *way* more mass than all the planets in the Solar System put together.
Nah. Individual storage requirements would never get that large. Bill Joy did a study on this once and figured out that human beings can generate about 14GB of text during their lifetimes, max. In a system like the Web-on-steroids one you're supposing, higher-volume stuff like streaming video or Linux-kernel archives would be stored *once* with URLs pointing at them from peoples' individual stores. One terabyte (2^40) per person leaves plenty of headroom (two orders of magnitude larger). We could still handle a world population of 2^34 or roughly 16 billion people. (I think the size of the Library of Congress has been estimated at several thousand terabytes.) -- <a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a> I don't like the idea that the police department seems bent on keeping a pool of unarmed victims available for the predations of the criminal class. -- David Mohler, 1989, on being denied a carry permit in NYC

On Thu, May 31, 2001 at 04:43:32AM -0400, Eric S. Raymond wrote:
M.-A. Lemburg <mal@lemburg.com>:
This seems unlikely.
Why ? Bumping register size doesn't mean Intel expects to use it all as address space. They could be used for video-processing, or to represent a modest range of rationals <wink>, or to help core 'net routers deal with those nasty IPv6 addresses. I'm sure cryptomunchers would like bigger registers as well. Oh wait... I get it! You were trying to get yourself in the history books as the guy that said "64 bits ought to be enough for everyone" :-) -- Thomas Wouters <thomas@xs4all.net> Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

[Thomas Wouters]
Why ? Bumping register size doesn't mean Intel expects to use it all as address space. They could be used for video-processing,
Bingo. Common wisdom holds that vector machines are dead, but the truth is virtually *everyone* runs on a vector box now: Intel just renamed "vector" to "multimedia" (or AMD to "3D Now!"), and adopted a feeble (but ever-growing) subset of traditional vector machines' instruction sets.
or to represent a modest range of rationals <wink>, or to help core 'net routers deal with those nasty IPv6 addresses.
KSR's founders had in mind bit-level addressability of networks of machines spanning the globe. Pressed on the point, though, I'd have to agree with Eric that they didn't really *need* 128 bits for that modest goal.
I'm sure cryptomunchers would like bigger registers as well.
Agencies we can't talk about would like them as big as they can get them. Each vector register in a Cray box actually consisted of 64 64-bit words, or 4K bits per register. Some "special" models were constructed where the vector FPU was thrown away and additional bit-fiddling units added in its place: they really treated the vector registers as giant bitstrings, and didn't want to burn 64 clock cycles just to do, e.g., "one" conceptual xor.
That would be foolish indeed! 128, though, now *that's* surely enough for at least a decade <wink>.

Tim Peters <tim.one@home.com>:
You've got a point...but I don't think it's really economical to build that kind of hardware into general-purpose processors. You end up with a camel. You know, a horse designed by committee? -- <a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a> To make inexpensive guns impossible to get is to say that you're putting a money test on getting a gun. It's racism in its worst form. -- Roy Innis, president of the Congress of Racial Equality (CORE), 1988

[EAR]
You've got a point...
Well, really, they do -- but they had a much more compelling point when the Cold War came with an unlimited budget.
but I don't think it's really economical to build that kind of hardware into general-purpose processors.
Economical? The marginal cost of adding even nutso new features in silicon now for mass-market chips is pretty close to zero. Indeed, if you're in the speech recog or 3D imaging games (i.e., things that still tax a PC), Intel comes around *begging* for new ideas to use up all their chip real estate. The only one I recall them turning down was a request from Dragon's founder to add an instruction that, given x and y, returned log(exp(x)+exp(y)). They were skeptical, and it turned out even *we* didn't need it <wink>.
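The requested operation is easy enough in software, which may be why nobody missed the instruction; the standard trick (the function name `logaddexp` is mine) avoids the overflow a naive evaluation would hit:

```python
import math

def logaddexp(x, y):
    """log(exp(x) + exp(y)) without overflow: factor out the larger
    argument so the remaining exponential is at most 1."""
    if x < y:
        x, y = y, x
    return x + math.log1p(math.exp(y - x))

# logaddexp(1000.0, 1000.0) gives 1000.0 + log(2), even though
# exp(1000) by itself overflows a double.
```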
You end up with a camel. You know, a horse designed by committee?
Yup! But that's the camel Intel rides to the bank, so it will probably grow more humps, on which to hang more bags of gold.

Heh. I was implementing 128-bit floats in software, for Cray, in about 1980. They didn't do it because they *wanted* to make the Cray boxes look like pigs <wink>. A 128-bit float type is simply necessary for some scientific work: not all problems are well-conditioned, and the "extra" bits can vanish fast. Went thru the same bit at KSR. Just yesterday Konrad Hinsen was worrying on c.l.py that his scripts that took 2 hours using native floats zoomed to 5 days when he started using GMP's arbitrary-precision float type *just* to get 100 bits of precision. When KSR died, the KSR-3 on the drawing board had 128-bit registers. I was never quite sure why the founders thought that would be a killer selling point, but it wasn't for floats. Down in the trenches we thought it would be mondo cool to have an address space so large that for the rest of our lives we'd never need to bother calling free() again <0.8 wink>.

Tim Peters <tim.one@home.com>:
Makes me wonder how competent your customers' numerical analysts were. Where the heck did they think they were getting data with that many digits of accuracy? (Note that I didn't say "precision"...) -- <a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a> Strict gun laws are about as effective as strict drug laws...It pains me to say this, but the NRA seems to be right: The cities and states that have the toughest gun laws have the most murder and mayhem. -- Mike Royko, Chicago Tribune

[Tim]
A 128-bit float type is simply necessary for some scientific work: not all problems are well-conditioned, and the "extra" bits can vanish fast.
[ESR]
Not all scientific work consists of predicting the weather with inputs known to half a digit on a calm day <wink>. Knuth gives examples of ill-conditioned problems where resorting to unbounded rationals is faster than any known stable f.p. approach (stuck with limited precision) -- think, e.g., chaotic systems here, which includes parts of many hydrodynamics problems in real life. Some scientific work involves modeling ab initio across trillions of computations (and on a Cray box in particular, where addition didn't even bother to round, nor multiplication bother to compute the full product tree, the error bounds per operation were much worse than in a 754 world). You shouldn't overlook either that algorithms often needed massive rewriting to exploit vector and parallel architectures, and in a world where a supremely competent numerical analyst can take a month to verify the numerical robustness of a new algorithm covering two pages of Fortran, a million lines of massively reworked seat-of-the-pants modeling code couldn't be trusted at all without running it under many conditions in at least two precisions (it only takes one surprise catastrophic cancellation to destroy everything). A major oil company once threatened to sue Cray when their reservoir model produced wildly different results under a new release of the compiler. Some exceedingly sharp analysts worked on that one for a solid week. Turned out the new compiler evaluated a subexpression A*B*C by doing (B*C) first instead of (A*B), because it was faster in context (and fine to do so by Fortran's rules). It so happened A was very large, and B and C both small, and doing B*C first caused the whole product to underflow to zero where doing A*B first left a product of roughly C's magnitude. I can't imagine how they ever would have found this if they weren't able to recompile the code using twice the precision (which worked fine thanks to the larger dynamic range), then tracing to see where the runs diverged.
Even then it took a week because this was 100s of thousands of lines of crufty Fortran that ran for hours on the world's then-fastest machine before delivering bogus results. BTW, if you think the bulk of the world's numeric production code has even been *seen* by a qualified numerical analyst, you should ride on planes more often <wink>.
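The failure mode is easy to reproduce with ordinary doubles; the magnitudes below are made up for illustration, not the oil company's actual data:

```python
# A huge, B and C tiny: the (legal, by Fortran's rules) reassociation
# of A*B*C changes the answer completely.
A, B, C = 1e200, 1e-200, 1e-200

left  = (A * B) * C   # A*B first: about 1.0, then times C -> roughly C's magnitude
right = A * (B * C)   # B*C first: 1e-400 underflows to 0.0, so the product is 0.0
```

In a 128-bit float with a wider exponent range, both orders would give the same (nonzero) answer, which is why recompiling at twice the precision exposed the divergence.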

Tim Peters <tim.one@home.com>:
Hmmm...good answer. I still believe it's the case that real-world measurements max out below 48 bits or so of precision because the real world is a noisy, fuzzy place. But I can see that most of the algorithms for partial differential equations would multiply those by very small or very large quantities repeatedly. The range-doubling trick for catching divergences is neat, too. So maybe there's a market for 128-bit floats after all. I'm still skeptical about how likely those applications are to influence the architecture of general-purpose processors. I saw a study once that said heavy-duty scientific floating point only accounts for about 2% of the computing market -- and I think it's significant that MMX instructions and so forth entered the Intel line to support *games*, not Navier-Stokes calculations. That 2% will have to get a lot bigger before I can see Intel doubling its word size again. It's not just the processor design; the word size has huge implications for buses, memory controllers, and the whole system architecture. -- <a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a> The United States is in no way founded upon the Christian religion -- George Washington & John Adams, in a diplomatic message to Malta.

"Eric S. Raymond" <esr@thyrsus.com>:
But when version 1.0 of FlashFlood! comes out, requiring high-quality real-time hydrodynamics simulation, Navier-Stokes calculations will suddenly become very important... Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | A citizen of NewZealandCorp, a | Christchurch, New Zealand | wholly-owned subsidiary of USA Inc. | greg@cosc.canterbury.ac.nz +--------------------------------------+

[Eric S. Raymond]
... So maybe there's a market for 128-bit floats after all.
I think very small. There's a much larger market for 128-bit float *registers*, though -- in the "treat it as 2 64-bit, or 4 32-bit, floats, and operate on them in parallel" sense. That's the baby vector register view, and is already happening.
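A toy model of that "baby vector register" view, packing four 32-bit float lanes into one 128-bit value (`addps` is named after the SSE instruction it loosely mimics; real hardware does this in a single operation):

```python
import struct

def pack4(a, b, c, d):
    """Pack four 32-bit floats into one 16-byte (128-bit) value."""
    return struct.pack('<4f', a, b, c, d)

def addps(x, y):
    """Lane-wise add of two 128-bit values viewed as 4 x float32."""
    lanes = (p + q for p, q in zip(struct.unpack('<4f', x),
                                   struct.unpack('<4f', y)))
    return struct.pack('<4f', *lanes)
```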
Heh. I used to wonder about that, but not any more: games may have no more than entertainment (sometimes disguised as education <wink>) in mind, but what do the latest & greatest games do? Strive to simulate physical reality (sometimes with altered physical laws), just as closely as possible. Whether it's ray-tracing, effective motion-compression, or N-body simulations, games are easily as demanding as what computational chemists do. A difference is that general-purpose *compilers* aren't being taught how to use these "new" architectural gimmicks. All that new hardware sits unused unless you've got an app dipping into assembler, or into a hand-coded utility library written in assembler. The *general* market for pure floating-point can barely support what's left of the supercomputer industry anymore (btw, Cray never became a billion-dollar company even in its heyday, and what's left of them gets passed around for peanuts now).
Intel is just now getting its feet wet with 64-bit boxes. That was old news to me 20 years ago. All I hope to see 20 years from now is that somewhere along the way I got smart enough to drop computers and get a real life <wink>. by-then-the-whole-system-will-exist-in-the-superposition-of-a- single-plutonium-atom's-states-anyway-ly y'rs - tim

Another version of the patch attached, a bit faster and with a large new comment block explaining it. It's looking good! As I hope the new comments make clear, nothing about this approach is "a mystery" -- there are explainable reasons for each fiddly bit. This gives me more confidence in it than in the previous approach, and, indeed, it turned out that when I *thought* "hmm! I bet this change would be a little faster!", it actually was <wink>.
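The full-period claim the new comment block relies on is easy to verify empirically for small table sizes:

```python
def cycle(n, start=0):
    """Follow i = (5*i + 1) % n for n steps from `start`."""
    seen, i = [], start
    for _ in range(n):
        seen.append(i)
        i = (5 * i + 1) % n
    return seen

# For every power-of-2 table size, the recurrence hits each slot
# exactly once per cycle -- a full-period sequence (this follows from
# the Hull-Dobell theorem for linear congruential generators).
for k in range(1, 11):
    n = 2 ** k
    assert sorted(cycle(n)) == list(range(n))
```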

Tim Peters wrote:
Thanks a lot for this nice patch. It looks like a real improvement. Also thanks for mentioning my division idea. Since all bits of the hash are eventually taken into account, this idea has somehow survived in an even more efficient solution, good end, file closed. (and good that I saved the time to check my patch in, lately :-) cheers - chris -- Christian Tismer :^) <mailto:tismer@tismer.com> Mission Impossible 5oftware : Have a break! Take a ride on Python's Kaunstr. 26 : *Starship* http://starship.python.net/ 14163 Berlin : PGP key -> http://wwwkeys.pgp.net/ PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF where do you want to jump today? http://www.stackless.com/

participants (8)
-
Christian Tismer
-
Eric S. Raymond
-
Fred L. Drake, Jr.
-
Gordon McMillan
-
Greg Ewing
-
M.-A. Lemburg
-
Thomas Wouters
-
Tim Peters