Why less emphasis on private data?

Hendrik van Rooyen mail at microcorp.co.za
Wed Jan 10 02:33:05 EST 2007


"Steven D'Aprano" <steve at REMOVE.THIS.cybersource.com.au> wrote:

> On Tue, 09 Jan 2007 10:27:56 +0200, Hendrik van Rooyen wrote:
>
> > "Steven D'Aprano" <steve at REMOVE.THIS.cybersource.com.au> wrote:
> >
> >
> >> On Mon, 08 Jan 2007 13:11:14 +0200, Hendrik van Rooyen wrote:
> >>
> >> > When you hear a programmer use the word "probability" -
> >> > then it's time to fire him, as in programming even the lowest
> >> > probability is a certainty when you are doing millions of
> >> > things a second.
> >>
> >> That is total and utter nonsense and displays the most appalling
> >> misunderstanding of probability, not to mention a shocking lack of common
> >> sense.
> >
> > Really?
> >
> > Strong words.
> >
> > If you don't understand you need merely ask, so let me elucidate:
> >
> > If there is some small chance of something occurring at run time that can
> > cause code to fail - a "low probability" in all the accepted senses of the
> > word - and a programmer declaims - "There is such a low probability of
> > that occurring and it's so difficult to cater for that I won't bother"
> > - then am I supposed to congratulate him on his wisdom and outstanding
> > common sense?
> >
> > Hardly. - If anything can go wrong, it will. - to paraphrase Murphy's law.
> >
> > To illustrate:
> > If there is one place in any piece of code that is critical and not
> > protected, even if it's in a relatively rarely called routine, then because
> > of the high speed of operations, and the fact that time is essentially
> > infinite,
>
> Time is essentially infinite? Do you really expect your code will still be
> in use fifty years from now, let alone a billion years?

My code does not suffer from bit rot, so it should outlast the hardware...

But seriously - for the sort of mistakes we make as programmers - it does
not actually need infinite time for the lightning to strike. Most things that
will actually run overnight are probably stable, and if it takes, say, a week
of running for the bug to raise its head, it is normally a very difficult
problem to find and fix. A case in point: one of my first postings to
this newsgroup concerned an intermittent failure on a serial port. It was
never resolved in a satisfactory manner - eventually I followed my gut
feel, made some changes, and it seems to have gone away - but I expect
it to bite me anytime. I don't actually *know* that it's fixed, and there is
not, as a corollary to your sum below here, any real way to know for
certain.

>
> I know flowcharts have fallen out of favour in IT, and rightly so -- they
> don't model modern programming techniques very well, simply because modern
> programming techniques would lead to a chart far too big to be practical.

I actually like drawing data flow diagrams, even if they are sketchy, primitive
ones, to try to model the inter process communications (where a "process"
may be just a python thread) - I find it useful to keep an overall perspective.

> But for the sake of the exercise, imagine a simplified flowchart of some
> program, one with a mere five components, such that one could take any of
> the following paths through the program:
>
> START -> A -> B -> C -> D -> E
> START -> A -> C -> B -> D -> E
> START -> A -> C -> D -> B -> E
> ...
> START -> E -> D -> C -> B -> A
>
> There are 5! (five factorial) = 120 possible paths through the program.
>
> Now imagine one where there are just fifty components, still quite a
> small program, giving 50! = 3e64 possible paths. Now suppose that there is
> a bug that results from following just one of those paths. That would
> match your description of "lowest probability" -- any lower and it would
> be zero.
>
> If all of the paths are equally likely to be taken, and the program takes
> a billion different paths each millisecond, on average it would take about
> 1.5e55 milliseconds to hit the bug -- or about 5e44 YEARS of continual
> usage. If every person on Earth did nothing but run this program 24/7, it
> would still take on average almost sixty million billion billion billion
> years to discover the bug.
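
[The arithmetic quoted above can be checked with a short Python sketch.
It assumes, as the quoted text does, that every path is equally likely
and that on average half of the paths are tried before the buggy one is
hit. The function name and the constant are made up for illustration.]

```python
import math

# Julian year in seconds; precise enough for an order-of-magnitude estimate.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def expected_years_to_hit(components, paths_per_ms):
    """Average years until one specific path out of components! is taken.

    Assumes equally likely paths, with half of them tried on average
    before the buggy one turns up.
    """
    total_paths = math.factorial(components)
    expected_ms = total_paths / (2 * paths_per_ms)
    return expected_ms / 1000 / SECONDS_PER_YEAR


paths = math.factorial(50)
years = expected_years_to_hit(50, 10**9)
print(f"{paths:.2e} paths, roughly {years:.1e} years at a billion paths/ms")
```

[The result agrees with the quoted figures of about 3e64 paths and about
5e44 years.]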

In something with just 50 components it is, I believe, better to try to
inspect the quality in than to hope that random testing will show up
errors. But I suppose this is all about design, and about avoiding
doing known no-nos.

>
> But of course in reality some paths are more likely than others. If the
> bug happens to exist in a path that is executed often, or if it exists
> in many paths, then the bug will be found quickly. On the other hand, if
> the bug is in a path that is rarely executed, your buggy program may be
> more reliable than the hardware you run it on. (Cynics may say that isn't
> hard.)

Oh I am of the opposite conviction - like the fellow of Circuit Cellar
(Steve Ciarcia, if I remember the name right) who said: "My favourite
Programming Language is Solder"... I find that when I start blaming the
hardware for something that is going wrong, I am seldom right...

And this is true also for hardware that we make ourselves, that one would
expect to be buggy, because it is new and untested.  It is almost as if the
tools used in hardware design are somehow less buggy than a programmer's
fumbling attempts at producing something logical.

>
> You're project manager for the development team. Your lead developer tells
> you that he knows this bug exists (never mind how, he's very clever) and
> that the probability of reaching that bug in use is about 3e-65.

This is too convenient - This lead developer is about as likely as
my infinite time...

>
> If it were easy to fix, the developer wouldn't even have mentioned it.
> This is a really hard bug to fix, it's going to require some major
> changes to the program, maybe even a complete re-think of the program.
> Removing this bug could even introduce dozens, hundreds of new bugs.
>
> So okay Mister Project Manager. What do you do? Do you sack the developer,
> like you said? How many dozens or hundreds of man-hours are you prepared
> to put into this? If the money is coming out of your pocket, how much are
> you willing to spend to fix this bug?
>

Do a design review, put in a man with some experience,
and hope for the best - in reality what else can you do, short
of trying to do it all yourself?

>
> [snip]
>
> > How is this a misunderstanding of probability?  - probability applies to
> > any one trial, so in a series of trials, when the number of trials is
> > large enough - in the order of the inverse of the probability - then one's
> > expectation must be that the rare occurrence should occur...
>
> "Even the lowest probability is a certainty" is mathematical nonsense:
> it just isn't true -- no matter how many iterations, the probability is
> always a little less than one. And you paper over a hole in your argument
> with "when the number of trials is large enough" -- if the probability is
> small enough, "large enough" could be unimaginably huge indeed.
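
[Both positions can be put into numbers with a minimal sketch, assuming
independent trials that each fail with the same probability p. The
function name is made up for illustration.]

```python
import math


def p_at_least_once(p, n):
    """Probability that an event of per-trial probability p occurs at
    least once in n independent trials, i.e. 1 - (1 - p)**n.

    Written with log1p/expm1 so that very small p does not get lost to
    floating-point rounding.
    """
    return -math.expm1(n * math.log1p(-p))


p = 1e-6
for n in (10**6, 10**7, 10**8):
    print(n, p_at_least_once(p, n))
```

[With n equal to 1/p the chance of having seen the event is about 63%
(1 - 1/e), not 100%. It climbs towards one as n grows but never reaches
it, and for a p as small as one in 50! no realistic n gets anywhere
close.]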

*grin* sure - this is not the maths tripos...

But I am willing to lay a bet, that over an evening's play at roulette, the
red will come up at least once.  I would expect to win too.
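
[How safe that bet is can be estimated with a small sketch. It assumes a
European single-zero wheel (18 red pockets out of 37) and independent
spins; neither assumption is stated in the post, and the names are made
up for illustration.]

```python
import math

P_RED = 18 / 37  # assumption: European wheel, 18 red pockets out of 37


def spins_for_confidence(confidence, p=P_RED):
    """Smallest number of spins n with 1 - (1 - p)**n >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


n = spins_for_confidence(0.999999)
print(n, "spins give at least a 99.9999% chance of seeing red once")
```

[Under those assumptions about 21 spins are enough, so over an evening
the bet is all but won. That works because p is close to one half; the
same formula applied to a p of one in 50! asks for an astronomically
large number of trials.]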

>
> Or, to put it another way, while anything with a non-zero probability
> _might_ happen (you might drop a can of soft drink on your computer,
> shorting it out and _just by chance_ causing it to fire off a perfectly
> formatted email containing a poem about penguins) we are justified in
> writing off small enough probabilities as negligible. It's not that they
> can't happen, but the chances of doing so are so small that we can rightly
> expect to never see them happen.

I promise I won't hold my breath...

<joke>
Man inspecting the work of a bunch of monkeys with Keyboards:

"Hey Harry - I think we might have something here - check this:

To be, or not to be, that is the iuuiihiuweriopuqewt"

<end joke>

>
> You might like to read up on Borel's "Law" (not really a law at all,
> really just a heuristic for judging when probabilities are negligible).
> Avoid the nonsense written about Borel and his guideline by Young Earth
> Creationists, they have given him an undeserved bad name.
>
> http://www.talkorigins.org/faqs/abioprob/borelfaq.html
>
OK, will have a look later.

8<--------------

> > Now how does all this show a shocking lack of common sense?
>
> You pay no attention to the economics of programming. Programming doesn't
> come for free. It is always a trade-off for the best result with the least
> effort. Any time people start making absolute claims about fixing every
> possible bug, no matter how obscure or unlikely or how much work it will
> take, I know that they aren't paying for the work to be done.

Too much assumption from too little data.  Have actually been the part owner
of a small company for the last two decades or so  - I am paying, all right,
I am paying, and paying...

Which maybe is why I want perfection...

- Hendrik