The reliability of python threads

Nick Maclaren nmm1 at cus.cam.ac.uk
Thu Jan 25 20:36:39 CET 2007


In article <1169751828.986583.47200 at j27g2000cwj.googlegroups.com>,
"Paddy" <paddy3118 at netscape.net> writes:
|> 
|> > |> Three to four months before `strange errors`? I'd spend some time
|> > |> correlating logs; not just for your program, but for everything running
|> > |> on the server. Then I'd expect to cut my losses and arrange to safely
|> > |> re-start the program every TWO months.
|> > |> (I'd arrange the re-start after collecting logs but before their
|> > |> analysis. Life is too short).
|> >
|> > Forget it.  That strategy is fine in general, but is a waste of time
|> > where threading issues are involved (or signal handling, or some types
|> > of communication problem, for that matter).
|> 
|> Nah, it's a great strategy. It keeps you up and running when all you
|> know for sure is that you will most likely be able to keep things
|> together for three months normally.
|> 
|> The OP only thinks it's a threading problem - it doesn't matter what the
|> true fix will be, as long as arranging to re-start the server well
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|> before it's likely to go down doesn't take too long, compared to your
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|> exploration of the problem, and, of course, you have to be able to
|> afford the glitch in availability.

Consider the marked phrase in the context of a Poisson process failure
model, and laugh.  If you don't understand why I say that, I suggest
finding out the properties of the Poisson process!
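
For anyone who wants the property spelled out, here is a minimal sketch. It assumes failures arrive as a Poisson process with a constant rate, so the time to the next failure is exponentially distributed. The 3.5-month mean is a hypothetical figure picked to match the "three to four months" quoted above, and the function names are illustrative only. Under that assumption the process is memoryless: a freshly restarted server is exactly as likely to survive the next two months as one that has already been up for ten.

```python
import math

def survival(t, rate):
    """P(no failure during an interval of length t) under a Poisson process."""
    return math.exp(-rate * t)

def conditional_survival(s, age, rate):
    """P(surviving a further s, given the server has already survived `age`)."""
    return survival(age + s, rate) / survival(age, rate)

# Hypothetical rate: one failure per 3.5 months on average.
rate = 1.0 / 3.5

# Chance that a freshly restarted server lasts the next two months.
fresh = survival(2.0, rate)

# Chance that a server already up for ten months lasts the next two months.
aged = conditional_survival(2.0, 10.0, rate)

print(fresh, aged)  # the two values agree up to rounding
```

If the model holds, there is no point "before it's likely to go down": the hazard is the same at every moment, so a scheduled restart adds downtime without lowering the failure rate.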


Regards,
Nick Maclaren.


