Are the critiques in "All the things I hate about Python" valid?
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Mon Feb 26 22:25:35 EST 2018
On Mon, 26 Feb 2018 17:18:38 -0800, Rick Johnson wrote:
[...]
> So, for instance: if your birthday is January 25th 1969, the last second
> of the last day of your _first_ year is January 24th 1970 @ 11:59:59PM.
> And the last second of the last day of your _second_ year is January
> 24th 1971 @ 11:59:59PM. And so forth...
>
> Does this make sense?
Indeed it does, and frankly, the Racehorse scheme is better.
At least with the Racehorse scheme, you only need to update the database
once a year, at midnight on the new year, which hopefully is the quietest
time of the year for you. You can lock access to the database, run the
update, and hopefully be up and running again before anyone notices
anything other than a minor outage.
With your scheme, well, I can think of a few ways to do it, none of which
are good. A database expert might be able to think of some better ideas,
but you might:
Plan 1: run a separate scheduled job for each record, which does nothing but
advance the age by one at a certain time, then sleep for a year. If you
have ten million records, you need ten million scheduled jobs; I doubt
many scheduling systems can cope with that many jobs. (But I welcome
correction.)
Also, few scheduling systems guarantee that jobs will execute at
*precisely* the time you expect. If the system is down at the time a
job was scheduled to run, it may never run at all. So there is likely
to be a lag between when you want the records updated, and when they
actually are updated.
No, using scheduled jobs is fragile and expensive.
Plan 2: have a single job that does nothing but scan the database,
continuously in a loop, and if a record's birthdate is more than a year
in the past, and hasn't been updated in the last year, update the age by
one.
Actually, I think this sucks worse than the ten-million-scheduled-jobs
idea. Hopefully it will be obvious why this idea is so awful.
Plan 3: have a trigger that runs whenever a record is queried or
accessed. If the birthdate is more than a year in the past, and it has
been more than a year since the last access, then update the age.
This at least doesn't *entirely* suck. But if you're going to go to the
trouble of doing this on *every* access to the record, isn't it simpler
to just make the age a computed field that calculates the age when needed?
Computing the age is not that expensive, especially if you store the
birthdate as a timestamp in seconds. It's just a subtraction, maybe followed
by a division if you want the age in years. It hardly seems worthwhile
storing the age as a pre-computed integer if you then need a cunning
scheme to possibly update that integer on every access to the record.
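To make the computed-field idea concrete, here is a minimal sketch in
Python. The function names and the SECONDS_PER_YEAR constant are my own
illustration, not anything from a particular database; the seconds-based
version is only approximate, because calendar years are not all the same
length.

```python
from datetime import date, datetime, timezone

# Assumption: average Gregorian year length, used only for the
# approximate seconds-based calculation below.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60


def age_in_years(birthdate: date, today: date) -> int:
    """Whole years elapsed between birthdate and today (calendar-exact)."""
    years = today.year - birthdate.year
    # Take one off if this year's birthday has not happened yet.
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def approx_age_from_seconds(birth_ts: float, now_ts: float) -> int:
    """Approximate age: one subtraction followed by one division."""
    return int((now_ts - birth_ts) // SECONDS_PER_YEAR)


if __name__ == "__main__":
    born = date(1969, 1, 25)
    print(age_in_years(born, date(1970, 1, 24)))  # day before first birthday
    print(age_in_years(born, date(1970, 1, 25)))  # first birthday

    birth_ts = datetime(1969, 1, 25, tzinfo=timezone.utc).timestamp()
    now_ts = datetime(2018, 2, 26, tzinfo=timezone.utc).timestamp()
    print(approx_age_from_seconds(birth_ts, now_ts))
```

Either way, nothing has to be stored or updated: the age is derived from
the birthdate at the moment it is asked for.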
I think that Rick's "optimization" here is a perfect example of
pessimisation (making a program slower in the mistaken belief that you're
making it faster). To quote W.A. Wulf:
"More computing sins are committed in the name of efficiency (without
necessarily achieving it) than for any other single reason — including
blind stupidity."
--
Steve