On 12 October 2016 at 11:16, Steven D'Aprano <steve@pearwood.info> wrote:
On Wed, Oct 12, 2016 at 12:25:16AM +0000, Elliot Gorokhovsky wrote:
Regarding generalization: the general technique for special-casing is to substitute all type checks with 1 or 0 by applying the type assumption you're making. That's the only way to guarantee it's safe and compliant.
I'm confused -- I don't understand how *removing* type checks can possibly guarantee the code is safe and compliant.
It's all very well and good when you are running tests that meet your type assumption, but what happens if they don't? If I sort a list made up of (say) mixed int and float (possibly including subclasses), does your "all type checks are 1 or 0" sort segfault? If not, why not? Where's the safety coming from?
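For concreteness, Elliot's "substitute all type checks with 1 or 0" amounts to something like the sketch below. This is a simplified illustration, not the actual patch: the function names are hypothetical, though the CPython calls (PyFloat_CheckExact, PyFloat_AS_DOUBLE, PyObject_RichCompareBool) are real.

    #include <Python.h>

    /* Generic compare: the types of both operands are re-checked on
     * every single call. */
    static int
    lt_generic(PyObject *a, PyObject *b)
    {
        if (PyFloat_CheckExact(a) && PyFloat_CheckExact(b))
            return PyFloat_AS_DOUBLE(a) < PyFloat_AS_DOUBLE(b);
        return PyObject_RichCompareBool(a, b, Py_LT);  /* -1 on error */
    }

    /* Under the assumption "every element is exactly a float", each
     * PyFloat_CheckExact(...) above is known to be 1, the branch folds
     * away, and only the raw double comparison remains. */
    static int
    lt_float_assumed(PyObject *a, PyObject *b)
    {
        return PyFloat_AS_DOUBLE(a) < PyFloat_AS_DOUBLE(b);
    }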
My understanding is that the code does a pre-check that all the elements of the list are the same type (float, for example). This is a relatively quick test (O(n) pointer comparisons). If everything *is* a float, then an optimised comparison routine is used that skips all the type checks and goes straight to a float comparison (a single machine op). Because a typical sort does O(n log n) comparisons, more than the O(n) cost of the pre-check, this is a win. And because the type checks needed in rich comparison are much more expensive than a pointer check, it's a *big* win.

What I'm *not* quite clear on is why Python 3's change to reject comparisons between unrelated types makes this optimisation possible. Surely you have to check either way?

It's not that it's a particularly important question - if the optimisation works, it's not that big a deal what triggered the insight. It's just that I'm not sure whether there's some other point that I've not properly understood.

Paul
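The safety Steven asks about comes from that O(n) pre-check: the "unsafe" compare is only ever selected after the whole list has been proven homogeneous. A minimal sketch of the dispatch Paul describes (hypothetical helper names; the real patch lives inside CPython's list sort):

    #include <Python.h>

    typedef int (*lt_func)(PyObject *, PyObject *);

    /* Safe only because pick_compare() below has already proven that
     * every element is exactly a float. */
    static int
    lt_float_unsafe(PyObject *a, PyObject *b)
    {
        return PyFloat_AS_DOUBLE(a) < PyFloat_AS_DOUBLE(b);
    }

    /* Fallback: full rich-comparison dispatch on every call. */
    static int
    lt_generic(PyObject *a, PyObject *b)
    {
        return PyObject_RichCompareBool(a, b, Py_LT);  /* -1 on error */
    }

    /* O(n) pre-check: one pointer comparison per element.  The sort
     * then performs O(n log n) comparisons with whichever function is
     * returned, so each check removed from the hot path is paid for
     * many times over. */
    static lt_func
    pick_compare(PyObject **items, Py_ssize_t n)
    {
        for (Py_ssize_t i = 0; i < n; i++)
            if (Py_TYPE(items[i]) != &PyFloat_Type)
                return lt_generic;      /* mixed types: stay safe */
        return lt_float_unsafe;         /* homogeneous: drop the checks */
    }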