Python "why" questions

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Thu Aug 19 14:04:49 EDT 2010


On Tue, 17 Aug 2010 19:15:54 -0700, Russ P. wrote:

> The convention of starting with zero may have had some slight
> performance advantage in the early days of computing, but the huge
> potential for error that it introduced made it a poor choice in the long
> run, at least for high-level languages.

People keep saying this, but it's actually the opposite. Fencepost errors 
and off-by-one errors are more common in languages that count from one.

A simple example: Using zero-based indexing, suppose you want to indent 
the string "spam" so it starts at column 4. How many spaces do you 
prepend?

0123456789
    spam

Answer: 4. Nice and easy and almost impossible to get wrong. To indent to 
position n, prepend n spaces.

Now consider one-based indexing, where the string starts at column 5:

1234567890
    spam

Answer: 5-1 = 4. People are remarkably bad at remembering to subtract the 
1, hence the off-by-one errors.
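To make the two cases concrete, here is a minimal Python sketch. The 
function names indent_zero_based and indent_one_based are made up for 
this illustration and are not from any library.

```python
def indent_zero_based(text, column):
    # Columns are numbered from 0, so the column number is the padding width.
    return " " * column + text


def indent_one_based(text, column):
    # Columns are numbered from 1, so the padding is one less than the column.
    # Forgetting the "- 1" is the classic off-by-one error.
    return " " * (column - 1) + text


zero = indent_zero_based("spam", 4)
one = indent_one_based("spam", 5)

# Both calls describe the same physical layout: four spaces, then "spam".
print(repr(zero))
print(zero == one)
```

Both functions produce the same string; only the one-based version needs 
the correction term.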

Zero-based counting doesn't entirely eliminate off-by-one errors, but 
combining it with intervals that are half-open on the right reduces them 
as much as possible.
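A rough sketch of why half-open intervals help, using ordinary Python 
slices. The helper closed_length is hypothetical and only there for 
comparison with the one-based closed convention.

```python
items = list("abcdefghij")
start, stop = 3, 7

# Half-open slice [start, stop): the length is simply stop - start.
chunk = items[start:stop]
print(len(chunk) == stop - start)

# Adjacent half-open slices share an endpoint and never overlap or leave a gap.
print(items[:start] + items[start:stop] + items[stop:] == items)


def closed_length(first, last):
    # One-based closed interval [first, last]: the length needs a "+ 1".
    return last - first + 1


# The same four items, described as positions 4 through 7 counting from one.
print(closed_length(4, 7) == len(chunk))
```

With half-open intervals no correction term appears anywhere; with closed 
intervals the "+ 1" has to be remembered every time.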

The intuitive one-based closed interval notation used in many natural 
languages is terrible: it encourages off-by-one errors. Quick: how many 
days are there between Friday 20th September and Friday 27th September 
inclusive? If you said seven, you fail. The answer is eight.
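The date question can be checked with the standard datetime module. The 
year 2002 is an assumption made only for this sketch, picked because 20th 
September fell on a Friday that year; the question itself names no year.

```python
from datetime import date

# Assumed year: 2002, in which 20 September was a Friday.
first = date(2002, 9, 20)
last = date(2002, 9, 27)

# Subtracting dates gives the half-open count: days in [first, last).
difference = (last - first).days

# Counting both endpoints (a closed interval) needs one more day.
inclusive_days = difference + 1

print(difference)
print(inclusive_days)
```

The subtraction gives seven, which is the tempting answer; counting both 
Fridays gives eight.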

One-based counting is the product of human intuition. Zero-based counting 
is the product of human reason.


-- 
Steven


