Does zero based indexing drive anyone else crazy?
Hi Numpeans,
I have been working on a web-based scientific application for about a year, most of which had been written in either Matlab or SPLUS/R. My task has been to make it "driveable" through an online interface (if anyone cares about mortality forecasting, drop me an email and we can chat about it offline). I chose Python/Numpy for the language because Python and Numpy are both so full featured and easy to work with (except for one little thing...), and neither Matlab nor R could gracefully deal with CGI programming (misguided propaganda notwithstanding).
However.... I have spent a huge amount of my time fixing and bending my head around off-by-one errors caused by trying to index matrices using 0 to n-1. The problem is two-fold (threefold if you count my limited IQ...). First, all the formulas in the literature use 1 to n indexing, with a few small exceptions. Second, and more important, it is far more natural to program if the indices are aligned with the counts of the elements (I think there is a way to express that idea in modern algebra but I can't recall it). This lets you say "how many are there? Three--ok, grab the third one and do whatever to it" etc. Or "how many? zero--ok don't do anything". With zero-based indexing, you are always translating between counts and indices, but such translation is never a problem in one-based indexing.
Given the long history of python and its ancestry in C (for which zero based indexing made lots of sense since it dovetailed with thinking in memory offsets in systems programming), there is probably nothing to be done now. I guess I just want to vent, but also to ask if anyone has found any way to deal with this issue in their own scientific programming. Or maybe I am the only one with this problem, and if I were a real programmer I would translate into zero indexing without even noticing....
Anyway, thanks for listening...
On Sun, 2 Jul 2006, Webb Sprague apparently wrote:
I have spent a huge amount of my time fixing and bending my head around off-by-one errors caused by trying to index matrices using 0 to n-1.
I come from GAUSS so I am sympathetic, but in the end zero-based indexing is usually great. Anyway, ideally you will rely on vector/matrix operations rather than constantly tracking indices. fwiw, Alan
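A minimal sketch of that advice, assuming NumPy is imported as np. The arrays and their values are made up for illustration (hypothetical death counts and exposures for five age groups); they are not from the original application.

```python
import numpy as np

# Hypothetical data: death counts and population exposures for five age groups.
deaths = np.array([12.0, 8.0, 15.0, 30.0, 55.0])
exposure = np.array([1000.0, 1200.0, 1100.0, 900.0, 700.0])

# Index-tracking version: the loop bounds are where off-by-one errors creep in.
rates_loop = np.empty(len(deaths))
for i in range(len(deaths)):
    rates_loop[i] = deaths[i] / exposure[i]

# Whole-array version: no index appears, so the indexing base never matters.
rates_vec = deaths / exposure

# Differences between neighbouring age groups, written with slices only.
diffs = rates_vec[1:] - rates_vec[:-1]
```

Both versions produce the same rates; the second simply never mentions an index, which is the sense in which vector operations sidestep the 0-versus-1 question.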
On Sun, 2 Jul 2006 16:36:14 -0700
"Webb Sprague"
Given the long history of python and its ancestry in C (for which zero based indexing made lots of sense since it dovetailed with thinking in memory offsets in systems programming), there is probably nothing to be done now. I guess I just want to vent, but also to ask if anyone has found any way to deal with this issue in their own scientific programming.
The mathematician John Conway, in his book "The Book of Numbers", has a brilliant discussion of just this issue. It does indeed make sense mathematically. Simon.
Webb Sprague wrote:
it is far more natural to program if the indices are aligned with the counts of the elements
I suggest that it's only more natural if that's what you're used to -- i.e. you come from other languages that do it that way. I fairly rarely get bitten by indexing from 1 rather than zero, but I save a lot of errors that I used to get in MATLAB by the way python does slices:
    len(a[i:j]) == j - i
and:
    l[:j] + l[j:] == l
or:
    r_[a[:i],a[i:]] == a
for numpy arrays. I suppose you could have one-indexing and the python slicing, but I think that would be even more error prone.
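The slice identities above can be checked with a short sketch. It assumes NumPy is imported as np; the array contents and the values of i and j are arbitrary choices for illustration, and the identities are only expected to hold for 0 <= i <= j <= len(a).

```python
import numpy as np

a = np.arange(10)        # example array: 0, 1, ..., 9
l = list(range(10))      # the same values as a plain Python list
i, j = 3, 7              # arbitrary cut points with 0 <= i <= j <= len(a)

# A half-open slice a[i:j] holds exactly j - i elements.
slice_len_ok = len(a[i:j]) == j - i

# Splitting a list at j and gluing the halves back together loses nothing.
list_split_ok = l[:j] + l[j:] == l

# Same idea for arrays; np.r_ concatenates the two pieces.
# `==` on arrays compares elementwise, so np.array_equal gives a single bool.
array_split_ok = np.array_equal(np.r_[a[:i], a[i:]], a)
```

All three flags come out True here. Note that the split point appears once on each side with no +1 or -1 adjustment, which is the property the post credits with preventing errors.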
zero based indexing made lots of sense since it dovetailed with thinking in memory offsets in systems programming
it also dovetails nicely into using an array to represent a grid of values:
    i = (X - MinX) / deltaX
rather than
    i = (X - MinX) / deltaX + 1
    X = i*deltaX
rather than
    X = (i-1)*deltaX
In Fortran, you can choose where you want your array indexing to start, and I found myself starting with zero more often than 1, and I was never a C programmer.
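A small sketch of the grid mapping, in plain Python. The names min_x, delta_x, index_of and coord_of are hypothetical, the grid values are made up, and rounding to the nearest grid point is an assumption (the post only writes the division). The reverse mapping adds min_x back, which the shorthand X = i*deltaX leaves implicit.

```python
# Hypothetical regular grid: points at 10.0, 10.5, 11.0, ...
min_x = 10.0
delta_x = 0.5

def index_of(x):
    """Zero-based index of the grid point nearest to coordinate x."""
    return int(round((x - min_x) / delta_x))

def coord_of(i):
    """Coordinate of the grid point with zero-based index i."""
    return min_x + i * delta_x

def index_of_one_based(x):
    """The same lookup with one-based indices needs an extra + 1."""
    return index_of(x) + 1

def coord_of_one_based(i):
    """...and the reverse mapping needs a matching - 1."""
    return min_x + (i - 1) * delta_x
```

With these definitions, index_of(12.0) is 4 and coord_of(4) is 12.0, while the one-based pair has to carry the +1 and -1 corrections through every conversion.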
I guess I just want to vent, but also to ask if anyone has found any way to deal with this issue in their own scientific programming.
You'll get used to it. There are disadvantages either way, but after switching from primarily Matlab to primarily Python, I like zero-based indexing better. Perhaps you will too.
-Chris
--
Christopher Barker, Ph.D.
Oceanographer
NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception
Chris.Barker@noaa.gov
On 7/2/06, Webb Sprague
Given the long history of python and its ancestry in C (for which zero based indexing made lots of sense since it dovetailed with thinking in memory offsets in systems programming), there is probably nothing to be done now. I guess I just want to vent, but also to ask if anyone has found any way to deal with this issue in their own scientific programming.
Aha! Guido himself prefers starting the index at one. Here's a code snippet from a fun article he wrote about optimizing python code:
    import time
    def timing(f, n, a):
        print f.__name__,
        r = range(n)
        t1 = time.clock()
        for i in r:
            f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a); f(a)
        t2 = time.clock()
        print round(t2-t1, 3)
http://www.python.org/doc/essays/list2str/
Notice he chose t1 and t2 instead of t0 and t1. QED
participants (5)
- Alan G Isaac
- Christopher Barker
- Keith Goodman
- Simon Burton
- Webb Sprague