In my view, the most important reason to prefer 1-based indexing over 0-based indexing is compatibility. For numerical work, some of the languages I use or have used are Matlab, Mathematica, Maple and Fortran. These are all 1-indexed. (C is by nature 0-indexed because it is so close to the machine architecture, but with a little not-entirely-clean pointer manipulation you can easily make 1-indexed arrays and matrices.) Obviously Python can't be compatible with these languages in a strict sense, but like most people who do some programming work, I've built up a library of commonly used routines specific to my work. In general it's a trivial matter to translate numerical routines from one language to another if translation is just a matter of substituting one set of syntactical symbols and function names for another. However, it can be damn tricky to convert 1-indexed code to 0-indexed code or vice versa without introducing any errors, believe me! (Yes, it's possible to call nearly any language from nearly any other language these days, so in theory you don't have to recode, but there are lots of reasons why recoding is often the preferable route.)
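To make the translation hazard concrete, here is a small sketch of my own (the function names are made up for illustration, and the running-sum recurrence is just a stand-in for a real numerical routine). The 1-based pseudo-code is "for k = 2, ..., n: x(k) = x(k) + x(k-1)". Copying the loop bounds verbatim into Python runs without any error but gives the wrong answer, which is exactly what makes this kind of bug hard to spot.

```python
def running_sum_shifted(x):
    """Keep the 1-based loop variable k and shift every subscript by one."""
    y = list(x)
    n = len(y)
    for k in range(2, n + 1):      # k = 2, ..., n as in the pseudo-code
        y[k - 1] += y[k - 2]       # x(k) -> y[k-1], x(k-1) -> y[k-2]
    return y


def running_sum_zero_based(x):
    """Re-derive the loop directly in 0-based terms."""
    y = list(x)
    for i in range(1, len(y)):     # i = 1, ..., n-1
        y[i] += y[i - 1]
    return y


def running_sum_naive(x):
    """Bounds copied verbatim from the 1-based text: silently wrong."""
    y = list(x)
    for k in range(2, len(y)):     # starts one element too late
        y[k] += y[k - 1]
    return y


data = [1, 2, 3, 4]
print(running_sum_shifted(data))     # [1, 3, 6, 10]
print(running_sum_zero_based(data))  # [1, 3, 6, 10]
print(running_sum_naive(data))       # [1, 2, 5, 9]
```

The first two versions agree. The third never updates the second element, and the error then propagates through the rest of the array.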
You aren't the first to raise this issue. I wouldn't mind if the user had the option, but then again I tend to prefer the flag-for-every-feature approach, which others with more computing experience than me have said leads to problems due to the presence of many different ways to do things and unforeseen interactions. I can definitely see the coding advantage when implementing algorithms whose notation is already 1-based. I have come across this myself, in fact just yesterday, when I was coding up the Padé approximation to the matrix exponential using the pseudo-code algorithm given by Golub and Van Loan in their "Matrix Computations" book. It seems to me that it would be a lot of work to retrofit this feature into the code now (there would be a million places to check where the code inherently assumes 0-based indexing). It would also, as you mention, be inconsistent with Python. A general approach would be to inherit from UserArray for your codes and reimplement the __getitem__ and __getslice__ methods. Your objects should still be able to be passed to many of the routines which expect arrays (because under the covers one of the first things the array_from_object C code does is check whether the object has an __array__ method and call it). Note that this will not copy data around, so there is minimal overhead. But you would have to take care to wrap the returned object back into an array_object. (Maybe something could be done here... Hmmm.)
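A rough sketch of the wrapper idea, under several assumptions: it wraps a plain Python list rather than inheriting from UserArray (so it runs without any array package), the class name OneBasedList is made up, and it chooses Matlab-style inclusive slices, which is a design decision and not something settled above. In current Python, slices arrive in __getitem__ as slice objects, so a separate __getslice__ is not needed. Note how the slice result is wrapped back into the 1-based type, which is the "take care to wrap the returned object" step.

```python
class OneBasedList:
    """Hypothetical 1-indexed wrapper around a list (illustration only)."""

    def __init__(self, data):
        self._data = list(data)

    def __len__(self):
        return len(self._data)

    @staticmethod
    def _shift(i):
        # Map a 1-based position to a 0-based one; 0 is not a valid index.
        if i == 0:
            raise IndexError("index 0 is out of range for a 1-based sequence")
        return i - 1 if i > 0 else i   # negative indices keep Python's meaning

    def __getitem__(self, key):
        if isinstance(key, slice):
            if key.step not in (None, 1):
                raise ValueError("this sketch supports unit-step slices only")
            for bound in (key.start, key.stop):
                if bound is not None and bound < 1:
                    raise IndexError("slice bounds must be >= 1 in this sketch")
            start = None if key.start is None else key.start - 1
            # An inclusive 1-based stop k equals an exclusive 0-based stop k.
            stop = key.stop
            # Wrap the result so it stays 1-indexed.
            return OneBasedList(self._data[start:stop])
        return self._data[self._shift(key)]

    def __setitem__(self, i, value):
        self._data[self._shift(i)] = value

    def tolist(self):
        return list(self._data)


a = OneBasedList([10, 20, 30, 40, 50, 60])
print(a[1])                 # 10
print(a[3:6].tolist())      # [30, 40, 50, 60]
```

A real version built on an array type would also need an __array__ method and would have to cover multi-dimensional indexing, strides and negative slice bounds, which this sketch deliberately leaves out.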
By the way, what is leave-last-one-out slicing? Is it a[:-1] or is it a[0,...] or is it something else?
I meant the fact that a[3:6] returns elements a[3], a[4], a[5] but NOT a[6]. I'm sorry for using my own poorly-worded term. I can't remember what other Pythonistas call it.
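For the record, a short demonstration of that slicing behaviour with a plain Python list (the variable names are arbitrary). It also shows that a[:-1] is a different thing: it drops the last element of the whole sequence.

```python
a = list(range(10))          # [0, 1, ..., 9], so a[i] == i

part = a[3:6]
print(part)                  # [3, 4, 5]: a[6] is not included

# The length of a slice is stop - start ...
assert len(part) == 6 - 3

# ... and adjacent slices tile the sequence with no overlap or gap.
assert a[:3] + a[3:6] + a[6:] == a

# a[:-1] is everything except the final element.
print(a[:-1])                # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```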