
I'd like input from anyone interested in the following syntax questions.

The current experimental Cython syntax for efficient ndarray indexing looks like this:

    cdef numpy.ndarray[numpy.int64, 2] arr

Issue 1: One can argue that "2" above looks like a length specifier to newcomers.

Issue 2: This clashes with some forms of C array notation (you can do "sizeof(int[3])" or "extern ... def myfunc(int[])" in Cython, and though such syntax is redundant it will stay in order not to break code).

Solving issue #1 helps with #2 as well, because without a single integer literal in it, it should look a lot less like an array declaration.

So, some quick proposals:

- Require a "D" suffix for the number of dimensions. This should make it very clear that the 2 stands for dimensionality and not shape.

    cdef numpy.ndarray[numpy.int64, 2D] arr

  Variations:
  a) numpy.ndarray[numpy.int64, 2D]
  b) numpy.ndarray[numpy.int64 2D]
  c) numpy.ndarray[2D numpy.int64]

  (Note that I want to leave the way open for adding mode="c" or mode="fortran", so using a "," seems good for that reason).

- Require an ndim keyword:

    cdef numpy.ndarray[numpy.int64, ndim=2]

- Other type of brackets. This boils down to (for various reasons) the <> brackets:

    cdef numpy.ndarray<numpy.int64, 2>

  (However, one should remember that Cython may want to support using C++ templates at some point).

-- Dag Sverre

2008/8/6 Dag Sverre Seljebotn <dagss@student.matnat.uio.no>:
> - Require an ndim keyword:
>     cdef numpy.ndarray[numpy.int64, ndim=2]

I'd definitely prefer a comma between the two, and an (optional) ndim keyword argument if possible.

Looking forward to hearing more!

Regards
Stéfan

Stéfan van der Walt wrote:
> 2008/8/6 Dag Sverre Seljebotn <dagss@student.matnat.uio.no>:
>> - Require an ndim keyword:
>>     cdef numpy.ndarray[numpy.int64, ndim=2]
> I'd definitely prefer a comma between the two, and an (optional) ndim keyword argument if possible.

I'm taking this as a vote in favor of this and against "2D" and <>? The keyword is already present in optional form (you can already do "ndarray[ndim=3, dtype=numpy.int64]" if you want to). So I meant to ask whether making it mandatory is a good solution so that the 2 doesn't look like a length specifier. (ndim=2 seems too long for me though, which is why I am pondering "2D".)

> Looking forward to hearing more!

It is looking bright, and some experimental support will be present in the next Cython release. What might be missing the first time around is support for complex numbers, records/structs and object dtypes. (As there is no native complex number support in Cython, at least not yet, you would need to handle a complex struct manually anyway. I was hoping to fix this but I won't have time within the GSoC.)

-- Dag Sverre

2008/8/6 Dag Sverre Seljebotn <dagss@student.matnat.uio.no>:
>>>     cdef numpy.ndarray[numpy.int64, ndim=2]
>> I'd definitely prefer a comma between the two, and an (optional) ndim keyword argument if possible.
> I'm taking this as a vote in favor of this and against "2D" and <>?
> The keyword is already present in optional form (you can already do "ndarray[ndim=3, dtype=numpy.int64]" if you want to). So I meant to ask whether making it mandatory is a good solution so that the 2 doesn't look like a length specifier.

I prefer having the `ndim` present -- immediately, the code becomes more transparent to a foreign eye.

> (ndim=2 seems too long for me though, which is why I am pondering "2D".)

The length doesn't bother me much (I prefer typing the extra 4 characters to make the code more readable). Personally, I'd prefer not to use 2D -- it's surprising to see a string like that without enclosing quotes (you'd never see it in Python, for example).

> It is looking bright and some experimental support will be present in the next Cython release. What might be missing first time around is support for complex numbers, records/structs and object dtypes.

That is great news; please keep us up to date. Thanks for all your hard work!

Regards
Stéfan
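
The remark that a bare 2D would never be seen in Python can be checked with the standard `ast` module. The sketch below is only an illustration: the helper name `parses_as_python` is made up for this example, and the strings are the declarations from this thread reduced to the part that resembles a Python expression (Cython's own parser is not involved).

```python
import ast


def parses_as_python(source: str) -> bool:
    """Return True if Python's own parser accepts `source` (hypothetical helper)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# A digit followed directly by a letter suffix is not a Python token.
print(parses_as_python("ndarray[int64, 2D]"))

# A plain integer in the brackets is ordinary Python subscripting.
print(parses_as_python("ndarray[int64, 2]"))

# Keyword arguments are ordinary Python when written in a call.
print(parses_as_python("ndarray(dtype=int64, ndim=2)"))
```

Only the first string is rejected, which is the surprise being described: "2D" reads like an unquoted string to anyone used to Python.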

Hello,

If I use the "-O" switch, I seem to get some test case failures, and finally a Windows message that "python.exe has encountered a problem and needs to close. We are sorry for the inconvenience.". Running the test suite via jepp (jepp.sourceforge.net) gives the same failures, plus 8 errors which it manages to report without crashing. Unfortunately the "-O" switch defaults to "on" for that interpreter. Am I doing something silly?

Thanks,
Jon

D:\wright\build_software\embed_numpy>c:\python25\python -O -c "import numpy; numpy.test()"
Numpy is installed in c:\python25\lib\site-packages\numpy
Numpy version 1.1.1
Python version 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]
  Found 18/18 tests for numpy.core.tests.test_defmatrix
[.snip.]
  Found 16/16 tests for numpy.testing.tests.test_utils
  Found 6/6 tests for numpy.tests.test_ctypeslib
........................................................................................................................
.............................F..........................................................................................
........................................................................................................................
........................................................................................................................
..............................................................F.........................................................
........................................................................................................................
..................................Ignoring "Python was built with Visual Studio 2003; extensions must be built with a compiler than can generate compatible binaries. Visual Studio 2003 was not found on this system. If you have Cygwin installed, you can try compiling with MingW32, by passing "-c mingw32" to setup.py." (one should fix me in fcompiler/compaq.py)
........................................................................................................................
........................................................................................................................
........................................................................................................................
........................................................................................................................
............................................F.F.F.F.FFFFF.........
======================================================================
FAIL: Convolve should raise an error for empty input array.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\core\tests\test_regression.py", line 629, in check_convolve_empty
    self.failUnlessRaises(AssertionError,np.convolve,[],[1])
AssertionError: AssertionError not raised

======================================================================
FAIL: Convolve should raise an error for empty input array.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\core\tests\test_regression.py", line 629, in check_convolve_empty
    self.failUnlessRaises(AssertionError,np.convolve,[],[1])
AssertionError: AssertionError not raised

======================================================================
FAIL: Test two arrays with different shapes are found not equal.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 48, in test_array_diffshape
    self._test_not_equal(a, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

======================================================================
FAIL: Test two different array of rank 1 are found not equal.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 34, in test_array_rank1_noteq
    self._test_not_equal(a, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

======================================================================
FAIL: Test two arrays with different shapes are found not equal.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 48, in test_array_diffshape
    self._test_not_equal(a, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

======================================================================
FAIL: Test two different array of rank 1 are found not equal.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 34, in test_array_rank1_noteq
    self._test_not_equal(a, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

======================================================================
FAIL: Test rank 1 array for all dtypes.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 67, in test_generic_rank1
    foo(t)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 63, in foo
    self._test_not_equal(c, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

======================================================================
FAIL: Test rank 3 array for all dtypes.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 86, in test_generic_rank3
    foo(t)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 82, in foo
    self._test_not_equal(c, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

======================================================================
FAIL: Test arrays with nan values in them.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 100, in test_nan_array
    self._test_not_equal(c, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

======================================================================
FAIL: Test record arrays.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 126, in test_recarrays
    self._test_not_equal(c, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

======================================================================
FAIL: Test two arrays with different shapes are found not equal.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 111, in test_string_arrays
    self._test_not_equal(c, b)
  File "C:\Python25\Lib\site-packages\numpy\testing\tests\test_utils.py", line 20, in _test_not_equal
    raise AssertionError("a and b are found equal but are not")
AssertionError: a and b are found equal but are not

----------------------------------------------------------------------
Ran 1300 tests in 3.078s

FAILED (failures=11)
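
One possible reading of these failures, not confirmed anywhere in this thread: Python's optimized mode removes `assert` statements, so any check that relies on a bare `assert` to raise AssertionError silently stops firing. Whether the NumPy 1.1.1 code paths above actually rely on `assert` is an assumption here. The sketch below reproduces only the general effect, using the `optimize` argument of the built-in `compile()` in place of the `-O` command-line switch; the helper name `raises_assertion` and the sample statement are made up for illustration.

```python
def raises_assertion(source: str, optimize: int) -> bool:
    """Compile and run `source`; report whether it raised AssertionError.

    optimize=0 keeps assert statements, optimize=1 drops them (like -O).
    """
    code = compile(source, "<example>", "exec", optimize=optimize)
    try:
        exec(code, {})
    except AssertionError:
        return True
    return False


# Hypothetical stand-in for an input check written as a bare assert.
SOURCE = "assert len([]) > 0, 'empty input'"

print(raises_assertion(SOURCE, optimize=0))  # assert is kept and fires
print(raises_assertion(SOURCE, optimize=1))  # assert is stripped, nothing is raised
```

If this is what is happening, a test that expects AssertionError from such a check would report "AssertionError not raised" only under `-O`, which matches the shape of the output above. It would not by itself explain the interpreter crash.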

On Wed, Aug 06, 2008 at 10:35:06AM +0200, Dag Sverre Seljebotn wrote:
> Stéfan van der Walt wrote:
>> 2008/8/6 Dag Sverre Seljebotn <dagss@student.matnat.uio.no>:
>>> - Require an ndim keyword:
>>>     cdef numpy.ndarray[numpy.int64, ndim=2]
>> I'd definitely prefer a comma between the two, and an (optional) ndim keyword argument if possible.
> I'm taking this as a vote in favor of this and against "2D" and <>?
> The keyword is already present in optional form (you can already do "ndarray[ndim=3, dtype=numpy.int64]" if you want to). So I meant to ask whether making it mandatory is a good solution so that the 2 doesn't look like a length specifier.
> (ndim=2 seems too long for me though, which is why I am pondering "2D".)

I am definitely +1 on the ndim keyword argument. Make it compulsory if you have to. I don't really like 2D. The nice thing about the ndim keyword argument is that it means we are still using valid Python syntax, which makes it easier for the user to grok, and which may come in handy one day, as we can use standard Python tools on the expression. And I don't think that "ndim=2" is too long.

Gaël

Gael Varoquaux wrote:
> On Wed, Aug 06, 2008 at 10:35:06AM +0200, Dag Sverre Seljebotn wrote:
>> Stéfan van der Walt wrote:
>>> 2008/8/6 Dag Sverre Seljebotn <dagss@student.matnat.uio.no>:
>>>> - Require an ndim keyword:
>>>>     cdef numpy.ndarray[numpy.int64, ndim=2]

Just out of curiosity: what is the problem with using parentheses for this purpose?

    cdef numpy.ndarray(dtype=numpy.int64, ndim=2)

-Travis

Travis E. Oliphant wrote:
> Gael Varoquaux wrote:
>> On Wed, Aug 06, 2008 at 10:35:06AM +0200, Dag Sverre Seljebotn wrote:
>>> Stéfan van der Walt wrote:
>>>> 2008/8/6 Dag Sverre Seljebotn <dagss@student.matnat.uio.no>:
>>>>> - Require an ndim keyword:
>>>>>     cdef numpy.ndarray[numpy.int64, ndim=2]
> Just out of curiosity: what is the problem with using parentheses for this purpose?
>     cdef numpy.ndarray(dtype=numpy.int64, ndim=2)

There's no technical problem, but we thought that it looked too much like constructor syntax -- it looks like an ndarray is constructed. If one is new to Cython this is what you will assume; at least the [] makes you stop and think more. (For clarity, I should mention that no ndarray is constructed -- you'd do

    cdef np.ndarray(dtype=np.int64, ndim=1) buf = \
        np.array([1,2,3], dtype=np.int64)

to construct a new array and get buffer access to it right away.)

Also, the argument list on the type is not defined by the ndarray constructor but corresponds to your buffer PEP, and I think using () obscures this fact. For NumPy this is not such a big problem as the argument lists will be rather similar, but for libraries other than NumPy that support the buffer PEP the argument list may diverge more:

    cdef MyJpegImage(dtype=unsigned char, ndim=2) jpg = \
        MyJpegImage("file.jpg")

(and even worse if the keywords are not mandatory).

Another example: One can currently do (dropping keywords)

    cdef object[double, 4, "strided"] buf = ...

to get generic 4D strided buffer access that doesn't make assumptions about the class, but the Python object() takes no parameters...

(BTW, there is a mechanism so that simpler buffer exporters like MyJpegImage can declare default buffer options, so one is unlikely to see the above in practice -- if the type always has the same dtype and ndim, it is enough to do

    cdef MyJpegImage jpg = ...

For ndarray I use the same mechanism to make mode="strided" the default. But I should start writing documentation rather than making this email longer :-) )

-- Dag Sverre

Dag Sverre Seljebotn wrote:
> Travis E. Oliphant wrote:
>> Gael Varoquaux wrote:
>>> On Wed, Aug 06, 2008 at 10:35:06AM +0200, Dag Sverre Seljebotn wrote:
>>>> Stéfan van der Walt wrote:
>>>>> 2008/8/6 Dag Sverre Seljebotn <dagss@student.matnat.uio.no>:
>>>>>> - Require an ndim keyword:
>>>>>>     cdef numpy.ndarray[numpy.int64, ndim=2]
>> Just out of curiosity: what is the problem with using parentheses for this purpose?
>>     cdef numpy.ndarray(dtype=numpy.int64, ndim=2)
> There's no technical problem, but we thought that it looked too much like constructor syntax -- it looks like an ndarray is constructed. If one is new to Cython this is what you will assume; at least the [] makes you stop and think more.
> (For clarity, I should mention that no ndarray is constructed -- you'd do
>     cdef np.ndarray(dtype=np.int64, ndim=1) buf = \
>         np.array([1,2,3], dtype=np.int64)
> to construct a new array and get buffer access to it right away.)

I realize that I've given too little context for this discussion. This tends to get rather long-winded, but I'll provide it for whoever is interested.

What I am doing is supporting general syntax candy for the buffer PEP (and a backwards compatibility layer for earlier Python versions) so that

    cdef object[float, 2] buf = input

acquires a buffer and lets you use it through the indexing operator with 2 native int indices, while letting other operations (including any indexing that doesn't have exactly 2 ints) fall through to the underlying object.

The most explicit syntax would be

    cdef ndarray arr = input
    cdef buffer[float, 2] buf = cython.getbuffer(arr)
    arr += 4.3
    buf[3,2] = 2

But that is very unfriendly to use. A step down in explicitness is

    cdef buffer(ndarray, float, 2) arr = input
    arr += 4.3   # falls through to ndarray type
    arr[3,2] = 2 # uses buffer

But overall, just adding something to the end of ndarray and making it completely transparent seemed most usable. (Another option would be "cdef ndarray,buffer(float,2) arr = ...", i.e. arr "has two types".)

-- Dag Sverre
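
For readers without a Cython build at hand, the buffer-PEP idea behind `object[float, 2]` (an element type plus a number of dimensions, with indexing that goes straight to the underlying memory) can be seen from plain Python through the built-in `memoryview`. This is only an analogy sketched under assumptions: it uses the standard `array` module as a stand-in for an ndarray, and it says nothing about how Cython generates its code.

```python
from array import array

# Six C doubles in one flat, contiguous block (a stand-in for array data).
data = array("d", [0.0] * 6)

# View the same memory as a 2 x 3 block of doubles. memoryview.cast needs a
# byte view as an intermediate step when attaching a new shape.
view = memoryview(data).cast("B").cast("d", shape=[2, 3])

# The buffer itself carries the element format and the dimensionality,
# the two pieces of information given in the type brackets.
print(view.format, view.ndim, view.shape, view.strides)

# Indexing with exactly ndim integers addresses one element of the buffer.
view[1, 2] = 2.0

# The write is visible in the flat storage: row 1, column 2 -> 1 * 3 + 2.
print(data[5])
```

The `ndim` reported by the view is the quantity the thread is trying to name clearly; the shape (2, 3) is a separate runtime property, which is why a bare "2" in the declaration can be misread.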

Dag Sverre Seljebotn wrote:
>     cdef numpy.ndarray[numpy.int64, ndim=2]

+1 it's very clear what this means. I think the keyword should be required.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov

As a numpy user not familiar with CPython, I find the square brackets very confusing.

On 8/6/08, Christopher Barker <Chris.Barker@noaa.gov> wrote:
> Dag Sverre Seljebotn wrote:
>>     cdef numpy.ndarray[numpy.int64, ndim=2]
> +1 it's very clear what this means. I think the keyword should be required.
> -Chris

Participants (7): Christopher Barker, Dag Sverre Seljebotn, Gael Varoquaux, Jon Wright, Stéfan van der Walt, Tom Denniston, Travis E. Oliphant