
Hi Travis,

The pep contains this sample:

"""
Nested array
    ::

        struct {
             int ival;
             double data[16*4];
        }
        """i:ival:
           (16,4)d:data:
        """
"""

I think it is wrong and must be changed to the following; is this correct?

"""
Nested array
    ::

        struct {
             int ival;
             double data[16][4];
        }
        """i:ival:
           (16,4)d:data:
        """
"""

Thomas

Thomas Heller wrote:
I responded off list to this email and wanted to summarize my response for others to peruse. Basically, the answer is that the struct syntax proposed for multi-dimensional arrays is not intended to mimic how the C-compiler handles statically defined C-arrays (i.e. the pointer-to-pointers style of multi-dimensional arrays). It is intended to handle the contiguous-block-of-data style of multi-dimensional arrays that NumPy uses. I wanted to avoid 2-d static arrays in the examples because it gets confusing and AFAIK the layout of the memory for a double data[16][4] is the same as data[16*4]. The only difference is how the C-compiler translates data[4][3] and data[4]. The intent of the struct syntax is to handle describing memory. The point is not to replicate how the C-compiler deals with statically defined N-D arrays. Thus, even though the struct syntax allows *communicating* the intent of a contiguous block of memory inside a structure as an N-d array, the fundamental memory block is the equivalent of a 1-d array in C. So, I think the example is correct (and intentional). -Travis O.
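The "same memory, different interpretation" point can be sketched with Python 3's memoryview, which implements the buffer interface this PEP describes. This is only an illustration and postdates the thread; the names flat and view are chosen here, not taken from the PEP.

```python
from array import array

# A flat, contiguous block of 64 doubles, comparable to `double data[16*4]`.
flat = array("d", range(64))

# View the same memory as a 16x4 array; nothing is copied.
# cast() can only reshape to or from a byte format, hence cast("B") first.
view = memoryview(flat).cast("B").cast("d", [16, 4])

row, col = 4, 3
# C order: the last index varies fastest, so (row, col) maps to row * 4 + col.
assert view[row, col] == flat[row * 4 + col]
print(view.shape, view[row, col])
```

The 2-d shape lives entirely in the view's description; the underlying block stays a 1-d run of doubles.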

Travis, Perhaps you can add this rationale to the PEP? It seems helpful and might stave off future confusion. --Guido On Jan 23, 2008 8:17 AM, Travis Oliphant <oliphant.travis@ieee.org> wrote:
-- --Guido van Rossum (home page: http://www.python.org/~guido/)

Travis Oliphant wrote: [...]
Sorry, I do not think so. If you use a 2-d array in the example, you must describe it correctly. The difference between this pep and the old buffer interface is that the pep allows to describe both how the compiler sees the memory block plus the size and layout of the memory block, while the old buffer interface only describes single-segment memory blocks. And 'double data[16][4]' *is* a single memory block containing a 2-d array, and *not* an array of pointers.

---

Here is another typo (?) in the pep; I think it should be changed:

Index: pep-3118.txt
===================================================================
--- pep-3118.txt (revision 60037)
+++ pep-3118.txt (working copy)
@@ -338,7 +338,7 @@

 ``len``
     the total bytes of memory the object uses.  This should be the
-    same as the product of the shape array multiplied by the number of
+    same as the length of the shape array multiplied by the number of
     bytes per item of memory.

 ``readonly``

After all, imo there's a lot to do to fully implement the pep for python 2.6.

Thomas

Thomas Heller wrote:
While the original could be reworded ("product of the elements of the shape array"), the amendment is incorrect. For a shape array like {4, 5, 6}, the number of bytes is (4*5*6)*bytes_per_item, not 3*bytes_per_item.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
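The arithmetic can be checked against Python 3's memoryview, used here only as a convenient illustration of the ``len`` rule (its nbytes attribute plays that role). The shape and variable names below are chosen for this sketch.

```python
import math
import struct

shape = [4, 5, 6]
itemsize = struct.calcsize("d")  # bytes per native C double

# Allocate exactly product(shape) * itemsize bytes and view them as 4x5x6.
buf = bytearray(math.prod(shape) * itemsize)
view = memoryview(buf).cast("d", shape)

total_bytes = math.prod(view.shape) * view.itemsize  # the original wording
wrong_bytes = len(view.shape) * view.itemsize        # the proposed amendment

assert view.nbytes == total_bytes
assert view.nbytes != wrong_bytes
```

Here len(view.shape) is the number of dimensions (3), which has nothing to do with how many bytes the block occupies.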

Thomas Heller wrote:
I don't understand what you mean by "must describe it correctly". The size and layout of the memory block description of the PEP is not supposed to be dependent on the C-compiler. It should also be able to define memory as used in Fortran, C#, a file, or whatever. So, I don't understand the insistence that the example use C-specific 2-d array syntax. The example as indicated is correct. It is true that the 2-d nature of the block of data is only known by Python in this example. You could argue that it would be more informative by showing the C-equivalent structure as a 2-d array. However, it would also create the possibility of confusion by implying an absolute relationship between the C-compiler and the type description. Your insistence that the example is incorrect makes me wonder what point is not being communicated between us. Clearly there is overlap between C structure syntax and the PEP syntax, but the PEP type syntax allows for describing data in ways that the C compiler doesn't. I'd rather steer people away from statically defined arrays in C and don't want to continually explain how they are subtly different. My perception is that you are seeing too much of a connection between the C-compiler and the PEP description of memory. Perhaps that's not it, and I'm missing something else. Best regards, -Travis O.

On 2/11/08, Travis Oliphant <oliphant.travis@ieee.org> wrote:
Travis, all this makes me believe that (perhaps) the 'format' specification in the new buffer interface is missing the 'C' or 'F' ordering in the case of a contiguous block. Am I missing something? Or should we always assume 'C' ordering?

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

Lisandro Dalcin wrote:
Not strictly necessary, since you can always reverse the indices when dealing with Fortran if need be. You would have to do that anyway when accessing the array from C, so it's probably better to have the description always match the C ordering. (Makes things a bit harder for those people writing their Python extensions in Fortran, of course. :-) -- Greg

Lisandro Dalcin wrote:
There is an ability to specify 'F' for the overall buffer. In the description of each element, however (i.e. in the struct-syntax), the multi-dimensional character is always communicated in 'C' order (last dimension varies the fastest). I thought about adding the ability to specify the multi-dimensional order as 'F' in the struct-syntax for each element, but decided against it, as you can simulate 'F' order by thinking of the array in transposed fashion: i.e. your 3x5 Fortran-order array is really a 5x3 C-order array. Of course, the same is true on the larger scale when we are talking about multi-dimensional arrays of "elements," but on that level connecting with Fortran libraries is much more common and so we have found the help useful in NumPy.

-Travis O.
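The transpose argument can be checked with plain index arithmetic. The helper names c_offset and f_offset are hypothetical, written for this sketch only; they compute the flat element offset for each ordering.

```python
def c_offset(index, shape):
    """Flat offset in C order (last index varies fastest)."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def f_offset(index, shape):
    """Flat offset in Fortran order (first index varies fastest)."""
    offset = 0
    for i, n in zip(reversed(index), reversed(shape)):
        offset = offset * n + i
    return offset


# Element (i, j) of a 3x5 Fortran-order array sits at the same offset as
# element (j, i) of a 5x3 C-order array.
same = all(
    f_offset((i, j), (3, 5)) == c_offset((j, i), (5, 3))
    for i in range(3)
    for j in range(5)
)
assert same
```

Reversing both the shape and the indices is all that separates the two conventions, which is why a C-only description in the struct-syntax loses no expressive power.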

Travis Oliphant wrote:
Just to check on something here -- does the C standard guarantee that int a[16][4]; and int b[64]; have the same memory layout, or is it allowed to insert padding at the ends of the rows or something? If they are guaranteed to have the same layout, then I'd agree that the example is correct, but perhaps somewhat confusing. It might help to add a note to the effect that this example is meant to illustrate that the descriptor doesn't have to exactly match the C description, as long as it describes the same memory layout. -- Greg
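One way to probe this empirically is with ctypes, which follows the platform's C ABI. The sketch below is not a statement about the C standard; it only shows what a given platform does. The class names Nested and Flat are made up for the illustration.

```python
import ctypes


class Nested(ctypes.Structure):
    # Mirrors: struct { int ival; double data[16][4]; }
    _fields_ = [("ival", ctypes.c_int), ("data", (ctypes.c_double * 4) * 16)]


class Flat(ctypes.Structure):
    # Mirrors: struct { int ival; double data[16*4]; }
    _fields_ = [("ival", ctypes.c_int), ("data", ctypes.c_double * (16 * 4))]


nested = Nested()
for i in range(16):
    for j in range(4):
        nested.data[i][j] = i * 4 + j

# Reinterpret the raw bytes of the 2-d struct as the flat struct.
flat = Flat.from_buffer_copy(bytes(nested))

same_size = ctypes.sizeof(Nested) == ctypes.sizeof(Flat)
same_offset = Nested.data.offset == Flat.data.offset
same_values = list(flat.data) == [float(k) for k in range(64)]
assert same_size and same_offset and same_values
```

If rows were padded, the sizes would differ and the flat values would not come out as 0.0 through 63.0 in order.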

Greg Ewing wrote:
This sounds like a good idea to me. I doubt Thomas will be the last person to miss the distinction between how the C compiler thinks the memory is arranged (just a simple list of values) and how the application interprets that memory (a 2-dimensional array). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org

participants (7)
- Greg Ewing
- Guido van Rossum
- Lisandro Dalcin
- Nick Coghlan
- Robert Kern
- Thomas Heller
- Travis Oliphant