Hi Juan, Thanks for getting back to my lengthy post! I assumed that it was convention. As you said, we originally followed this scheme when we moved from Matlab and have only recently realised that it makes complete sense for Matlab to have that format, but not for us. It's also true that packages such as PIL return images in that format (presumably to maintain consistency with file formats, as you mentioned). I agree that it makes more sense from a notational perspective, however, channels first allows you to do really nice things like: for channel in image: channel[i, j] which I like a lot. My follow up was merely going to be more of a comment that a question to be honest. I just wanted to make sure I wasn't missing a trick somewhere whereby all my woes could be fixed via some magic I had failed to see. In our package, we implement a lot of image alignment methodologies and thus computations like the gradient etc are more important to us than treating channels as related features. We are very interested in algorithms like HOG, DSIFT, LBP that treat all the channels separately, so I'm seriously considering flipping our image type to get the performance boost. It's just that I also dislike the idea of losing our current simple compatibility with scikit-image! I will think hard on it. Thanks again for the reply! On Wednesday, 16 July 2014 23:17:02 UTC+1, Juan Nunez-Iglesias wrote:
Hey Patrick,
Good question! And one that I've thought about repeatedly. In my own work I usually "unwrap" the channels as the first step of my analysis, for the reasons you've outlined.
All these conventions were in place when I got into image analysis so I can't comment on the history, but I can offer a few hypotheses:
0. Image storage formats group the channels together by default (e.g. TIFF). 1. Matlab uses this convention, where it actually makes computational sense since they use column-major array storage. 2. It makes sense from a notational perspective: image[i, j] -> color. 3. It actually is useful for certain functions, such as SLIC, in which all the channels are considered together.
I'm guessing you are also interested in a follow-up question, namely, is this ever likely to change, to which I would say, probably not, as the convention is too embedded in the collective consciousness. In applications where each channel requires separate processing, storing the channels as separate grayscale files might make sense as a workaround.
Juan.
On Wed, Jul 16, 2014 at 11:18 AM, Patrick Snape <patric...@gmail.com <javascript:>> wrote:
Hi,
I'm one of the maintainers of a project that has some overlap with scikit-image. Recently, I've been working on optimising some of our slower code paths, and I came across an interesting result. In our package, we order our image data in the same way as in scikit-image:
image.shape = [height, width, channels]
Given that cPython runs on top of C, this means that the normal memory layout is generally row-major. Therefore, in the C-contiguous layout of the above memory pattern, it means that the rightmost channel indexes fastest. For example, when iterating over a vectorised image (stolen shamelessly from Wikipedia):
int A[2, 3, 4] = np.array([[[1,2,3,4], [5,6,7,8], [9,10,11,12]], [[13,14,15,16}, [17,18,19,20], [21,22,23,24]])
We walk that memory:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Practically, this means that we can most efficiently walk along the channels.
However! In my experience, especially with 2D images, you usually store most of your relationships across the image and leave the channels as independent of one another. A simple example of this, and the one I was considering, is calculating the gradient of a multi-channel image (so a 3 channel RGB image will return a 6-channel image that represents the x and y gradient over each channel R, G and B).
Now, by calculating the gradient on a [height, width, channels] image, I have to jump a lot further in memory to find my neighbours, because I'm also skipping over channels. This is actually very inefficient, and in my testing (implementing the gradient method in cython) this actually causes a huge bottleneck. For example, the more channels you have, the closer in performance my code comes to Numpy's pure Python gradient implementation. This is because both pieces of code are being hurt by having to work 'against' the memory layout.
A practical example of this can be seen here: http://nbviewer.ipython.org/gist/patricksnape/7b31c14c7103c2995956
As mentioned, I have some code I've been working on that implements the gradient directly in Cython and uses OpenMP to parallelize the gradient over the channels, which is *significantly* faster for the channels first case.
So, my question is: *why did you choose to place the channels on the last axis rather than the first?*
-- You received this message because you are subscribed to the Google Groups "scikit-image" group. To unsubscribe from this group and stop receiving emails from it, send an email to scikit-image...@googlegroups.com <javascript:>. For more options, visit https://groups.google.com/d/optout.