HOG

Wed Aug 21 14:44:41 EDT 2013

Your ideas seem totally valid to me (if I understand correctly), but how about turning around the order of interpolation:

1. 2-D interpolation (x, y direction)
2. Interpolation in the 3rd dimension, which could then easily be implemented with array slicing: ``for i, j in pixels_per_cell: magnitude[i::pixels_per_cell, j::pixels_per_cell] and orientation[i::pixels_per_cell, j::pixels_per_cell]``.

This should be basically the same, but you save some memory as you do not need the (sx, sy, nbins) intermediate array.
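
To make the slicing idea a bit more concrete, here is a rough sketch of how it could look. Everything in it is an assumption for illustration and not the existing scikit-image code: the names ``cell_histograms``, ``_axis_split``, ``ppc`` and ``nbins`` are made up, the orientation is taken to be unsigned and in degrees in [0, 180), the spatial weights are measured from the cell centres, and votes that would land outside the image are simply dropped.

```python
import numpy as np


def _axis_split(d):
    """Hypothetical helper: (cell shift, weight) pairs along one axis.

    d is the signed distance of a pixel from its cell centre, in cell units.
    """
    return ((0, 1.0 - abs(d)), (int(np.sign(d)), abs(d)))


def cell_histograms(magnitude, orientation, ppc=8, nbins=9):
    """Per-cell orientation histograms without an (sx, sy, nbins) array."""
    n_r, n_c = magnitude.shape[0] // ppc, magnitude.shape[1] // ppc
    mag = np.asarray(magnitude, dtype=float)[:n_r * ppc, :n_c * ppc]
    ori = np.asarray(orientation, dtype=float)[:n_r * ppc, :n_c * ppc]
    hist = np.zeros((n_r, n_c, nbins))
    rows, cols = np.indices((n_r, n_c))
    # Distance of each in-cell offset from the cell centre, in cell units.
    offsets = (np.arange(ppc) + 0.5) / ppc - 0.5

    for i in range(ppc):
        for j in range(ppc):
            # One pixel per cell: both slices have shape (n_r, n_c).
            m = mag[i::ppc, j::ppc]
            o = ori[i::ppc, j::ppc]
            # Orientation: split each vote between the two nearest bin centres.
            pos = o * nbins / 180.0 - 0.5
            lo = np.floor(pos).astype(int)
            w_hi = pos - lo
            # Space: the weights depend only on (i, j), so they are scalars.
            for dr, wr in _axis_split(offsets[i]):
                for dc, wc in _axis_split(offsets[j]):
                    r, c = rows + dr, cols + dc
                    ok = (r >= 0) & (r < n_r) & (c >= 0) & (c < n_c)
                    w = m[ok] * wr * wc
                    np.add.at(hist, (r[ok], c[ok], lo[ok] % nbins),
                              w * (1 - w_hi[ok]))
                    np.add.at(hist, (r[ok], c[ok], (lo[ok] + 1) % nbins),
                              w * w_hi[ok])
    return hist
```

All temporaries inside the loop have the shape of the cell grid, so the memory use stays well below that of a full (sx, sy, nbins) array, at the cost of a Python loop over the pixels_per_cell * pixels_per_cell offsets.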

It would be great if you could open a PR with your code, then we can discuss in there :-)

Regards, Johannes

On 21.08.2013 at 20:04, Jean K <jean.kossaifi at gmail.com> wrote:

> Hi,
> 
> Thank you for your answers :)
> 
> @Johannes: For the tri-linear interpolation, you're absolutely right, and I spent a lot of time thinking about it.
> 
> Eventually I thought of something:
> Let sx, sy be the size of the image, nbins the number of desired bins.
> First, we interpolate between the bins, from the original (sx, sy) image to a (sx, sy, nbins) array.
> Then we can notice that, inside each cell, we have pixels_per_cell_x * pixels_per_cell_y histograms, whose position in the cell doesn't matter (because we are going to sum them up to have only one histogram per cell).
> We can thus virtually divide each cell into 4, each part being interpolated into the 4 diagonally adjacent sub-cells.
> As a result, each of the 4 sub-cells will be interpolated once in the same cell, and once in the 3 adjacent cells (which is exactly what interpolation is).
> The only thing to do is to multiply by the right coefficient.
> Here's an image to illustrate: We sum 4 times in the 4 diagonal directions. The coefficients for the sum can be represented by a single matrix which is rotated for each direction.
> 
> 
> Finally you just sum the histograms in each cell to obtain the (n_cells_x, n_cells_y, nbins) desired orientation_histogram (which you can further normalise block-wise).
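
For reference, the first and last steps described above (interpolating between the bins into an (sx, sy, nbins) array, then summing inside each cell) might look roughly like the following. This is only a sketch under assumptions: the names ``orientation_votes`` and ``sum_per_cell`` are invented here, the orientation is taken to be unsigned and in degrees in [0, 180), and the spatial step with the rotated coefficient matrix is deliberately left out.

```python
import numpy as np


def orientation_votes(magnitude, orientation, nbins=9):
    """Spread each pixel's magnitude over its two nearest orientation bins."""
    magnitude = np.asarray(magnitude, dtype=float)
    pos = np.asarray(orientation, dtype=float) * nbins / 180.0 - 0.5
    lo = np.floor(pos).astype(int)
    w_hi = pos - lo
    # This is the (sx, sy, nbins) intermediate array.
    votes = np.zeros(magnitude.shape + (nbins,))
    r, c = np.indices(magnitude.shape)
    votes[r, c, lo % nbins] += magnitude * (1 - w_hi)
    votes[r, c, (lo + 1) % nbins] += magnitude * w_hi
    return votes


def sum_per_cell(votes, ppc=8):
    """Sum the per-pixel histograms inside each ppc x ppc cell."""
    n_r, n_c = votes.shape[0] // ppc, votes.shape[1] // ppc
    v = votes[:n_r * ppc, :n_c * ppc]
    return v.reshape(n_r, ppc, n_c, ppc, -1).sum(axis=(1, 3))
```

For a 160*160 image with 9 bins the intermediate ``votes`` array holds 160 * 160 * 9 floats, which is the memory cost that reordering the interpolation would avoid.
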
> 
> 
> So I implemented a version using this trick, based on the original code, and the result seems to be fast for a 160*160 image.
> However, as I said, I'm not perfectly sure of the result.
> 
> Also, I separated the gradient computation from the binning so that the function can also be used for HOF.
> 
> Maybe I could do a pull request so you can have a look at the code?
> 
> Cheers,
> 
> Jean
> 
> 
> On Wednesday, 21 August 2013 08:06:56 UTC+1, Johannes Schönberger wrote:
> Hi Jean, 
> 
> First of all, I am not an expert regarding HoG… :-) 
> 
> > 1) the way of computing the gradients (if I'm not mistaken, you use a [-1, 1] filter whereas they use a centered one, [-1, 0, 1]).
> 
> Not sure why the original author of the implementation used np.diff rather than central differences or even Sobel / Scharr and the like (apart from performance). Those should return much better approximations of the gradient.
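
A small comparison of the two filters, as a sketch only: the function names are made up for this example, only the x direction is shown, border columns are simply left at zero, and the division by 2 is added here so that the [-1, 0, 1] response is on the same scale as a derivative.

```python
import numpy as np


def forward_diff_x(image):
    """[-1, 1] filter, which is what np.diff computes; last column stays zero."""
    image = np.asarray(image, dtype=float)
    gx = np.zeros_like(image)
    gx[:, :-1] = np.diff(image, axis=1)
    return gx


def central_diff_x(image):
    """[-1, 0, 1] filter; first and last columns stay zero."""
    image = np.asarray(image, dtype=float)
    gx = np.zeros_like(image)
    gx[:, 1:-1] = image[:, 2:] - image[:, :-2]
    return gx


# Test image: intensity x**2 along each row, so the true x-derivative is 2*x.
x = np.arange(8, dtype=float)
image = np.tile(x ** 2, (4, 1))
true_gx = 2 * x

# Compare on interior columns only, where both filters are defined.
forward_err = np.abs(forward_diff_x(image)[0, 1:-1] - true_gx[1:-1]).max()
central_err = np.abs(central_diff_x(image)[0, 1:-1] / 2 - true_gx[1:-1]).max()
```

On this quadratic ramp the forward difference gives 2*x + 1 at column x, so it is off by a constant, while the halved central difference reproduces 2*x on the interior columns.
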
> 
> > 2) They use tri-linear interpolation, whereas here you seem to use hard binning.
> 
> The tri-linear interpolation seems to be the original approach, but I do not know of a simple way to implement it in pure Python in a fast way… I guess scipy.ndimage.map_coordinates might be very useful here. 
> 
> I think both of these fixes would be much appreciated!
> 
> > Also, I tried to write another version, trying to stick as closely as possible to Dalal & Triggs' version, although I don't really know how to assess the results it produces. Would that be of interest?
> 
> Yes, definitely. 
> 
> Johannes
> 