HOG features are widely used for object detection. HOG decomposes an image into small square cells, computes a histogram of oriented gradients in each cell, normalizes the result using a block-wise pattern, and returns a descriptor for each cell.
Stacking the descriptors of the cells in a square image region yields an image window descriptor that can be used for object detection, for example by means of an SVM.
This tutorial shows how to use the VLFeat function vl_hog to compute HOG features of various kinds and manipulate them.
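As a rough illustration of the window descriptor idea, the sketch below crops a square block of cells from a HOG array, stacks it into a vector, and scores it with a linear classifier. Everything here is a placeholder: `hog` is random data standing in for the output of vl_hog, and the weights `w` and bias `b` are hypothetical values rather than a trained SVM.

```matlab
% Placeholder HOG array: 16 x 16 cells, 31 components per cell.
hog = rand(16, 16, 31, 'single') ;

% Window of 8 x 8 cells whose top-left cell is at row 3, column 5.
winSize = 8 ;
r = 3 ; c = 5 ;
window = hog(r:r+winSize-1, c:c+winSize-1, :) ;

% Stack the cell descriptors into a single column vector.
x = window(:) ;

% Hypothetical linear model (random weights, for illustration only).
w = randn(numel(x), 1, 'single') ;
b = single(0) ;
score = w' * x + b ;
```

A detector would evaluate such a score at every window location, typically over several image scales, and keep the highest-scoring windows.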
Basic HOG computation
We start by considering an example input image:
HOG is computed by calling the vl_hog function:
cellSize = 8 ;
hog = vl_hog(im, cellSize, 'verbose') ;
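The call above assumes that `im` already holds an image in SINGLE storage class, which is what vl_hog expects. A minimal sketch of preparing such an input follows; it assumes VLFeat is on the MATLAB path and uses the stock image returned by `vl_impattern('roofs1')` purely as a stand-in for the example image, so any grayscale or colour image could be substituted.

```matlab
% Stand-in input: a stock image shipped with VLFeat (an assumption;
% replace with your own image, e.g. loaded with imread).
im = vl_impattern('roofs1') ;

% vl_hog expects the image in SINGLE storage class.
im = single(im) ;

cellSize = 8 ;
hog = vl_hog(im, cellSize) ;

% One descriptor per cell; the third dimension indexes the components.
disp(size(hog)) ;
```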
The same function can also be used to generate a pictorial
rendition of the features, although this unavoidably destroys some of
the information contained in the feature itself. To this end, use the
render command:
imhog = vl_hog('render', hog, 'verbose') ;
clf ; imagesc(imhog) ; colormap gray ;
This should produce the following image:
HOG is an array of cells, with the third dimension spanning feature components:
> size(hog)
ans =
16 16 31
In this case the feature has 31 dimensions. HOG exists in many variants. VLFeat supports two: the UoCTTI variant (used by default) and the original Dalal-Triggs variant (with 2×2 square HOG blocks for normalization). The main difference is that the UoCTTI variant computes both directed and undirected gradients as well as a four-dimensional texture-energy feature, but projects the result down to 31 dimensions (18 directed, 9 undirected, and 4 texture components with the default 9 orientations). Dalal-Triggs works instead with undirected gradients only and does not do any compression, for a total of 36 dimensions (9 orientations for each of the 4 block normalizations). The Dalal-Triggs variant can be computed as
% Dalal-Triggs variant
cellSize = 8 ;
hog = vl_hog(im, cellSize, 'verbose', 'variant', 'dalaltriggs') ;
imhog = vl_hog('render', hog, 'verbose', 'variant', 'dalaltriggs') ;
The result is visually very similar:
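To make the difference in dimensionality concrete, the following sketch computes both variants on the same image and compares the size of the third dimension. It assumes VLFeat is on the MATLAB path; the stock image from `vl_impattern('roofs1')` is only a stand-in for the example image.

```matlab
% Stand-in input image in SINGLE storage class (an assumption).
im = single(vl_impattern('roofs1')) ;
cellSize = 8 ;

% Default (UoCTTI) variant and Dalal-Triggs variant.
hogUoctti = vl_hog(im, cellSize) ;
hogDalalTriggs = vl_hog(im, cellSize, 'variant', 'dalaltriggs') ;

% Both arrays cover the same grid of cells but differ in the
% number of components per cell.
fprintf('UoCTTI:       %d components per cell\n', size(hogUoctti, 3)) ;
fprintf('Dalal-Triggs: %d components per cell\n', size(hogDalalTriggs, 3)) ;
```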
Flipping HOG from left to right
Often it is necessary to flip HOG features from left to right (for example in order to model an axis-symmetric object). The flipped features can be obtained analytically from the features themselves by permuting the histogram dimensions appropriately. The permutation is obtained as follows:
% Get permutation to flip a HOG cell from left to right
perm = vl_hog('permutation') ;
Then flipping the HOG array and computing HOG on the flipped image produce identical results (provided that the image contains an exact number of cells):
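hog = vl_hog(im, cellSize) ;
hogFromFlippedImage = vl_hog(im(:,end:-1:1,:), cellSize) ;  % HOG of the mirrored image
flippedHog = hog(:,end:-1:1,perm) ;  % mirror the cell columns, permute the components
imHog = vl_hog('render', hog) ;
imHogFromFlippedImage = vl_hog('render', hogFromFlippedImage) ;
imFlippedHog = vl_hog('render', flippedHog) ;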
This is shown in the figure:
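The equivalence can also be checked numerically. The sketch below is one possible way to do so, assuming VLFeat is on the MATLAB path: it first crops a stand-in image (the stock `vl_impattern('roofs1')`, an assumption) so that its width and height are exact multiples of the cell size, then compares the analytically flipped HOG with the HOG of the mirrored image.

```matlab
im = single(vl_impattern('roofs1')) ;
cellSize = 8 ;

% Crop so that the image contains an exact number of cells.
h = floor(size(im, 1) / cellSize) * cellSize ;
w = floor(size(im, 2) / cellSize) * cellSize ;
im = im(1:h, 1:w, :) ;

% HOG of the image and of its left-right mirror.
hog = vl_hog(im, cellSize) ;
hogFromFlippedImage = vl_hog(im(:, end:-1:1, :), cellSize) ;

% Flip the HOG array analytically: reverse the order of the cell
% columns and permute the components of each cell.
perm = vl_hog('permutation') ;
flippedHog = hog(:, end:-1:1, perm) ;

% Largest absolute discrepancy between the two results.
maxDiff = max(abs(flippedHog(:) - hogFromFlippedImage(:))) ;
fprintf('max abs difference: %g\n', maxDiff) ;
```

Without the cropping step the cell grid of the mirrored image may not line up with that of the original, and the two results can then differ.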
Other HOG parameters
vl_hog supports other parameters as well. For example,
one can specify the number of orientations in the histograms by the
numOrientations option:
% Specify the number of orientations o (the default is 9)
hog = vl_hog(im, cellSize, 'verbose', 'numOrientations', o) ;
imhog = vl_hog('render', hog, 'verbose', 'numOrientations', o) ;
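The number of orientations also determines the size of each cell descriptor. The sketch below loops over a few values of `o` (chosen arbitrarily for illustration) and prints the resulting dimensionality for the default UoCTTI variant, which uses 3*o + 4 components per cell. It assumes VLFeat is on the MATLAB path and uses a stock image as a stand-in input.

```matlab
% Stand-in input image in SINGLE storage class (an assumption).
im = single(vl_impattern('roofs1')) ;
cellSize = 8 ;

orientations = [3 9 21] ;          % illustrative values only
dims = zeros(size(orientations)) ;

for k = 1:numel(orientations)
  o = orientations(k) ;
  hog = vl_hog(im, cellSize, 'numOrientations', o) ;
  dims(k) = size(hog, 3) ;
  % UoCTTI: 2*o directed + o undirected + 4 texture components.
  fprintf('numOrientations = %2d -> %d components per cell\n', o, dims(k)) ;
end
```

Note that the same `numOrientations` value must be passed to the render command, as in the snippet above, so that the components are interpreted correctly.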
Changing the number of orientations changes the features quite significantly:
Another useful option is BilinearOrientations, which switches on the bilinear orientation assignment of the gradient (this is not used in certain implementations, such as UoCTTI).
% Use bilinear orientation assignment (here with 4 orientations)
hog = vl_hog(im, cellSize, 'numOrientations', 4, 'bilinearOrientations') ;
imhog = vl_hog('render', hog, 'numOrientations', 4) ;
resulting in