Documentation - C API
sift.h File Reference

SIFT (Scale Invariant Feature Transform (SIFT)) More...

#include <stdio.h>
#include "generic.h"

Data Structures

struct  VlSiftKeypoint
 SIFT filter keypoint. More...
struct  VlSiftFilt
 SIFT filter. More...

Typedefs

typedef float vl_sift_pix
 SIFT filter pixel type.

Functions

Create and destroy
VlSiftFiltvl_sift_new (int width, int height, int noctaves, int nlevels, int o_min)
 Create a new SIFT filter.
void vl_sift_delete (VlSiftFilt *f)
 Delete SIFT filter.
Process data
int vl_sift_process_first_octave (VlSiftFilt *f, vl_sift_pix const *im)
 Start processing a new image.
int vl_sift_process_next_octave (VlSiftFilt *f)
 Process next octave.
void vl_sift_detect (VlSiftFilt *f)
 Detect keypoints.
int vl_sift_calc_keypoint_orientations (VlSiftFilt *f, double angles[4], VlSiftKeypoint const *k)
 Calculate the keypoint orientation(s)
void vl_sift_calc_keypoint_descriptor (VlSiftFilt *f, vl_sift_pix *descr, VlSiftKeypoint const *k, double angle)
 Compute the descriptor of a keypoint.
void vl_sift_calc_raw_descriptor (VlSiftFilt const *f, vl_sift_pix const *image, vl_sift_pix *descr, int widht, int height, double x, double y, double s, double angle0)
 Run the SIFT descriptor on raw data.
void vl_sift_keypoint_init (VlSiftFilt const *f, VlSiftKeypoint *k, double x, double y, double sigma)
 Initialize a keypoint from its position and scale.
Retrieve data and parameters
int vl_sift_get_octave_index (VlSiftFilt const *f)
 Get current octave index.
int vl_sift_get_noctaves (VlSiftFilt const *f)
 Get number of octaves.
int vl_sift_get_octave_first (VlSiftFilt const *f)
 Get first octave.
int vl_sift_get_octave_width (VlSiftFilt const *f)
 Get current octave width.
int vl_sift_get_octave_height (VlSiftFilt const *f)
 Get current octave height.
int vl_sift_get_nlevels (VlSiftFilt const *f)
 Get number of levels per octave.
int vl_sift_get_nkeypoints (VlSiftFilt const *f)
 Get number of keypoints.
double vl_sift_get_peak_thresh (VlSiftFilt const *f)
 Get peaks treashold.
double vl_sift_get_edge_thresh (VlSiftFilt const *f)
 Get edges threshold.
double vl_sift_get_norm_thresh (VlSiftFilt const *f)
 Get norm threshold.
double vl_sift_get_magnif (VlSiftFilt const *f)
 Get the magnification factor.
double vl_sift_get_window_size (VlSiftFilt const *f)
 Get the Gaussian window size.
vl_sift_pixvl_sift_get_octave (VlSiftFilt const *f, int s)
 Get current octave data.
VlSiftKeypoint const * vl_sift_get_keypoints (VlSiftFilt const *f)
 Get keypoints.
Set parameters
void vl_sift_set_peak_thresh (VlSiftFilt *f, double t)
 Set peaks threshold.
void vl_sift_set_edge_thresh (VlSiftFilt *f, double t)
 Set edges threshold.
void vl_sift_set_norm_thresh (VlSiftFilt *f, double t)
 Set norm threshold.
void vl_sift_set_magnif (VlSiftFilt *f, double m)
 Set the magnification factor.
void vl_sift_set_window_size (VlSiftFilt *f, double x)
 Set the Gaussian window size.

Detailed Description

Author:
Andrea Vedaldi

Function Documentation

void vl_sift_calc_keypoint_descriptor ( VlSiftFilt f,
vl_sift_pix descr,
VlSiftKeypoint const *  k,
double  angle0 
)
Parameters:
fSIFT filter.
descrSIFT descriptor (output)
kkeypoint.
angle0keypoint direction.

The function computes the SIFT descriptor of the keypoint k of orientation angle0. The function fills the buffer descr which must be large enough to hold the descriptor.

The function assumes that the keypoint is on the current octave. If not, it does not do anything.

int vl_sift_calc_keypoint_orientations ( VlSiftFilt f,
double  angles[4],
VlSiftKeypoint const *  k 
)
Parameters:
fSIFT filter.
anglesorientations (output).
kkeypoint.

The function computes the orientation(s) of the keypoint k. The function returns the number of orientations found (up to four). The orientations themselves are written to the vector angles.

Remarks:
The function requires the keypoint octave k->o to be equal to the filter current octave vl_sift_get_octave. If this is not the case, the function returns zero orientations.
The function requires the keypoint scale level k->s to be in the range s_min+1 and s_max-2 (where usually s_min=0 and s_max=S+2). If this is not the case, the function returns zero orientations.
Returns:
number of orientations found.
void vl_sift_calc_raw_descriptor ( VlSiftFilt const *  f,
vl_sift_pix const *  grad,
vl_sift_pix descr,
int  width,
int  height,
double  x,
double  y,
double  sigma,
double  angle0 
)
Parameters:
fSIFT filter.
gradimage gradients.
descrSIFT descriptor (output).
widthimage width.
heightimage height.
xkeypoint x coordinate.
ykeypoint y coordinate.
sigmakeypoint scale.
angle0keypoint orientation.

The function runs the SIFT descriptor on raw data. Here image is a 2 x width x height array (by convention, the memory layout is a s such the first index is the fastest varying one). The first width x height layer of the array contains the gradient magnitude and the second the gradient angle (in radians, between 0 and \( 2\pi \)). x, y and sigma give the keypoint center and scale respectively.

In order to be equivalent to a standard SIFT descriptor the image gradient must be computed at a smoothing level equal to the scale of the keypoint. In practice, the actual SIFT algorithm makes the following additional approximation, which influence the result:

  • Scale is discretized in S levels.
  • The image is downsampled once for each octave (if you do this, the parameters x, y and sigma must be scaled too).
void vl_sift_delete ( VlSiftFilt f)
Parameters:
fSIFT filter to delete.

The function frees the resources allocated by vl_sift_new().

void vl_sift_detect ( VlSiftFilt f)

The function detect keypoints in the current octave filling the internal keypoint buffer. Keypoints can be retrieved by vl_sift_get_keypoints().

Parameters:
fSIFT filter.

Index GSS

Index matrix A

double vl_sift_get_edge_thresh ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
threshold.
VlSiftKeypoint const * vl_sift_get_keypoints ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
pointer to the keypoints list.
double vl_sift_get_magnif ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
magnification factor.
int vl_sift_get_nkeypoints ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
number of keypoints.
int vl_sift_get_nlevels ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
number of leves per octave.
int vl_sift_get_noctaves ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
number of octaves.
double vl_sift_get_norm_thresh ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
threshold.
vl_sift_pix * vl_sift_get_octave ( VlSiftFilt const *  f,
int  s 
)
inline
Parameters:
fSIFT filter.
slevel index.

The level index s ranges in the interval s_min = -1 and s_max = S + 2, where S is the number of levels per octave.

Returns:
pointer to the octave data for level s.
int vl_sift_get_octave_first ( VlSiftFilt const *  f)
inline

Parameters:
fSIFT filter.
Returns:
index of the first octave.
int vl_sift_get_octave_height ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
current octave height.
int vl_sift_get_octave_index ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
index of the current octave.
int vl_sift_get_octave_width ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
current octave width.
double vl_sift_get_peak_thresh ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
threshold ;
double vl_sift_get_window_size ( VlSiftFilt const *  f)
inline
Parameters:
fSIFT filter.
Returns:
standard deviation of the Gaussian window (in spatial bin units).
void vl_sift_keypoint_init ( VlSiftFilt const *  f,
VlSiftKeypoint k,
double  x,
double  y,
double  sigma 
)
Parameters:
fSIFT filter.
kSIFT keypoint (output).
xx coordinate of the keypoint center.
yy coordinate of the keypoint center.
sigmakeypoint scale.

The function initializes a keypoint structure k from the location x and y and the scale sigma of the keypoint. The keypoint structure maps the keypoint to an octave and scale level of the discretized Gaussian scale space, which is required for instance to compute the keypoint SIFT descriptor.

Algorithm

The formula linking the keypoint scale sigma to the octave and scale indexes is

\[ \sigma(o,s) = \sigma_0 2^{o+s/S} \]

In addition to the scale index s (which can be fractional due to scale interpolation) a keypoint has an integer scale index is too (which is the index of the scale level where it was detected in the DoG scale space). We have the constraints (Detector see also the "SIFT detector"):

  • o is integer in the range \( [o_\mathrm{min}, o_{\mathrm{min}}+O-1] \).
  • is is integer in the range \( [s_\mathrm{min}+1, s_\mathrm{max}-2] \). This depends on how the scale is determined during detection, and must be so here because gradients are computed only for this range of scale levels and are required for the calculation of the SIFT descriptor.
  • \( |s - is| < 0.5 \) for detected keypoints in most cases due to the interpolation technique used during detection. However this is not necessary.

Thus octave o represents scales \( \{ \sigma(o, s) : s \in [s_\mathrm{min}+1-.5, s_\mathrm{max}-2+.5] \} \). Note that some scales may be represented more than once. For each scale, we select the largest possible octave that contains it, i.e.

\[ o(\sigma) = \max \{ o \in \mathbb{Z} : \sigma_0 2^{\frac{s_\mathrm{min}+1-.5}{S}} \leq \sigma \} = \mathrm{floor}\,\left[ \log_2(\sigma / \sigma_0) - \frac{s_\mathrm{min}+1-.5}{S}\right] \]

and then

\[ s(\sigma) = S \left[\log_2(\sigma / \sigma_0) - o(\sigma)\right], \quad is(\sigma) = \mathrm{round}\,(s(\sigma)) \]

In practice, both \( o(\sigma) \) and \( is(\sigma) \) are clamped to their feasible range as determined by the SIFT filter parameters.

VlSiftFilt* vl_sift_new ( int  width,
int  height,
int  noctaves,
int  nlevels,
int  o_min 
)
Parameters:
widthimage width.
heightimage height.
noctavesnumber of octaves.
nlevelsnumber of levels per octave.
o_minfirst octave index.

The function allocates and returns a new SIFT filter for the specified image and scale space geometry.

Setting O to a negative value sets the number of octaves to the maximum possible value depending on the size of the image.

Returns:
the new SIFT filter.
See also:
vl_sift_delete().
int vl_sift_process_first_octave ( VlSiftFilt f,
vl_sift_pix const *  im 
)
Parameters:
fSIFT filter.
imimage data.

The function starts processing a new image by computing its Gaussian scale space at the lower octave. It also empties the internal keypoint buffer.

Returns:
error code. The function returns VL_ERR_EOF if there are no more octaves to process.
See also:
vl_sift_process_next_octave().
int vl_sift_process_next_octave ( VlSiftFilt f)
Parameters:
fSIFT filter.

The function computes the next octave of the Gaussian scale space. Notice that this clears the record of any feature detected in the previous octave.

Returns:
error code. The function returns the error VL_ERR_EOF when there are no more octaves to process.
See also:
vl_sift_process_first_octave().
void vl_sift_set_edge_thresh ( VlSiftFilt f,
double  t 
)
inline
Parameters:
fSIFT filter.
tthreshold.
void vl_sift_set_magnif ( VlSiftFilt f,
double  m 
)
inline
Parameters:
fSIFT filter.
mmagnification factor.
void vl_sift_set_norm_thresh ( VlSiftFilt f,
double  t 
)
inline
Parameters:
fSIFT filter.
tthreshold.
void vl_sift_set_peak_thresh ( VlSiftFilt f,
double  t 
)
inline
Parameters:
fSIFT filter.
tthreshold.
void vl_sift_set_window_size ( VlSiftFilt f,
double  x 
)
inline
Parameters:
fSIFT filter.
xGaussian window size (in units of spatial bin).

This is the parameter \( \hat \sigma_{\text{win}} \) of the standard SIFT descriptor Standard SIFT descriptor.