Documentation - C API
aib.c File Reference

AIB - Definition. More...

#include "aib.h"
#include <stdio.h>
#include <stdlib.h>
#include <float.h>
#include <math.h>

Functions

void vl_aib_normalize_P (double *P, vl_uint nelem)
 Normalizes an array of probabilities to sum to 1.
vl_uintvl_aib_new_nodelist (vl_uint nentries)
 Allocates and creates a list of nodes.
double * vl_aib_new_Px (double *Pcx, vl_uint nvalues, vl_uint nlabels)
 Allocates and creates the marginal distribution Px.
double * vl_aib_new_Pc (double *Pcx, vl_uint nvalues, vl_uint nlabels)
 Allocates and creates the marginal distribution Pc.
void vl_aib_min_beta (VlAIB *aib, vl_uint *besti, vl_uint *bestj, double *minbeta)
 Find the two nodes which have minimum beta.
void vl_aib_merge_nodes (VlAIB *aib, vl_uint i, vl_uint j, vl_uint new)
 Merges two nodes i,j in the internal datastructure.
void vl_aib_update_beta (VlAIB *aib)
 Updates aib->beta and aib->bidx according to aib->which.
void vl_aib_calculate_information (VlAIB *aib, double *I, double *H)
 Calculates the current information and entropy.
VlAIBvl_aib_new (double *Pcx, vl_uint nvalues, vl_uint nlabels)
 Allocates and initializes the internal data structure.
void vl_aib_delete (VlAIB *aib)
 Deletes AIB data structure.
void vl_aib_process (VlAIB *aib)
 Runs AIB on Pcx.

Detailed Description

Author:
Brian Fulkerson
Andrea Vedaldi

Function Documentation

void vl_aib_calculate_information ( VlAIB aib,
double *  I,
double *  H 
)
Parameters:
aibA pointer to the internal data structure
IThe current mutual information (out).
HThe current entropy (out).

Calculates the current mutual information and entropy of Pcx and sets I and H to these new values.

void vl_aib_delete ( VlAIB aib)
Parameters:
aibdata structure to delete.
void vl_aib_merge_nodes ( VlAIB aib,
vl_uint  i,
vl_uint  j,
vl_uint  new 
)
Parameters:
aibA pointer to the internal data structure
iThe index of one member of the pair to merge
jThe index of the other member of the pair to merge
newThe index of the new node which corresponds to the union of (i, j).

Nodes are merged by replacing the entry i with the union of ij, moving the node stored in last position (called lastnode) back to jth position and the entry at the end.

After the nodes have been merged, it updates which nodes should be considered on the next iteration based on which beta values could potentially change. The merged node will always be part of this list.

void vl_aib_min_beta ( VlAIB aib,
vl_uint besti,
vl_uint bestj,
double *  minbeta 
)
Parameters:
aibA pointer to the internal data structure
bestiThe index of one member of the pair which has mininum beta
bestjThe index of the other member of the pair which minimizes beta
minbetaThe minimum beta value corresponding to (i, j)

Searches aib->beta to find the minimum value and fills minbeta and besti and bestj with this information.

VlAIB* vl_aib_new ( double *  Pcx,
vl_uint  nvalues,
vl_uint  nlabels 
)
Parameters:
PcxA pointer to a 2D array of probabilities
nvaluesThe number of rows in the array
nlabelsThe number of columns in the array

Creates a new VlAIB struct containing pointers to all the data that will be used during the AIB process.

Allocates memory for the following:

  • Px (nvalues*sizeof(double))
  • Pc (nlabels*sizeof(double))
  • nodelist (nvalues*sizeof(vl_uint))
  • which (nvalues*sizeof(vl_uint))
  • beta (nvalues*sizeof(double))
  • bidx (nvalues*sizeof(vl_uint))
  • parents ((2*nvalues-1)*sizeof(vl_uint))
  • costs (nvalues*sizeof(double))

Since it simply copies to pointer to Pcx, the total additional memory requirement is:

(3*nvalues+nlabels)*sizeof(double) + 4*nvalues*sizeof(vl_uint)

Returns:
An allocated and initialized VlAIB pointer
vl_uint* vl_aib_new_nodelist ( vl_uint  nentries)
Parameters:
nentriesThe size of the list which will be created
Returns:
an array containing elements 0...nentries
double* vl_aib_new_Pc ( double *  Pcx,
vl_uint  nvalues,
vl_uint  nlabels 
)
Parameters:
PcxA two-dimensional array of probabilities
nvaluesThe number of rows in Pcx
nlabelsThe number of columns in Pcx
Returns:
an array of size nlabels which contains the marginal distribution over the columns
double* vl_aib_new_Px ( double *  Pcx,
vl_uint  nvalues,
vl_uint  nlabels 
)
Parameters:
PcxA two-dimensional array of probabilities
nvaluesThe number of rows in Pcx
nlabelsThe number of columns in Pcx
Returns:
an array of size nvalues which contains the marginal distribution over the rows.
void vl_aib_normalize_P ( double *  P,
vl_uint  nelem 
)
Parameters:
PThe array of probabilities
nelemThe number of elements in the array
Returns:
Modifies P to contain values which sum to 1
void vl_aib_process ( VlAIB aib)
Parameters:
aibAIB object to process

The function runs Agglomerative Information Bottleneck (AIB) on the joint probability table aib->Pcx which has labels along the columns and feature values along the rows. AIB iteratively merges the two values of the feature x that causes the smallest decrease in mutual information between the random variables x and c.

Merge operations are arranged in a binary tree. The nodes of the tree correspond to the original feature values and any other value obtained as a result of a merge operation. The nodes are indexed in breadth-first order, starting from the leaves. The first index is zero. In this way, the leaves correspond directly to the original feature values. In total there are 2*nvalues-1 nodes.

The results may be accessed through vl_aib_get_parents which returns an array with one element per tree node. Each element is the index the parent node. The root parent is equal to zero. The array has 2*nvalues-1 elements.

Feature values with null probability are ignored by the algorithm and their nodes have parents indexing a non-existent tree node (a value bigger than 2*nvalues-1).

Then the function will also compute the information level after each merge. vl_get_costs will return a vector with the information level after each merge. cost has nvalues entries: The first is the value of the cost functional before any merge, and the others are the cost after the nvalues-1 merges.

void vl_aib_update_beta ( VlAIB aib)
Parameters:
aibAIB data structure.

The function calculates beta[i] and bidx[i] for the nodes i listed in aib->which. beta[i] is the minimal variation of mutual information (or other score) caused by merging entry i with another entry and bidx[i] is the index of this best matching entry.

Notice that for each entry i that we need to update, a full scan of all the other entries must be performed.