Class DescribePointSift<Deriv extends ImageGray<Deriv>>


public class DescribePointSift<Deriv extends ImageGray<Deriv>> extends DescribeSiftCommon

A faithful implementation of the SIFT descriptor.

The descriptor is computed inside a square grid that is scaled and rotated. Each grid cell is a square sub-region. If the sub-region is 4x4 samples and the outer grid is 5x5 cells, then a total area of 20x20 samples is covered. For each sub-region, a histogram of orientations with N bins is computed. The orientation of each sample point comes from the image's spatial derivative. If the outer grid is 4x4 and the histogram has N=8 bins, then the descriptor has 128 elements in total.
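The size arithmetic above can be sketched as follows. The class and method names are hypothetical and are not part of DescribePointSift; the sketch only reproduces the two calculations described in the text.

```java
// Hypothetical helper, not part of the library: reproduces the size arithmetic described above.
final class SiftDescriptorSizes {
    // Number of samples along one side of the full sampled region.
    static int sampleWidth(int widthSubregion, int widthGrid) {
        return widthSubregion * widthGrid;
    }

    // Total number of descriptor elements: one N-bin histogram per grid cell.
    static int descriptorLength(int widthGrid, int numHistogramBins) {
        return widthGrid * widthGrid * numHistogramBins;
    }

    public static void main(String[] args) {
        System.out.println(sampleWidth(4, 5));      // prints 20, i.e. a 20x20 sampled area
        System.out.println(descriptorLength(4, 8)); // prints 128
    }
}
```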

When a point is sampled, its orientation (-pi to pi) and magnitude sqrt(dx**2 + dy**2) are both computed. The sample's contribution is then added to the descriptor, weighted by trilinear interpolation (outer grid x-y coordinates and orientation bin), by a Gaussian distribution centered at the key point location, and by the magnitude.
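A minimal sketch of the per-sample quantities described above. This is not the library's code: the class and method names are hypothetical, the Gaussian is left unnormalized, and the histogram update shows linear interpolation along the orientation axis only, under the assumption that bin i sits at integer position i when -pi to pi is mapped onto 0 to N. The full trilinear weighting would apply the same two-neighbor split along the grid x and y axes as well.

```java
// Hypothetical sketch, not the library's implementation.
final class SiftSampleSketch {
    // Gradient magnitude, sqrt(dx^2 + dy^2).
    static double magnitude(double dx, double dy) {
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Gradient orientation in radians, in the range -pi to pi.
    static double orientation(double dx, double dy) {
        return Math.atan2(dy, dx);
    }

    // Unnormalized Gaussian weight for a sample at offset (x, y) from the key point.
    static double gaussianWeight(double x, double y, double sigma) {
        return Math.exp(-(x * x + y * y) / (2.0 * sigma * sigma));
    }

    // Splits a sample's weight between the two nearest orientation bins.
    static void addToHistogram(double[] histogram, double angle, double weight) {
        int n = histogram.length;
        // Map -pi..pi onto 0..n, measured in bins (assumed bin layout, see lead-in).
        double binPos = (angle + Math.PI) / (2.0 * Math.PI) * n;
        int bin0 = (int) Math.floor(binPos);
        double frac = binPos - bin0;
        // Orientation wraps around, so the last bin neighbors the first.
        histogram[Math.floorMod(bin0, n)] += weight * (1.0 - frac);
        histogram[Math.floorMod(bin0 + 1, n)] += weight * frac;
    }
}
```

Because the two interpolation weights sum to one, each sample adds exactly its weight to the histogram regardless of where it falls between bins.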

There are no intentional differences from the paper [1]. However, the paper is ambiguous in some places.

  • Interpolation method for sampling image pixels isn't specified. Nearest-neighbor is assumed and that's what VLFeat uses too.
  • Size of sample region. Oddly enough, I can't find this very important parameter specified anywhere. The suggested value comes from empirical testing.

[1] Lowe, D. "Distinctive image features from scale-invariant keypoints". International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.

  • Constructor Details

    • DescribePointSift

      public DescribePointSift(int widthSubregion, int widthGrid, int numHistogramBins, double sigmaToPixels, double weightingSigmaFraction, double maxDescriptorElementValue, Class<Deriv> derivType)
      Configures the descriptor.
      widthSubregion - Width of sub-region in samples. Try 4.
      widthGrid - Width of grid in sub-regions. Try 4.
      numHistogramBins - Number of bins in histogram. Try 8.
      sigmaToPixels - Conversion of sigma to pixels. Used to scale the descriptor region. Try 1.5 (this suggested value is uncertain).
      weightingSigmaFraction - Sigma for Gaussian weighting function is set to this value * region width. Try 0.5.
      maxDescriptorElementValue - Helps with non-affine changes in lighting. See paper. Try 0.2.
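The paper [1] handles non-affine lighting changes by normalizing the descriptor to unit length, limiting each element to a maximum value (0.2 is suggested there), and then normalizing again. The sketch below illustrates that step under the assumption that this is the role maxDescriptorElementValue plays; the class and method names are hypothetical and the code is not taken from the library.

```java
// Hypothetical sketch of the normalize, clamp, renormalize step described in the paper.
final class SiftNormalizeSketch {
    // Scales the vector to unit Euclidean length; leaves an all-zero vector unchanged.
    static void normalizeL2(double[] d) {
        double sumSq = 0;
        for (double v : d) sumSq += v * v;
        double norm = Math.sqrt(sumSq);
        if (norm == 0) return;
        for (int i = 0; i < d.length; i++) d[i] /= norm;
    }

    // Normalizes, caps every element at maxElementValue, then normalizes again.
    static void normalizeWithClamp(double[] d, double maxElementValue) {
        normalizeL2(d);
        for (int i = 0; i < d.length; i++) d[i] = Math.min(d[i], maxElementValue);
        normalizeL2(d);
    }
}
```

Capping large elements reduces the influence of a few strong gradients, so the final descriptor depends more on the distribution of orientations than on individual magnitudes. Note that after the second normalization an element may again exceed the cap.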
  • Method Details

    • setImageGradient

      public void setImageGradient(Deriv derivX, Deriv derivY)
      Sets the image spatial derivatives. These should be computed from an image at the appropriate scale in scale-space.
      derivX - x-derivative of input image
      derivY - y-derivative of input image
    • process

      public void process(double c_x, double c_y, double sigma, double orientation, TupleDesc_F64 descriptor)
      Computes the SIFT descriptor for the specified key point.
      c_x - center of key point. x-axis
      c_y - center of key point. y-axis
      sigma - Computed sigma in scale-space for this point
      orientation - Orientation of key point in radians
      descriptor - (output) Storage for computed descriptor. Make sure it's the appropriate length first.