Class RecognitionVocabularyTreeNister2006<Point>

All Implemented Interfaces:

public class RecognitionVocabularyTreeNister2006<Point> extends Object implements VerbosePrint
Image recognition based off of [1] using inverted files. A HierarchicalVocabularyTree is assumed to have been already trained. When an image is added to the database a TF-IDF descriptor is computed using the tree and then added to the relevant tree's leaves. When an image is looked up its TF-IDF descriptor is found then all images in the data base are found that share at least one leaf node. These candidate matches are then compared against each other and scored using L2-Norm.

Implementation Notes:
This implementation is intended to produce output which is faithful to the original work [1] but has several modifications internally where there has been an attempt to improve runtime performance, often at the cost of an increase in memory consumption. A non-exhaustive set of deviations is listed below

  • Taking inspiration from [2], this implementation has an explicit representation of the inverted files in non-leaf nodes. This avoid an expensive graph traversal step and replaces it with a very fast array look up.
  • Histogram weights are stored in inverted files instead of word counts. Allows more efficient error computation.

[1] Nister, David, and Henrik Stewenius. "Scalable recognition with a vocabulary tree." 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06). Vol. 2. Ieee, 2006.
[2] Esteban Uriza, Francisco Gómez Fernández, and Martín Rais, "Efficient Large-scale Image Search With a Vocabulary Tree", Image Processing On Line, 8 (2018), pp. 71–98

  • Field Details

    • tree

      Vocabulary Tree
    • minimumDepthFromRoot

      public int minimumDepthFromRoot
      A node can be part of the descriptor if it's at least this far from the root node
    • maximumQueryImagesInNode

      public ConfigLength maximumQueryImagesInNode
      If a node has an inverted file list greater than this amount then it will be skipped when scoring. This should be viewed as a last ditch effort when the query is too slow. If there are a 1,000,000 images in the DB, then 20,000 seems to be a reasonable number.
    • invertedFiles

      public final GrowArray<InvertedFile> invertedFiles
      User data associated with each node
    • imagesDB

      protected final BigDogArray_I32 imagesDB
      List of images added to the database
    • matches

      protected final DogArray<BowMatch> matches
      Scores for all candidate images which have been sorted
    • distanceFunction

      protected TupleMapDistanceNorm distanceFunction
      Distance between two TF-IDF descriptors. L1 and L2 norms are provided
    • featureIdxToLeafID

      protected final DogArray_I32 featureIdxToLeafID
      Stores a mapping from feature index to leaf ID
    • frequencies

  • Constructor Details

    • RecognitionVocabularyTreeNister2006

      public RecognitionVocabularyTreeNister2006()
  • Method Details

    • initializeTree

      public void initializeTree(HierarchicalVocabularyTree<Point> tree)
      Configures the tree by adding LeafData to all the leaves in the tree then saves a reference for future use
      tree - Three which is to be used as the database. Saved internally.
    • clearImages

      public void clearImages()
      Removes all images from the database.
    • addImage

      public void addImage(int imageID, List<Point> imageFeatures)
      Adds a new image to the database.
      imageID - The image's unique ID for later reference
      imageFeatures - Feature descriptors from an image
    • query

      public boolean query(List<Point> queryImage, @Nullable BoofLambdas.FilterInt filter, int limit)
      Looks up the best BowMatch from the database. The list of all potential matches can be accessed by calling #getMatches().
      queryImage - Set of feature descriptors from the query image
      filter - Filter which can be used to reject matches that the user doesn't want returned. False = reject.
      limit - Maximum number of matches it will return.
      The best matching image with score from the database
    • findAndScoreMatches

      protected void findAndScoreMatches(List<Point> queryImage)
      Uses the inverted file for each word to create a list of potential matches while scoring the matches efficiently
    • describe

      protected void describe(List<Point> imageFeatures, DogArray_F32 descWeights, DogArray_I32 descWords)
      Given the image features, compute a sparse descriptor for the image and pass in leaf nodes to 'op' for each image feature.
      imageFeatures - (Input) All image features in the image
      descWeights - (Output) Weights for non-zero word in TD-IDF descriptor for this image
      descWords - (Output) Word index for non-zero word in TD-IDF descriptor for this image
    • setDistanceType

      public void setDistanceType(BowDistanceTypes type)
      Used to change distance function to one of the built in types
    • setVerbose

      public void setVerbose(@Nullable @Nullable PrintStream out, @Nullable @Nullable Set<String> settings)
      Specified by:
      setVerbose in interface VerbosePrint