Class LlahHasher

Direct Known Subclasses:
LlahHasher.Affine, LlahHasher.CrossRatio

public abstract class LlahHasher extends Object
Functions for computing the hash value of an LLAH feature. The hash is derived from geometric invariants computed between points, which are then combined by a hash function.
  • Field Details


    • DEFAULT_HASH_K

      public static int DEFAULT_HASH_K
      Recommended K from the paper
    • DEFAULT_HASH_SIZE

      public static int DEFAULT_HASH_SIZE
      The recommended hash size from the paper
    • samples

      protected double[] samples
      Defines the lookup table. A binary search is used to efficiently find the index of a value.
  • Constructor Details

    • LlahHasher

      protected LlahHasher(long hashK, int hashSize)
      Configures the hash function. See JavaDoc for info on variables
  • Method Details

    • getNumberOfInvariants

      public int getNumberOfInvariants(int numPoints)
      Returns the number of invariants given the number of points.
      numPoints - Number of points the hash function is computed from
      Returns: Number of invariants the feature will have
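      As an illustration only: if one invariant is produced for every unordered subset of getInvariantSampleSize() points, the count is a binomial coefficient. That relationship is an assumption rather than something this page states, and the class and method names below are hypothetical.

```java
class InvariantCountSketch {
    // Number of ways to choose sampleSize points out of numPoints: C(n, k).
    // Assumption: one invariant is computed per unordered subset of points.
    static long countCombinations(int numPoints, int sampleSize) {
        if (sampleSize < 0 || sampleSize > numPoints) return 0;
        int k = Math.min(sampleSize, numPoints - sampleSize);
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // After step i the value equals C(numPoints - k + i, i), so the division is exact.
            result = result * (numPoints - k + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(countCombinations(7, 4)); // 35
        System.out.println(countCombinations(7, 5)); // 21
    }
}
```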
    • computeHash

      public void computeHash(List<Point2D_F64> points, LlahFeature output)
      Computes the hash code and invariant values. Stores the result in output.
      points - Set of points. Must contain at least 4 points.
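      One way discretized invariants could be folded into a single hash code is the polynomial form (sum of r[i] * k^i) mod hashSize, using the hashK and hashSize values passed to the constructor. The sketch below assumes that form; it is not confirmed by this page as what computeHash does, and the class and method names are hypothetical.

```java
class PolynomialHashSketch {
    // Combines discretized invariants r[0..n-1] into (sum r[i] * k^i) mod hashSize.
    // Assumes r[i] >= 0, hashK >= 0 and hashSize > 0.
    // The modulus is applied at every step so intermediate values cannot overflow a long.
    static int hash(int[] discrete, long hashK, int hashSize) {
        long hash = 0;
        long power = 1 % hashSize; // k^i mod hashSize, starting at k^0
        for (int r : discrete) {
            hash = (hash + (r % hashSize) * power) % hashSize;
            power = (power * (hashK % hashSize)) % hashSize;
        }
        return (int) hash;
    }

    public static void main(String[] args) {
        // 1 + 2*25 + 3*625 = 1926, and 1926 mod 1000 = 926
        System.out.println(hash(new int[]{1, 2, 3}, 25, 1000)); // 926
    }
}
```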
    • computeInvariants

      public void computeInvariants(List<Point2D_F64> points, double[] invariants, int offset)
      Stores the computed invariants into an array
    • getInvariantSampleSize

      protected abstract int getInvariantSampleSize()
      Number of points required to compute the invariants
    • computeInvariant

      protected abstract double computeInvariant(Combinations<Point2D_F64> combinator)
      Computes the invariants given the set of points
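      For intuition about what a subclass might compute here, the ratio of two triangle areas built from four coplanar points is unchanged by affine transforms. The sketch below demonstrates that property with plain double[] coordinates instead of Point2D_F64; which points form each triangle is an assumption, and all names are hypothetical.

```java
class AffineInvariantSketch {
    // Area of triangle (a, b, c) from the cross product of two edge vectors.
    static double area(double[] a, double[] b, double[] c) {
        return 0.5 * Math.abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]));
    }

    // Ratio of two triangle areas formed from four points.
    // An affine map scales every area by the same factor, so the ratio is preserved.
    static double affineInvariant(double[] a, double[] b, double[] c, double[] d) {
        return area(a, c, d) / area(a, b, c);
    }

    // An arbitrary non-degenerate affine map: x' = 2x + y + 3, y' = -x + 3y - 1.
    static double[] warp(double[] p) {
        return new double[]{2 * p[0] + p[1] + 3, -p[0] + 3 * p[1] - 1};
    }

    public static void main(String[] args) {
        double[] a = {0, 0}, b = {1, 0}, c = {1, 1}, d = {0, 2};
        System.out.println(affineInvariant(a, b, c, d));                         // 2.0
        System.out.println(affineInvariant(warp(a), warp(b), warp(c), warp(d))); // 2.0
    }
}
```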
    • discretize

      public int discretize(double invariant)
      Computes the discrete value from the continuous valued invariant
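      A minimal sketch of a binary-search lookup over a sorted table such as samples, assuming the table holds interval boundaries and the discrete value is the index of the interval containing the invariant. How the real method treats exact matches and out-of-range values is not stated here, so that handling is an assumption; the names are hypothetical.

```java
import java.util.Arrays;

class DiscretizeSketch {
    // Maps a continuous value to an interval index, where samples[] holds
    // sorted interval boundaries. Results range from 0 to samples.length.
    static int discretize(double[] samples, double value) {
        int idx = Arrays.binarySearch(samples, value);
        // On a miss, binarySearch returns -(insertionPoint) - 1.
        if (idx < 0) idx = -idx - 1;
        return idx;
    }

    public static void main(String[] args) {
        double[] samples = {0.5, 1.0, 2.0};
        System.out.println(discretize(samples, 0.2)); // 0
        System.out.println(discretize(samples, 0.7)); // 1
        System.out.println(discretize(samples, 5.0)); // 3
    }
}
```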
    • learnDiscretization

      public void learnDiscretization(int[] histogram, int histLength, double histMaxValue, int numDiscrete)
      Creates a lookup table by sorting and then sampling the invariants. This has the desired property of placing samples more densely where the density of values is higher. A histogram is required instead of raw values because storing every value in an array quickly becomes intractable, even for only a few documents.
      histogram - Histogram of invariant values from 0 to histMaxValue
      histLength - Histogram length.
      histMaxValue - The maximum value in the histogram
      numDiscrete - Number of possible discrete values. Larger values indicate higher resolution in discretization
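      The idea of sampling boundaries from a histogram can be sketched as equal-count (quantile) binning: walk the cumulative counts and place a boundary each time another 1/numDiscrete of the total has been seen. This is a sketch under that assumption, not the library's actual implementation, and the class and method names are hypothetical.

```java
class LearnDiscretizationSketch {
    // Returns numDiscrete boundaries chosen so that roughly equal numbers of
    // histogram counts fall between consecutive boundaries.
    static double[] learn(int[] histogram, int histLength, double histMaxValue, int numDiscrete) {
        long total = 0;
        for (int i = 0; i < histLength; i++) total += histogram[i];

        double[] samples = new double[numDiscrete];
        long cumulative = 0; // counts in all bins before 'bin'
        int bin = 0;
        for (int i = 0; i < numDiscrete; i++) {
            // Count that must be reached before boundary i is placed.
            long target = total * (i + 1) / numDiscrete;
            while (bin < histLength - 1 && cumulative + histogram[bin] < target) {
                cumulative += histogram[bin++];
            }
            // Upper edge of the bin in which the target count is reached.
            samples[i] = histMaxValue * (bin + 1) / histLength;
        }
        return samples;
    }

    public static void main(String[] args) {
        // Most counts sit in the first bin, so boundaries cluster near zero.
        double[] skewed = learn(new int[]{6, 2, 0, 0}, 4, 4.0, 2);
        System.out.println(skewed[0] + " " + skewed[1]); // 1.0 2.0
        // A flat histogram gives evenly spaced boundaries.
        double[] flat = learn(new int[]{2, 2, 2, 2}, 4, 4.0, 2);
        System.out.println(flat[0] + " " + flat[1]); // 2.0 4.0
    }
}
```

      With the skewed histogram the boundaries land at 1.0 and 2.0 rather than 2.0 and 4.0, which is the denser sampling in high-density regions that the description refers to.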
    • getNumValues

      public int getNumValues()
      Returns the number of possible values