From BoofCV
Revision as of 06:41, 27 October 2011 by Peter (talk | contribs)

Comparison of SURF implementations

The SURF descriptor is a state-of-the-art image region descriptor that is invariant to scale, orientation, and illumination. By using an integral image, it can be computed efficiently across different scales. In recent years it has emerged as one of the more popular and frequently used feature descriptors, but it is not a trivial algorithm to implement, and several different implementations exist. The following study compares several libraries against each other to determine their relative stability and runtime performance.
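To illustrate why an integral image makes computation at different scales cheap, here is a minimal sketch. It is not taken from any of the tested libraries, and the class and method names are hypothetical. It builds an integral image once and then sums any rectangle with four lookups, no matter how large the rectangle is.

```java
/** Illustrative only: names are hypothetical and not part of any tested library. */
class IntegralImageSketch {

    /**
     * Builds an integral image padded with one extra row and column of zeros,
     * so that ii[y][x] holds the sum of all pixels in rows [0, y) and columns [0, x).
     */
    static long[][] integral(int[][] gray) {
        int h = gray.length, w = gray[0].length;
        long[][] ii = new long[h + 1][w + 1];
        for (int y = 0; y < h; y++) {
            long rowSum = 0; // running sum of the current row up to column x
            for (int x = 0; x < w; x++) {
                rowSum += gray[y][x];
                ii[y + 1][x + 1] = ii[y][x + 1] + rowSum;
            }
        }
        return ii;
    }

    /** Sum of the pixels in rows [y0, y1) and columns [x0, x1), using four lookups. */
    static long boxSum(long[][] ii, int x0, int y0, int x1, int y1) {
        return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0];
    }

    public static void main(String[] args) {
        int[][] gray = {
            {1, 2, 3},
            {4, 5, 6},
            {7, 8, 9}
        };
        long[][] ii = integral(gray);
        System.out.println(boxSum(ii, 0, 0, 3, 3)); // whole image: prints 45
        System.out.println(boxSum(ii, 1, 1, 3, 3)); // lower-right 2x2 block: prints 28
    }
}
```

Because the cost of boxSum does not depend on the box size, the box filters used by SURF can be evaluated at larger scales without resizing the image.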

Tested Implementations:

Implementation   Version      Language   Comment
BoofCV: SURF     10/2011      Java       Fast but less accurate. See
BoofCV: MSURF    10/2011      Java       Accurate but slower. See FactoryDescribeRegionPoint.msurf()
OpenSURF         27/05/2010   C++
Reference        1.0.9        C++
JOpenSURF        SVN r24      Java
JavaSURF         SVN r4       Java

Benchmark Source Code:


overall_describe_speed.gif (describe runtime; lower is better)
overall_detect_speed.gif (detect runtime; lower is better)
Overall descriptor stability plot (higher is better)

For the sake of those with no attention span, the summary results are posted first, and a discussion of the testing methodology follows below. The top two plots show how fast each library is at detecting and describing features. Detection is the step in which the location and scale of interest points are found in the image using each library's implementation of the Fast Hessian detector. A feature is described by computing its SURF-64 descriptor. The bottommost plot summarizes each descriptor's relative stability across a standard set of test images. The description stability metric was computed by summing all correct associations across the entire image data set, then dividing that sum by the best library's result.
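The stability metric described above can be sketched as follows. This is only an illustration of the arithmetic, not the benchmark's actual code. The class name, method name, and array layout are assumptions, and the counts in main are made-up placeholders rather than measured results.

```java
/** Illustrative only: names and data layout are assumptions, not the benchmark's code. */
class StabilitySummarySketch {

    /**
     * @param correctCounts correctCounts[lib][pair] is the number of correct
     *                      associations a library found for one image pair
     * @return each library's total divided by the largest total of any library
     */
    static double[] relativeStability(int[][] correctCounts) {
        long[] totals = new long[correctCounts.length];
        long best = 0;
        for (int lib = 0; lib < correctCounts.length; lib++) {
            for (int count : correctCounts[lib]) {
                totals[lib] += count;
            }
            best = Math.max(best, totals[lib]);
        }
        double[] relative = new double[totals.length];
        for (int lib = 0; lib < totals.length; lib++) {
            // Guard against division by zero when no library found any match
            relative[lib] = best == 0 ? 0.0 : totals[lib] / (double) best;
        }
        return relative;
    }

    public static void main(String[] args) {
        // Made-up placeholder counts for two hypothetical libraries, three image pairs each
        int[][] counts = {
            {10, 20, 30},
            {5, 10, 15}
        };
        for (double r : relativeStability(counts)) {
            System.out.println(r);
        }
    }
}
```

With this normalization the best library always scores 1.0, and every other library is expressed as a fraction of it.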

One reason for JavaSURF's poor stability performance is that it only implements an upright version of SURF, so images with any rotation cause it to fail. Not computing the orientation also helps JavaSURF on the description runtime benchmark because it has fewer computations to perform. JOpenSURF is a straightforward port of the OpenSURF library to Java and shows comparable stability with the expected hit on runtime performance. JOpenSURF, OpenSURF and BoofCV-M compute an enhanced version of the SURF descriptor, while the BoofCV descriptor is closer to the SURF paper with some improvements. I suspect that the descriptor computed by the reference library is also an improvement over what was presented in the SURF paper, but its source code is closed, so this theory cannot be directly verified. The good performance of the BoofCV library is primarily due to tweaks to the algorithm and better optimizations, not to any sort of Java magic. A good C++ port would run even faster.

OpenCV is not included because, unlike all the other libraries, it computes the orientation when the feature is detected, which makes it impossible to use the same testing techniques as were used on the other libraries.

Descriptor Stability

Tests were performed using standardized test images from [1], which have known transformations. Because the transformation between images is known, the true associations are also known. Stability was measured by the number of correct associations between two images in the dataset. The testing procedure for each library is summarized below:

  1. For each image, detect features (scale and location) using the fast Hessian detector in BoofCV.
    • Save results to a file and use the same file for all libraries.
  2. For each image, compute a feature description (including orientation) for all found features.
  3. In each image sequence, associate features in the first image to the Nth image, where N > 1.
    • Association is done by minimizing Euclidean error.
    • Validation is done using reverse association, i.e. an association must be the optimal association going both from frame 1 to N and from N to 1.
  4. Compute the number of correct associations.
    • An association is correct if it is within 3 pixels of the true location.
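Steps 3 and 4 of the procedure above can be sketched as follows. This is a brute-force illustration under stated assumptions, not the benchmark's code. All names are hypothetical, the expected locations in image N are assumed to have been precomputed from the known transformation, and "within 3 pixels" is interpreted here as a Euclidean distance of at most the tolerance.

```java
/** Illustrative only: names are hypothetical and not part of any tested library. */
class AssociationSketch {

    /** Index of the candidate with the smallest squared Euclidean distance to the query. */
    static int nearest(double[] query, double[][] candidates) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < candidates.length; i++) {
            double d = 0;
            for (int k = 0; k < query.length; k++) {
                double diff = query[k] - candidates[i][k];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    /**
     * Associates descriptors from image 1 (descA) to image N (descB).
     * match[i] is the index in descB only if the pair is the best match in
     * both directions (reverse association); otherwise it is -1.
     */
    static int[] associate(double[][] descA, double[][] descB) {
        int[] match = new int[descA.length];
        for (int i = 0; i < descA.length; i++) {
            int j = nearest(descA[i], descB);
            match[i] = (j >= 0 && nearest(descB[j], descA) == i) ? j : -1;
        }
        return match;
    }

    /**
     * Counts associations whose matched feature location in image N lies within
     * tol pixels of where feature i is expected to appear in image N.
     * expectedInB[i] is assumed to come from the known transformation.
     */
    static int countCorrect(int[] match, double[][] expectedInB, double[][] locB, double tol) {
        int correct = 0;
        for (int i = 0; i < match.length; i++) {
            if (match[i] < 0) continue; // rejected by reverse association
            double dx = locB[match[i]][0] - expectedInB[i][0];
            double dy = locB[match[i]][1] - expectedInB[i][1];
            if (Math.sqrt(dx * dx + dy * dy) <= tol) correct++;
        }
        return correct;
    }
}
```

A call such as countCorrect(associate(descA, descB), expectedInB, locB, 3.0) would give the number of correct associations for one image pair. Minimizing the squared distance selects the same neighbor as minimizing the Euclidean distance while avoiding a square root per comparison.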

Since the transformation between images is known, the true location could have been used. However, in reality features will not lie at the exact point, and a descriptor needs to be tolerant of this type of error. Thus this is a more accurate measure of the description's strength.

Configuration: All libraries were configured to describe oriented SURF-64 features as defined in the original SURF paper. JavaSURF does not support orientation estimation.

Stability Results

stability_bike.gif stability_boat.gif
stability_graf.gif stability_leuven.gif
stability_ubc.gif stability_trees.gif
stability_wall.gif stability_bark.gif

Runtime Speed

How fast each library can compute descriptions and detect features was also benchmarked. Each test was performed several times, with only the best time shown. Java libraries tended to exhibit more variability than native libraries, although all libraries showed a significant amount of variability from trial to trial.

Only the image processing time essential to SURF was measured; time spent loading images was excluded. The measured time includes converting an image to integral image format but, where it was possible to exclude it, not converting the image to gray scale. Elapsed time was measured inside the application itself using System.currentTimeMillis() in Java and clock() in C++.

Testing Procedure:

  1. Kill all extraneous processes.
  2. Load feature location and size from file.
  3. Compute descriptors (including orientation) for each feature while recording elapsed time.
  4. Compute elapsed time 10 times and output best result.
  5. Run the whole experiment 4 times for each library and record the best time.
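The best-of-N timing in the procedure above can be sketched in Java as follows. This is a minimal illustration, not the benchmark's harness. The class and method names are hypothetical, and the workload in main is a placeholder standing in for "compute descriptors for every loaded feature".

```java
/** Illustrative only: names are hypothetical and not part of the benchmark. */
class BestOfTimingSketch {

    static double sink; // keeps the placeholder workload's result observable

    /** Runs the task the given number of times and returns the smallest elapsed time in milliseconds. */
    static long bestElapsedMillis(Runnable task, int trials) {
        if (trials <= 0) {
            throw new IllegalArgumentException("trials must be positive");
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < trials; i++) {
            long start = System.currentTimeMillis();
            task.run();
            long elapsed = System.currentTimeMillis() - start;
            best = Math.min(best, elapsed);
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder workload; a real test would describe all features here
        Runnable describeAll = () -> {
            double acc = 0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += Math.sqrt(i);
            }
            sink = acc;
        };
        System.out.println("best of 10: " + bestElapsedMillis(describeAll, 10) + " ms");
    }
}
```

Taking the minimum rather than the mean reduces the influence of JIT warm-up, garbage collection, and other processes, which matches the trial-to-trial variability noted for the Java libraries.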

Test Computer:

  • Ubuntu 10.10 64-bit
  • Quad-core Q6600 2.4 GHz
  • Memory 8194 MB
  • g++ 4.4.5
  • Java(TM) SE Runtime Environment (build 1.6.0_26-b03)

Compiler and JRE Configuration

  • All native libraries were compiled with -O3
  • Java applications were run with no special flags

Describe Specific Setup:

  • Input image was boat/img1
  • Fast Hessian features from BoofCV
    • 6415 Total

Detect Specific Setup:

  • It was impossible to configure the libraries to detect exactly the same features.
    • The detection threshold was adjusted so that each library tops out at around 2000 features
  • Octaves: 4
  • Scales: 4
  • Base Size: 9
  • Initial Pixel Skip: 1

Results can be found at the top of the page.