Comparison of SURF implementations
The SURF descriptor is a state-of-the-art image region descriptor that is invariant with regard to scale, orientation, and illumination. By using an integral image, the descriptor can be computed efficiently across different scales. In recent years it has emerged as one of the more popular and frequently used feature descriptors, but it is not a trivial algorithm to implement, and several different implementations exist. The following study compares several of these libraries to determine their relative stability and run-time performance.
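To illustrate why an integral image makes box sums cheap at any scale, here is a minimal sketch. The class and method names are hypothetical and the zero-padded layout is an assumption chosen for simplicity; it is not taken from any of the benchmarked libraries.

```java
/** Hypothetical illustration only; not the data structure used by any tested library. */
final class IntegralImage {

    /**
     * Builds a zero-padded integral image: ii[y][x] holds the sum of all pixels
     * in rows [0, y) and columns [0, x) of the input.
     */
    static long[][] compute(int[][] gray) {
        int h = gray.length;
        int w = (h == 0) ? 0 : gray[0].length;
        long[][] ii = new long[h + 1][w + 1];
        for (int y = 0; y < h; y++) {
            long rowSum = 0;
            for (int x = 0; x < w; x++) {
                rowSum += gray[y][x];
                ii[y + 1][x + 1] = ii[y][x + 1] + rowSum;
            }
        }
        return ii;
    }

    /** Sum over rows [y0, y1) and columns [x0, x1) using four lookups, whatever the box size. */
    static long blockSum(long[][] ii, int x0, int y0, int x1, int y1) {
        return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0];
    }
}
```

Because every box sum costs the same four lookups, evaluating larger filters at coarser scales costs no more than evaluating small ones.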
|Library||Version||Language||Multi-threaded||Notes|
|BoofCV||10/2011||Java||No||Fast but less accurate. See FactoryDescribeRegionPoint.surf()|
|BoofCV-M||10/2011||Java||No||Accurate but slower. See FactoryDescribeRegionPoint.msurf()|
|OpenCV||2.3.1 SVN r6879||C++||Yes||http://opencv.willowgarage.com/wiki/|
Benchmark Source Code:
[Plots: feature detection runtime and feature description runtime (lower is better for both); relative descriptor stability (higher is better).]
For the sake of those with short attention spans, the summary results are posted first and a discussion of testing methodology follows. The top two plots show feature detection and feature description speed. Feature detection is the detection of the location and scale of interest points inside the image using each library's implementation of the Fast Hessian detector. A feature is described by estimating its orientation and computing the SURF-64 descriptor. The bottommost plot shows a summary of the descriptors' relative stabilities across a standard set of test images.
One reason for JavaSURF's poor stability is that it only implements an upright version of SURF, so rotated images defeat the descriptor. Not computing orientation helps JavaSURF on the description runtime benchmark, because it has fewer computations to perform. JOpenSURF is a straightforward port of the OpenSURF library to Java and shows comparable stability with the expected hit on runtime performance. JOpenSURF, OpenSURF and BoofCV-M all compute an enhanced version of the SURF descriptor, while the BoofCV descriptor is closer to the SURF paper with some improvements. I suspect that the descriptor computed by the reference library is also an improvement over what was presented in the SURF paper, but the source code is closed, so this theory cannot be directly verified.
OpenCV is a bit of an oddball library as far as SURF is concerned. It did not provide an interface that would allow it to be tested in the same manner as the other libraries, and comments in the code indicated that parts of it are multi-threaded. Every other library tested is single-threaded. Because of these issues, OpenCV's own interest points were used instead of the precomputed ones. For speed, a special test was done for OpenCV in which features were detected and described at the same time; this took 1940 ms for 6485 features, approximately 20% slower than OpenSURF's combined detect and describe time.
The stability benchmark was performed using standardized test images with known transformations. Stability was measured based on the number of correct associations between two images in the data set. The testing procedure for each library is summarized below:
- For each image, detect features (scale and location) using the fast Hessian detector in BoofCV.
  - Save results to a file and use the same file for all libraries.
- For each image, compute a feature description (including orientation) for all features found.
- In each image sequence, associate features in the first image to the Nth image, where N > 1.
  - Association is done by minimizing Euclidean error.
  - Validation is done using reverse association, i.e. the association must be the optimal association going from frame 1 to N and N to 1.
- Compute the number of correct associations.
  - An association is correct if it is within 3 pixels of the true location.
Since the transformation between images is known, descriptors could have been computed at the exact true locations instead of at independently detected features. However, in practice detected features will not lie at the exact point, and a descriptor needs to be tolerant of this type of error. Associating independently detected features is therefore a more realistic measure of the descriptor's strength.
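The association and validation steps above can be sketched as follows. This is an illustrative sketch, not the benchmark's actual code: the class and method names are hypothetical, descriptors are plain double arrays, and "within 3 pixels" is assumed to mean Euclidean pixel distance with an inclusive bound.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not part of any benchmarked library. */
final class MutualAssociation {

    /** Squared Euclidean distance between two descriptors of equal length. */
    static double distanceSq(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the candidate closest to the query, or -1 if there are no candidates. */
    static int nearest(double[] query, double[][] candidates) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < candidates.length; i++) {
            double d = distanceSq(query, candidates[i]);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    /** Returns pairs {indexInFirst, indexInNth} that are each other's best match. */
    static List<int[]> associate(double[][] first, double[][] nth) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < first.length; i++) {
            int j = nearest(first[i], nth);
            // Reverse association: keep the pair only if nth[j] also prefers first[i].
            if (j >= 0 && nearest(nth[j], first) == i) {
                pairs.add(new int[]{i, j});
            }
        }
        return pairs;
    }

    /**
     * Counts pairs whose matched location lies within tol pixels of the true location.
     * trueInNth[i] is feature i of the first image mapped through the known transformation.
     */
    static int countCorrect(List<int[]> pairs, double[][] trueInNth,
                            double[][] locNth, double tol) {
        int correct = 0;
        for (int[] p : pairs) {
            double dx = trueInNth[p[0]][0] - locNth[p[1]][0];
            double dy = trueInNth[p[0]][1] - locNth[p[1]][1];
            if (dx * dx + dy * dy <= tol * tol) {
                correct++;
            }
        }
        return correct;
    }
}
```

The brute-force search is quadratic in the number of features, which is acceptable for a sketch with a few thousand features but is not a claim about how the benchmark itself performed the search.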
The relative stability metric shown in the summary plot is computed in two steps. First, for each library, the percentage of correctly associated features is summed across the whole test data set. Then each library's total is divided by the best performer's total, so the best library has a relative stability of 1.
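As a small worked sketch of this normalization, the code below sums per-image percentages and divides by the best total. The class name, the method names, and any library names or numbers passed to it are placeholders for illustration, not measured results.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper illustrating the normalization; inputs are made up by the caller. */
final class RelativeStability {

    /** Sums one library's percent-correct values over every image pair in the data set. */
    static double totalScore(double[] percentCorrect) {
        double sum = 0;
        for (double p : percentCorrect) {
            sum += p;
        }
        return sum;
    }

    /** Divides each library's total by the largest total, so the best library maps to 1.0. */
    static Map<String, Double> relative(Map<String, Double> totals) {
        double best = 0;
        for (double v : totals.values()) {
            best = Math.max(best, v);
        }
        Map<String, Double> out = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : totals.entrySet()) {
            out.put(e.getKey(), best > 0 ? e.getValue() / best : 0.0);
        }
        return out;
    }
}
```

For example, two placeholder libraries with totals of 80 and 60 would receive relative stabilities of 1.0 and 0.75.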
Configuration: All libraries were configured to describe oriented SURF-64 features as defined in the original SURF paper. JavaSURF does not support orientation estimation. OpenCV forces orientation to be estimated inside the feature detector; therefore it was decided that the lesser evil would be to let OpenCV detect its own features. OpenCV's threshold was adjusted so that it detected about the same number of features.
Each library's speed in describing and detecting features was also benchmarked. Each test was performed several times, but only the best time is shown. Java libraries tended to exhibit more variation than native libraries, although all libraries showed a significant amount of variation from trial to trial.
Only image processing time essential to SURF was measured, not image loading time. This includes the time to convert an image to integral image format, but not the time to convert the image to grayscale, wherever it was possible to exclude the grayscale conversion. Elapsed time was measured in the actual application using System.currentTimeMillis() in Java and clock() in C++.
- Kill all extraneous processes.
- Load feature location and size from file.
- Compute descriptors (including orientation) for each feature while recording elapsed time.
- Compute elapsed time 10 times and output best result.
- Run the whole experiment 4 times for each library and record the best time.
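The best-of-N timing used in this procedure can be sketched in Java as follows. The class name and the placeholder workload are hypothetical; in the real benchmark the timed task would be the library's descriptor computation over all loaded features.

```java
/** Hypothetical timing harness; a sketch of best-of-N timing, not the benchmark's code. */
final class BestOfTimer {

    /** Runs the task the given number of times and returns the shortest elapsed time in ms. */
    static long bestMillis(Runnable task, int trials) {
        long best = Long.MAX_VALUE;
        for (int t = 0; t < trials; t++) {
            long start = System.currentTimeMillis();
            task.run();
            long elapsed = System.currentTimeMillis() - start;
            best = Math.min(best, elapsed);
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for "compute descriptors for each feature".
        double[] sink = new double[1];
        Runnable describeAll = () -> {
            for (int i = 1; i <= 5_000_000; i++) {
                sink[0] += Math.sqrt(i);
            }
        };
        System.out.println("best of 10: " + bestMillis(describeAll, 10) + " ms");
    }
}
```

Reporting the minimum rather than the mean reduces the influence of JIT warm-up, garbage collection, and other processes, which fits the observation that Java timings varied more from trial to trial.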
- Ubuntu 10.10 64bit
- Quadcore Q6600 2.4 GHz
- Memory 8194 MB
- g++ 4.4.5
- Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Compiler and JRE Configuration:
- All native libraries were compiled with -O3
- Java applications were run with no special flags
Describe Specific Setup:
- Input image was boat/img1
- Fast Hessian features from BoofCV
  - 6415 total
Detect Specific Setup:
- Impossible to configure libraries to detect exactly the same features
- Adjusted detection threshold to top out at around 2000 features
- Octaves: 4
- Scales: 4
- Base Size: 9
- Initial Pixel Skip: 1
Results can be found at the top of the page. OpenCV was omitted from runtime results because it could not be configured identically to the other libraries. A special test was performed just for OpenCV and is discussed above. It is not known what pixel skip was used inside of OpenCV.