3D Reconstruction on Desktop Tutorial

From BoofCV
Revision as of 20:19, 18 October 2021 by Peter

This tutorial is targeted at people who want to use BoofCV's 3D reconstruction pipeline, are not developers, but are not afraid to type commands into the command line. 3D Reconstruction / Photogrammetry in BoofCV will be a work in progress for a while. The fact that it works as well as it does is surprising.

Setting Expectations

The initial development was focused on processing video sequences, where it selects key frames to use. This is where it works best right now. However, most people like to take still photos and then create a 3D model from those. Sets of still images are a bit hit or miss: people tend to leave less overlap between views, and that makes things more difficult.

Good

  • Video sequences with a mixture of close up and far away objects
  • Highly textured scenes
  • Translational motion sideways from the object
  • No calibration or hints required!
  • Can split a video up into different objects automatically

Bad

  • Drone footage is hit or miss
  • Glare, smooth textureless objects, reflections, ... etc
  • Does not use all the information available (EXIF, IMU, ... )
  • Can't link all the images into a single scene

Informal Comparison to Meshroom/AliceVision

By visual inspection it was clear that Meshroom does a better and more stable job in almost all scenarios for sparse reconstruction. The one exception was the 3-view case where BoofCV could often converge to a solution when Meshroom could not. While hard to compare, it does look like BoofCV's dense reconstruction might be better. Meshing needs to be added to BoofCV before a direct comparison can be done.

BoofCV and AliceVision take two fairly different approaches to reconstruction; it should be interesting to see if the 3-view + self-calibration approach used in BoofCV can become as stable. AliceVision takes a more traditional approach where it guesses the focal length (using a nominal value) and then expands the scene using 2-view pairs.


Requirements

These instructions are written for Ubuntu. You will need to have Java installed but everything else should take care of itself.

Building

You will need the latest and greatest code from GitHub and build the applications jar.

  1. Checkout the source code as described here
  2. Build the code as described here.
  3. Run this command:
    ./gradlew applicationsJar

Example Data

An attempt was made to not cherry pick examples too much. It should work on all of these if you follow the instructions below, but it will not work perfectly. Most likely it will have issues linking all the images together and will use only a subset. If it does manage to link all the images, the results might look worse since the code currently makes no attempt to filter out noisy points on the edges of objects. You get the idea.

Processing a Video

First you will need a video to process. Grab one of the examples above. Next you will need to 1) convert the video to a sequence of images, 2) downsample the images, 3) select key frames, and 4) perform the reconstruction. Here's how you can do all of that:

mkdir images
ffmpeg -i your_video.mp4 images/frame%04d.png
java -jar /path/to/boofcv/applications/applications.jar DownSelectVideoFramesFor3DApp -i images -o small -w 800 --MaxLength --MaxMotion 0.20 --MinMotion 0.05
java -jar /path/to/boofcv/applications/applications.jar SceneReconstructionApp -i small/ --Ordered -o reconstruction --ShowCloud --Verbose

If the reconstruction doesn't work very well you can try adding the '--TryHarder' flag, see below. If that doesn't work then see what happens when you process it as an unordered set of images.

The reconstruction (and debugging information) goes into the output directory 'reconstruction'. Most people are probably interested in the final cloud.ply file. There's a good chance that if you view it in a third-party application there will be scaling issues, because BoofCV currently does not filter out points that are close to infinity.
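If a viewer chokes on those near-infinite points, one workaround is to prune them yourself before opening the file. Below is a minimal Python sketch, assuming the cloud is a simple ASCII PLY (a header ending in 'end_header', then one "x y z ..." line per vertex); the function name and the 50-unit radius are hypothetical and should be tuned to your scene.

```python
# Sketch: drop vertices far from the origin in an ASCII PLY point cloud.
# Assumes the simple layout described above; binary PLY is not handled.

def filter_ply(text, max_radius=50.0):
    lines = text.splitlines()
    split = lines.index("end_header") + 1
    header, body = lines[:split], lines[split:]
    # Keep only points within max_radius of the origin
    kept = [
        line for line in body
        if sum(float(v) ** 2 for v in line.split()[:3]) <= max_radius ** 2
    ]
    # Patch the vertex count so the header matches the filtered cloud
    header = [
        "element vertex %d" % len(kept) if h.startswith("element vertex") else h
        for h in header
    ]
    return "\n".join(header + kept) + "\n"
```

For example, `filter_ply(open("reconstruction/cloud.ply").read())` would return the cleaned text, which you could write to a new .ply file and open in your viewer.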

Processing Images

If you already have the set of images and you don't need to prune any of them then you can just invoke the command below:

java -jar /path/to/boofcv/applications/applications.jar SceneReconstructionApp -i path/to/images/ --TryHarder -o reconstruction --ShowCloud --Verbose

There are two key differences from the video example: the '--Ordered' flag is gone, which tells it the images are not in any particular order, and the '--TryHarder' flag has been added. TryHarder isn't a requirement, but it will increase the chance of success significantly at the cost of running much slower. What it does is consider more image features, reduce tolerances, and spend more time pruning outliers.

Viewing the Cloud

The built in viewer gets a bit slow when you have millions of points but for the most part gets the job done. It uses WASD keys and the mouse to move around. Try using the mouse plus shift, ctrl, and alt keys and see what happens.

java -jar /path/to/boofcv/applications/applications.jar --GUI 

Then click on the viewer button.

Alternatively you can try 'osgviewer', but you might run into issues with it being too smart and auto-scaling the scene. You will also want to turn off lighting (press 'l') so that colors are visible.

osgviewer /path/to/reconstruction/saved_cloud.ply
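If either viewer bogs down on a multi-million point cloud, thinning it first can help. Here is a sketch under the same assumption of a simple ASCII PLY layout (header ending in 'end_header', one vertex per line after it); the function name and the every-10th keep rate are arbitrary choices, not part of BoofCV.

```python
# Sketch: keep every Nth vertex of an ASCII PLY so viewers stay responsive.

def thin_ply(text, keep_every=10):
    lines = text.splitlines()
    split = lines.index("end_header") + 1
    header, body = lines[:split], lines[split:]
    # Subsample the vertex list uniformly
    kept = body[::keep_every]
    # Patch the vertex count so the header matches the thinned cloud
    header = [
        "element vertex %d" % len(kept) if h.startswith("element vertex") else h
        for h in header
    ]
    return "\n".join(header + kept) + "\n"
```

As with the filtering sketch above the radius section, run it on the saved cloud and write the result to a new .ply file before viewing.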