[MMDet3D] How to Infer — Plotting

jun94 · jun-devpBlog · 3 min read · Mar 31, 2024

In this article, we will go over how to use MMDet3D to retrieve inference results (predictions) from a trained model.

Typically, inference results include raw predictions (the coordinates of individual 3D bounding boxes) and visualizations (3D bounding boxes plotted on the input image or point cloud).
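To make "raw prediction" concrete: a single 3D box is commonly parameterized by its center, its dimensions, and a yaw angle, from which the eight corners can be recovered. The sketch below is a hedged illustration of that 7-DoF parameterization, not MMDet3D's internal box class (whose axis conventions differ by dataset):

```python
import numpy as np

# Hedged sketch: one 3D box as (x, y, z) center, (w, l, h) dimensions,
# and a yaw rotation about the vertical axis. Conventions vary between
# datasets; this is for illustration only.
def box_corners(x, y, z, w, l, h, yaw):
    """Return the 8 corners of a 3D box as an (8, 3) array."""
    # corners in the box's local frame, centered at the origin
    dx, dy, dz = w / 2, l / 2, h / 2
    local = np.array([[sx * dx, sy * dy, sz * dz]
                      for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    # rotate around the vertical (z) axis, then translate to the center
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return local @ rot.T + np.array([x, y, z])

corners = box_corners(10.0, 2.0, 0.5, 1.8, 4.5, 1.6, 0.0)
print(corners.shape)  # (8, 3)
```

The visualization step is essentially drawing these eight corners (or their 2D projections) for every predicted box.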

This article assumes that readers already have MMDet3D installed in their environment; for those who haven't done so yet, a detailed installation guide is linked in the references below [1].

For the sake of simplicity, I use FCOS3D as the 3D detector and MMDet3D v1.4.0, but the general flow should remain more or less the same for other detectors and versions.

  • Note that this article does not cover the theoretical or practical details of the monocular 3D detector FCOS3D, as they are beyond its scope.
  • For our purposes, it is simply a 3D detector that takes a single image as input and outputs 3D bounding boxes for the objects present in the image.

After this tutorial, you will be able to obtain images like the one below on your machine.

Inference results of FCOS3D, plotted on input images

How to Obtain Visualized Outputs?

While retrieving 3D box proposals from a detector and overlaying them on an image is not a simple task (at least, not a few lines of code), MMDet3D fortunately provides built-in methods that make the job easy.

The ready-made APIs support various types of input, e.g., images and point clouds.

  • Drawing point clouds on the image
  • Drawing 3D boxes on the point cloud
  • Drawing projected 3D boxes on the image
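The last item, projecting 3D boxes onto the image, boils down to a pinhole camera projection of each box corner. Below is a minimal hedged sketch of that step with a hypothetical intrinsic matrix K (illustrative values, not actual nuScenes calibration):

```python
import numpy as np

# Hypothetical pinhole intrinsics: focal lengths on the diagonal,
# principal point (816, 491) in the last column. Illustrative only.
K = np.array([[1266.0,    0.0, 816.0],
              [   0.0, 1266.0, 491.0],
              [   0.0,    0.0,   1.0]])

def project(points_cam, K):
    """Project (N, 3) camera-frame points to (N, 2) pixel coordinates."""
    uvw = points_cam @ K.T           # homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]  # divide by depth

pts = np.array([[0.0, 0.0, 10.0]])  # a point 10 m straight ahead
print(project(pts, K))  # lands at the principal point (816, 491)
```

MMDet3D's visualizer performs this projection for all eight corners of every predicted box (using the dataset's real calibration) and then draws the connecting edges, which is what you see in the rendered output images.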

Since the point-cloud input APIs are already well explained in many other articles, I will focus on the case where an image is given as input.

The main workflow is the same as with point clouds, i.e., we use tools/test.py but with different arguments. The entire command is as simple as the following.

cd YOUR_PROJECT_ROOT
python mmdetection3d/tools/test.py mmdetection3d/configs/fcos3d/fcos3d_r101-caffe-dcn_fpn_head-gn_8xb2-1x_nus-mono3d_finetune.py ckpt/fcos3d/fcos3d_r101_caffe_fpn_gn-head_dcn_2x8_1x_nus-mono3d_finetune_20210717_095645-8d806dc2.pth --show --show-dir result/fcos3d --task mono_det

Note that the command above has several prerequisites.

  1. mmdetection3d (MMDet3D) is placed right under your project root (PROJECT_ROOT).
  2. The nuScenes dataset is downloaded (or symbolically linked) and placed under PROJECT_ROOT/data, as described here. Make sure that the relevant pkl files are also generated from the original nuScenes dataset by following the section ‘Dataset Preparation’ of that link; the command will not run otherwise.
  3. As we are using FCOS3D as the 3D detector, the checkpoint (ckpt) file from which MMDet3D loads the pre-trained weights should be downloaded in advance.
  • In my case, I downloaded the weights from here and placed them under PROJECT_ROOT/ckpt/fcos3d.
  • See the directory structure below that I used to run this tutorial.
Directory Structure

Once the prerequisites are met and the command has run, you can find the inference results of FCOS3D on the nuScenes images under PROJECT_ROOT/work_dirs/CONFIG_NAME/DATE/result/fcos3d.

  • CONFIG_NAME is fcos3d_r101-caffe-dcn_fpn_head-gn_8xb2-1x_nus-mono3d_finetune, unless you specified otherwise in the command.
  • DATE is when you ran the command.
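Filling in the two placeholders, the output location can also be reconstructed programmatically. This is a hedged sketch: the timestamp format of DATE is an assumption based on MMEngine's default work-dir layout and may differ on your setup, so check the actual directory names from your run.

```python
from datetime import datetime
from pathlib import Path

# CONFIG_NAME is taken from the config file used in the command above.
config_name = "fcos3d_r101-caffe-dcn_fpn_head-gn_8xb2-1x_nus-mono3d_finetune"

# Assumed timestamp format of the run directory (MMEngine-style);
# verify against your own work_dirs contents.
date = datetime.now().strftime("%Y%m%d_%H%M%S")

out_dir = Path("work_dirs") / config_name / date / "result" / "fcos3d"
print(out_dir)  # where the rendered images should appear
```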

So far, I have covered how to plot the inference results on the input image, but retrieving the coordinates of each predicted bounding box remains unaddressed. In fact, achieving that requires a different workflow.

I will talk about that in the next chapter.

Any corrections, suggestions, and comments are welcome.

Reference

[1] MMDet3D Installation

[2] MMDet3D — Visualization
