3D Object Detection

December 9, 2021

The ability to perceive and localize objects is central to many robotics and computer vision applications. We recently added 3D object detection to the Stray Vision SDK to provide an easy way for robots to perceive objects. In this post, we give you a quick overview of what 3D object detection is and what you can do with it.

3D object detection detects objects from images. Regular object detection gives you a 2D bounding box around the target object in the image. In 3D object detection, we detect not only where in the image an object is, but also how far away it is and what its pose is relative to the camera.

Specifically, given a single color image, 3D object detection gives you:

  • x, y, z coordinates in meters for the center of the object
  • A bounding box with width, height and depth
  • The 3D orientation of the bounding box
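To make these three quantities concrete, here is a minimal sketch of how one detection could be represented and turned into the eight corners of its box. The Box3D class, its field names and the choice of a rotation matrix for the orientation are assumptions made for illustration; they are not the Stray Vision SDK's actual output format.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Box3D:
    """Hypothetical container for one 3D detection (not an SDK type)."""

    center: np.ndarray    # (3,) x, y, z of the box center in meters, camera frame
    size: np.ndarray      # (3,) width, height, depth in meters
    rotation: np.ndarray  # (3, 3) rotation matrix, box frame -> camera frame

    def corners(self) -> np.ndarray:
        """Return the 8 box corners in the camera frame, shape (8, 3)."""
        # All sign combinations (+-1, +-1, +-1) select the 8 corners of a unit cube.
        signs = np.array(
            [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
            dtype=float,
        )
        local = signs * (self.size / 2.0)  # corners in the box's own frame
        # Rotate into the camera frame, then translate to the box center.
        return local @ self.rotation.T + self.center


# Example: an axis-aligned, cup-sized box one meter in front of the camera.
box = Box3D(
    center=np.array([0.0, 0.0, 1.0]),
    size=np.array([0.1, 0.1, 0.2]),
    rotation=np.eye(3),
)
print(box.corners())
```

Together, the center and the orientation make up the 6D pose of the object; the size adds the three extents of the box.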

Here is what that information might look like when overlaid on an image. Not only is the entire object inside the box, the box is consistently oriented such that we know which way is up and on which side the handle is.

Ground truth label for a cup.
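An overlay like this can be drawn by projecting the 3D box corners into the image. The sketch below assumes a simple pinhole camera without lens distortion. The function name project_points and the intrinsics used in the example (fx, fy, cx, cy) are illustrative values, not something taken from the SDK.

```python
import numpy as np


def project_points(points_cam, fx, fy, cx, cy):
    """Project (N, 3) camera-frame points in meters to (N, 2) pixel coordinates.

    Assumes an ideal pinhole camera: focal lengths fx, fy and principal
    point cx, cy are given in pixels, and lens distortion is ignored.
    """
    points_cam = np.asarray(points_cam, dtype=float)
    z = points_cam[:, 2]
    if np.any(z <= 0):
        raise ValueError("all points must be in front of the camera (z > 0)")
    u = fx * points_cam[:, 0] / z + cx
    v = fy * points_cam[:, 1] / z + cy
    return np.stack([u, v], axis=1)


# Example with made-up intrinsics: two corners of a box one meter away.
corners = np.array([[-0.05, -0.05, 1.0], [0.05, 0.05, 1.0]])
pixels = project_points(corners, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(pixels)
```

Connecting the projected corners with line segments gives the wireframe box drawn on top of the image.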

A practical example

To showcase this, we trained a 3D bounding box detection system to detect cups in a scene along with their orientation. While this example is at tabletop scale with small objects, the same approach applies to larger or smaller objects.

We start by collecting a dataset to work with that is representative of the types of situations in which we would like to localize our cups. For this, we scanned 20 different cups with the Stray Scanner app.

As usual, we import scenes with the stray dataset import command into our dataset folder. We then add 20 labels to the scenes using Stray Studio. Here is what one of those labeled scenes looks like.

The labeled scene in Stray Studio.

Once the scenes have been imported and labeled, we simply train a model using the stray model bake command. We then test the model using stray model eval. Here is what we end up with:

As we can see, the model is able to predict the 6D pose quite nicely. These cups were not included in the training set, so the model is also able to generalize to unseen cups and scenes.
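If you want to put a number on how close a predicted 6D pose is to a labeled one, a common approach is to measure the translation error in meters and the angle of the rotation that separates the two orientations. The sketch below is a generic illustration of that idea. The pose_error function is hypothetical, and this is not necessarily what stray model eval reports.

```python
import numpy as np


def pose_error(R_pred, t_pred, R_true, t_true):
    """Return (translation error in meters, rotation error in degrees).

    R_* are (3, 3) rotation matrices and t_* are (3,) positions,
    all expressed in the same camera frame.
    """
    t_err = float(np.linalg.norm(np.asarray(t_pred, float) - np.asarray(t_true, float)))
    # Relative rotation between prediction and ground truth.
    R_delta = np.asarray(R_pred, float) @ np.asarray(R_true, float).T
    # The rotation angle follows from the trace; clip guards against rounding error.
    cos_angle = np.clip((np.trace(R_delta) - 1.0) / 2.0, -1.0, 1.0)
    return t_err, float(np.degrees(np.arccos(cos_angle)))


# Example: prediction is 5 cm off and rotated 90 degrees about the z axis.
R_z90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_err, r_err = pose_error(R_z90, [0.03, 0.04, 1.0], np.eye(3), [0.0, 0.0, 1.0])
print(t_err, r_err)
```

Note that for objects with rotational symmetry, such as a cup without a handle, several orientations are equally valid, so a plain angle error like this one can overstate the mistake.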

Try it out for yourself

The new 3D detection is currently being tested with a handful of beta customers. If you'd like to be one of them and have a use case in mind, do not hesitate to reach out. We'd very much like to hear from you! You can email us at hello@strayrobots.io.

If you enjoyed this post, give us a like on LinkedIn or subscribe to our newsletter to follow us as we develop a simple-to-use toolkit for solving computer vision problems.
