Semantic Segmentation for 3D Feature Detection in the Automation of High Mix Industrial Processes

M. Powelson
Southwest Research Institute,
United States

Keywords: semantic segmentation, high mix, welding, 3D, ROS, 3D reconstruction


In high mix industrial applications, detecting key features in 3D space is often an important automation step. This research demonstrates the use of 2D image processing techniques, such as machine learning, to detect these 3D features, taking advantage of the recent rapid growth and increasing maturity of 2D image processing and deep learning. While 2D semantic segmentation algorithms are becoming commonplace, semantically labeling objects in 3D space is a much less mature field. Further, collecting accurately labeled 3D data is often difficult, requiring either a 3D camera and a tedious labeling process or a complicated simulation, whereas techniques based on 2D images can often be developed using informally collected images, for example with a cell phone camera on the shop floor. Once the data is acquired, many off-the-shelf tools and commercial services allow easy labeling and training of 2D detection techniques. Therefore, this research demonstrates an approach in which a 2D image classifier annotates 3D data on the fly, and the annotated data is then aggregated over the course of a scan to generate a semantically labeled 3D mesh. The system is designed to be agnostic to the 2D classification method, allowing users to adopt state-of-the-art semantic segmentation methods without changes to the scanning pipeline. Further, it is incorporated into a common open source robotics software suite known as ROS-Industrial (ROS-I) to allow interoperability with different robotics hardware, depth cameras, and motion planning and reconstruction algorithms. The example application selected was high mix welding. In such an application, the robot scans an arbitrarily positioned part to be welded using a depth camera and performs a reconstruction using a ROS-I reconstruction technique. This provides a mesh of the part to be welded but no information as to the nature of the regions of the mesh.
However, as the robot is scanning, 2D images are classified using a robust semantic segmentation algorithm, in this case FCN8, and the result is used to label the 3D data from the camera. This labeled data is aggregated independently of the reconstruction using an octree-based occupancy grid. Upon completion of the scan, the occupancy grid is used to annotate the 3D mesh with the regions that are to be welded. With a semantically labeled mesh, a weld path can be fit to regions classified as weld seams using techniques such as RANSAC or ICP. This technique was demonstrated on an ABB IRB 2400 industrial robot with an end-effector-mounted Intel RealSense D435 depth camera. The software has been released to the open source community under the permissive Apache 2.0 license. In addition to the described welding application, it has been deployed in a high mix painting application for detecting and removing masking material.
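
The labeling and aggregation steps described above can be illustrated with a minimal sketch. It is not the released implementation: it assumes a pinhole camera model, a binary weld/not-weld mask from the 2D classifier, a known camera pose for each frame, and a flat voxel hash standing in for the octree-based occupancy grid. The names backproject, LabelGrid, and the majority-vote threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

import numpy as np


def backproject(depth, mask, fx, fy, cx, cy, cam_to_world):
    """Turn a depth image and a same-sized 2D label mask into labeled world points."""
    v, u = np.nonzero(depth > 0)           # rows/cols of pixels with a valid depth
    z = depth[v, u]
    x = (u - cx) * z / fx                  # pinhole camera model (assumed)
    y = (v - cy) * z / fy
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)
    pts_world = (cam_to_world @ pts_cam.T).T[:, :3]
    return pts_world, mask[v, u].astype(bool)


class LabelGrid:
    """Flat voxel hash (stand-in for an octree) counting weld/total hits per cell."""

    def __init__(self, voxel_size):
        self.voxel_size = voxel_size
        self.counts = defaultdict(lambda: [0, 0])  # key -> [weld hits, total hits]

    def _key(self, point):
        idx = np.floor(np.asarray(point) / self.voxel_size).astype(int)
        return tuple(int(i) for i in idx)

    def integrate(self, points, is_weld):
        """Accumulate one frame of labeled points into the grid."""
        for p, w in zip(points, is_weld):
            cell = self.counts[self._key(p)]
            cell[0] += int(w)
            cell[1] += 1

    def label_vertices(self, vertices, threshold=0.5):
        """Mark a mesh vertex as weld if most observations in its voxel were weld."""
        labels = []
        for vtx in vertices:
            weld, total = self.counts.get(self._key(vtx), (0, 0))
            labels.append(total > 0 and weld / total > threshold)
        return np.array(labels, dtype=bool)
```

In use, backproject would be called once per camera frame during the scan and its output passed to integrate; label_vertices would be called once on the reconstructed mesh after the scan completes. Counting per-voxel votes rather than keeping only the latest label is one simple way to tolerate occasional misclassified frames.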