Simulation and Intermediate Representations for Visual Inspection

Penk, Dominik

Files

Dissertation_dominik_penk.pdf (28.63 MB)

Language

en

Document Type

Doctoral Thesis

Granting Institution

Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Technische Fakultät

Issue Date

2024

Authors

Penk, Dominik

Abstract

The extraction of information from visual data is widely used in academic and industrial applications. These visual inspection tasks often require significant manual effort, even when performed by experienced users. Consequently, (semi-)automated solutions are highly sought after, as they reduce the potential for human error and save time. In this work, we present solutions to specific tasks that exhibit a higher degree of automation than state-of-the-art solutions. At the same time, we aim to require as little domain expertise as possible for the productive use of our solutions.

In the first example, we use interactive volume rendering to visualize the preprocessing and filtering of large, homogeneous atom point clouds. Our approach allows inexperienced users to quickly examine the point clouds and identify the atom types that exhibit interesting structures. Additionally, we demonstrate how this preprocessing can be used to extract complex surface structures fully automatically from the point cloud. Furthermore, we present an efficient and non-destructive method for reconstructing specular surfaces that can be integrated directly into a production line. This makes our solution particularly relevant for industrial applications.

In addition to classical approaches, this work also investigates two learning-based applications. Here, the main part of the manual labor is expected to lie in collecting and annotating training data rather than in the productive use of the models. Therefore, we introduce two processes for generating synthetic training data that can be used instead of real data, requiring only a CAD model of the target objects. We show that the quality loss caused by the inevitable simulation-to-reality gap can be minimized if the network is provided with a suitable type of input data. For example, we show that a 6D pose estimator that works with depth maps can easily be trained on synthetic data. Finally, we present a pipeline in which color images are first transformed into an abstract line representation. We show that various image-based tasks can be trained and solved using this representation with synthetic data.

Overall, this dissertation contributes to the field of visual inspection by introducing new methods that are more automated, less reliant on expert knowledge, and less time-consuming than existing solutions.