Search | arXiv e-print repository

doi 10.1007/s43681-022-00248-3

What should AI see? Using the Public's Opinion to Determine the Perception of an AI

Authors: Robin Chan, Radin Dardashti, Meike Osinski, Matthias Rottmann, Dominik Brüggemann, Cilia Rücker, Peter Schlicht, Fabian Hüger, Nikol Rummel, Hanno Gottschalk

Abstract: Deep neural networks (DNN) have made impressive progress in the interpretation of image data, so that it is conceivable and to some degree realistic to use them in safety critical applications like automated driving. From an ethical standpoint, the AI algorithm should take into account the vulnerability of objects or subjects on the street that ranges from "not at all", e.g. the road itself, to "h… ▽ More Deep neural networks (DNN) have made impressive progress in the interpretation of image data, so that it is conceivable and to some degree realistic to use them in safety critical applications like automated driving. From an ethical standpoint, the AI algorithm should take into account the vulnerability of objects or subjects on the street that ranges from "not at all", e.g. the road itself, to "high vulnerability" of pedestrians. One way to take this into account is to define the cost of confusion of one semantic category with another and use cost-based decision rules for the interpretation of probabilities, which are the output of DNNs. However, it is an open problem how to define the cost structure, who should be in charge to do that, and thereby define what AI-algorithms will actually "see". As one possible answer, we follow a participatory approach and set up an online survey to ask the public to define the cost structure. We present the survey design and the data acquired along with an evaluation that also distinguishes between perspective (car passenger vs. external traffic participant) and gender. Using simulation based $F$-tests, we find highly significant differences between the groups. These differences have consequences on the reliable detection of pedestrians in a safety critical distance to the self-driving car. We discuss the ethical problems that are related to this approach and also discuss the problems emerging from human-machine interaction through the survey from a psychological point of view. Finally, we include comments from industry leaders in the field of AI safety on the applicability of survey based elements in the design of AI functionalities in automated driving. △ Less

Submitted 9 June, 2022; originally announced June 2022.

Comments: 26 pages, 12 figures

Journal ref: AI and Ethics (2023)

arXiv:2204.13963 [pdf, other]

Tailored Uncertainty Estimation for Deep Learning Systems

Authors: Joachim Sicking, Maram Akila, Jan David Schneider, Fabian Hüger, Peter Schlicht, Tim Wirtz, Stefan Wrobel

Abstract: Uncertainty estimation bears the potential to make deep learning (DL) systems more reliable. Standard techniques for uncertainty estimation, however, come along with specific combinations of strengths and weaknesses, e.g., with respect to estimation quality, generalization abilities and computational complexity. To actually harness the potential of uncertainty quantification, estimators are requir… ▽ More Uncertainty estimation bears the potential to make deep learning (DL) systems more reliable. Standard techniques for uncertainty estimation, however, come along with specific combinations of strengths and weaknesses, e.g., with respect to estimation quality, generalization abilities and computational complexity. To actually harness the potential of uncertainty quantification, estimators are required whose properties closely match the requirements of a given use case. In this work, we propose a framework that, firstly, structures and shapes these requirements, secondly, guides the selection of a suitable uncertainty estimation method and, thirdly, provides strategies to validate this choice and to uncover structural weaknesses. By contributing tailored uncertainty estimation in this sense, our framework helps to foster trustworthy DL systems. Moreover, it anticipates prospective machine learning regulations that require, e.g., in the EU, evidences for the technical appropriateness of machine learning systems. Our framework provides such evidences for system components modeling uncertainty. △ Less

Submitted 29 April, 2022; originally announced April 2022.

arXiv:2106.05549 [pdf, other]

Validation of Simulation-Based Testing: Bypassing Domain Shift with Label-to-Image Synthesis

Authors: Julia Rosenzweig, Eduardo Brito, Hans-Ulrich Kobialka, Maram Akila, Nico M. Schmidt, Peter Schlicht, Jan David Schneider, Fabian Hüger, Matthias Rottmann, Sebastian Houben, Tim Wirtz

Abstract: Many machine learning applications can benefit from simulated data for systematic validation - in particular if real-life data is difficult to obtain or annotate. However, since simulations are prone to domain shift w.r.t. real-life data, it is crucial to verify the transferability of the obtained results. We propose a novel framework consisting of a generative label-to-image synthesis model toget… ▽ More Many machine learning applications can benefit from simulated data for systematic validation - in particular if real-life data is difficult to obtain or annotate. However, since simulations are prone to domain shift w.r.t. real-life data, it is crucial to verify the transferability of the obtained results. We propose a novel framework consisting of a generative label-to-image synthesis model together with different transferability measures to inspect to what extent we can transfer testing results of semantic segmentation models from synthetic data to equivalent real-life data. With slight modifications, our approach is extendable to, e.g., general multi-class classification tasks. Grounded on the transferability analysis, our approach additionally allows for extensive testing by incorporating controlled simulations. We validate our approach empirically on a semantic segmentation task on driving scenes. Transferability is tested using correlation analysis of IoU and a learned discriminator. Although the latter can distinguish between real-life and synthetic tests, in the former we observe surprisingly strong correlations of 0.7 for both cars and pedestrians. △ Less

Submitted 10 June, 2021; originally announced June 2021.

Comments: The first two authors contributed equally. Accepted at the 4th Workshop on "Ensuring and Validating Safety for Automated Vehicles" (WS13), IV2021. Under IEEE Copyright

arXiv:2104.09254 [pdf, other]

Plants Don't Walk on the Street: Common-Sense Reasoning for Reliable Semantic Segmentation

Authors: Linara Adilova, Elena Schulz, Maram Akila, Sebastian Houben, Jan David Schneider, Fabian Hueger, Tim Wirtz

Abstract: Data-driven sensor interpretation in autonomous driving can lead to highly implausible predictions as can most of the time be verified with common-sense knowledge. However, learning common knowledge only from data is hard and approaches for knowledge integration are an active research area. We propose to use a partly human-designed, partly learned set of rules to describe relations between objects… ▽ More Data-driven sensor interpretation in autonomous driving can lead to highly implausible predictions as can most of the time be verified with common-sense knowledge. However, learning common knowledge only from data is hard and approaches for knowledge integration are an active research area. We propose to use a partly human-designed, partly learned set of rules to describe relations between objects of a traffic scene on a high level of abstraction. In doing so, we improve and robustify existing deep neural networks consuming low-level sensor information. We present an initial study adapting the well-established Probabilistic Soft Logic (PSL) framework to validate and improve on the problem of semantic segmentation. We describe in detail how we integrate common knowledge into the segmentation pipeline using PSL and verify our approach in a set of experiments demonstrating the increase in robustness against several severe image distortions applied to the A2D2 autonomous driving data set. △ Less

Submitted 19 April, 2021; originally announced April 2021.

Comments: Published at SAIAD (Safe Artificial Intelligence for Automated Driving) workshop at CVPR2021

arXiv:2104.07538 [pdf, other]

Street-Map Based Validation of Semantic Segmentation in Autonomous Driving

Authors: Laura von Rueden, Tim Wirtz, Fabian Hueger, Jan David Schneider, Nico Piatkowski, Christian Bauckhage

Abstract: Artificial intelligence for autonomous driving must meet strict requirements on safety and robustness, which motivates the thorough validation of learned models. However, current validation approaches mostly require ground truth data and are thus both cost-intensive and limited in their applicability. We propose to overcome these limitations by a model agnostic validation using a-priori knowledge… ▽ More Artificial intelligence for autonomous driving must meet strict requirements on safety and robustness, which motivates the thorough validation of learned models. However, current validation approaches mostly require ground truth data and are thus both cost-intensive and limited in their applicability. We propose to overcome these limitations by a model agnostic validation using a-priori knowledge from street maps. In particular, we show how to validate semantic segmentation masks and demonstrate the potential of our approach using OpenStreetMap. We introduce validation metrics that indicate false positive or negative road segments. Besides the validation approach, we present a method to correct the vehicle's GPS position so that a more accurate localization can be used for the street-map based validation. Lastly, we present quantitative results on the Cityscapes dataset indicating that our validation approach can indeed uncover errors in semantic segmentation masks. △ Less

Submitted 15 April, 2021; originally announced April 2021.

Comments: Final version accepted at the International Conference on Pattern Recognition (ICPR). arXiv admin note: substantial text overlap with arXiv:2011.08008

arXiv:2101.03924 [pdf, other]

doi 10.1109/MSP.2020.2983666

The Vulnerability of Semantic Segmentation Networks to Adversarial Attacks in Autonomous Driving: Enhancing Extensive Environment Sensing

Authors: Andreas Bär, Jonas Löhdefink, Nikhil Kapoor, Serin J. Varghese, Fabian Hüger, Peter Schlicht, Tim Fingscheidt

Abstract: Enabling autonomous driving (AD) can be considered one of the biggest challenges in today's technology. AD is a complex task accomplished by several functionalities, with environment perception being one of its core functions. Environment perception is usually performed by combining the semantic information captured by several sensors, i.e., lidar or camera. The semantic information from the respe… ▽ More Enabling autonomous driving (AD) can be considered one of the biggest challenges in today's technology. AD is a complex task accomplished by several functionalities, with environment perception being one of its core functions. Environment perception is usually performed by combining the semantic information captured by several sensors, i.e., lidar or camera. The semantic information from the respective sensor can be extracted by using convolutional neural networks (CNNs) for dense prediction. In the past, CNNs constantly showed state-of-the-art performance on several vision-related tasks, such as semantic segmentation of traffic scenes using nothing but the red-green-blue (RGB) images provided by a camera. Although CNNs obtain state-of-the-art performance on clean images, almost imperceptible changes to the input, referred to as adversarial perturbations, may lead to fatal deception. The goal of this article is to illuminate the vulnerability aspects of CNNs used for semantic segmentation with respect to adversarial attacks, and share insights into some of the existing known adversarial defense strategies. We aim to clarify the advantages and disadvantages associated with applying CNNs for environment perception in AD to serve as a motivation for future research in this field. △ Less

Submitted 13 January, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

Comments: IEEE Signal Processing Magazine (Volume: 38, Issue: 1, Jan. 2021), pp. 42 - 52

arXiv:2101.02974 [pdf, other]

Approaching Neural Network Uncertainty Realism

Authors: Joachim Sicking, Alexander Kister, Matthias Fahrland, Stefan Eickeler, Fabian Hüger, Stefan Rüping, Peter Schlicht, Tim Wirtz

Abstract: Statistical models are inherently uncertain. Quantifying or at least upper-bounding their uncertainties is vital for safety-critical systems such as autonomous vehicles. While standard neural networks do not report this information, several approaches exist to integrate uncertainty estimates into them. Assessing the quality of these uncertainty estimates is not straightforward, as no direct ground… ▽ More Statistical models are inherently uncertain. Quantifying or at least upper-bounding their uncertainties is vital for safety-critical systems such as autonomous vehicles. While standard neural networks do not report this information, several approaches exist to integrate uncertainty estimates into them. Assessing the quality of these uncertainty estimates is not straightforward, as no direct ground truth labels are available. Instead, implicit statistical assessments are required. For regression, we propose to evaluate uncertainty realism -- a strict quality criterion -- with a Mahalanobis distance-based statistical test. An empirical evaluation reveals the need for uncertainty measures that are appropriate to upper-bound heavy-tailed empirical errors. Alongside, we transfer the variational U-Net classification architecture to standard supervised image-to-image tasks. We adopt it to the automotive domain and show that it significantly improves uncertainty realism compared to a plain encoder-decoder model. △ Less

Submitted 8 January, 2021; originally announced January 2021.

Comments: Accepted at the NeurIPS 2019 Workshop on Machine Learning for Autonomous Driving (ML4AD)

arXiv:2012.07504 [pdf, other]

Improving Video Instance Segmentation by Light-weight Temporal Uncertainty Estimates

Authors: Kira Maag, Matthias Rottmann, Serin Varghese, Fabian Hueger, Peter Schlicht, Hanno Gottschalk

Abstract: Instance segmentation with neural networks is an essential task in environment perception. In many works, it has been observed that neural networks can predict false positive instances with high confidence values and true positives with low ones. Thus, it is important to accurately model the uncertainties of neural networks in order to prevent safety issues and foster interpretability. In applicat… ▽ More Instance segmentation with neural networks is an essential task in environment perception. In many works, it has been observed that neural networks can predict false positive instances with high confidence values and true positives with low ones. Thus, it is important to accurately model the uncertainties of neural networks in order to prevent safety issues and foster interpretability. In applications such as automated driving, the reliability of neural networks is of highest interest. In this paper, we present a time-dynamic approach to model uncertainties of instance segmentation networks and apply this to the detection of false positives as well as the estimation of prediction quality. The availability of image sequences in online applications allows for tracking instances over multiple frames. Based on an instances history of shape and uncertainty information, we construct temporal instance-wise aggregated metrics. The latter are used as input to post-processing models that estimate the prediction quality in terms of instance-wise intersection over union. The proposed method only requires a readily trained neural network (that may operate on single frames) and video sequence input. In our experiments, we further demonstrate the use of the proposed method by replacing the traditional score value from object detection and thereby improving the overall performance of the instance segmentation network. △ Less

Submitted 13 April, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

arXiv:2012.01558 [pdf, other]

From a Fourier-Domain Perspective on Adversarial Examples to a Wiener Filter Defense for Semantic Segmentation

Authors: Nikhil Kapoor, Andreas Bär, Serin Varghese, Jan David Schneider, Fabian Hüger, Peter Schlicht, Tim Fingscheidt

Abstract: Despite recent advancements, deep neural networks are not robust against adversarial perturbations. Many of the proposed adversarial defense approaches use computationally expensive training mechanisms that do not scale to complex real-world tasks such as semantic segmentation, and offer only marginal improvements. In addition, fundamental questions on the nature of adversarial perturbations and t… ▽ More Despite recent advancements, deep neural networks are not robust against adversarial perturbations. Many of the proposed adversarial defense approaches use computationally expensive training mechanisms that do not scale to complex real-world tasks such as semantic segmentation, and offer only marginal improvements. In addition, fundamental questions on the nature of adversarial perturbations and their relation to the network architecture are largely understudied. In this work, we study the adversarial problem from a frequency domain perspective. More specifically, we analyze discrete Fourier transform (DFT) spectra of several adversarial images and report two major findings: First, there exists a strong connection between a model architecture and the nature of adversarial perturbations that can be observed and addressed in the frequency domain. Second, the observed frequency patterns are largely image- and attack-type independent, which is important for the practical impact of any defense making use of such patterns. Motivated by these findings, we additionally propose an adversarial defense method based on the well-known Wiener filters that captures and suppresses adversarial frequencies in a data-driven manner. Our proposed method not only generalizes across unseen attacks but also beats five existing state-of-the-art methods across two models in a variety of attack settings. △ Less

Submitted 21 April, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

Comments: Accepted by The International Joint Conference on Neural Network (IJCNN) 2021

arXiv:2012.01386 [pdf, other]

doi 10.1145/3385958.3430477

A Self-Supervised Feature Map Augmentation (FMA) Loss and Combined Augmentations Finetuning to Efficiently Improve the Robustness of CNNs

Authors: Nikhil Kapoor, Chun Yuan, Jonas Löhdefink, Roland Zimmermann, Serin Varghese, Fabian Hüger, Nico Schmidt, Peter Schlicht, Tim Fingscheidt

Abstract: Deep neural networks are often not robust to semantically-irrelevant changes in the input. In this work we address the issue of robustness of state-of-the-art deep convolutional neural networks (CNNs) against commonly occurring distortions in the input such as photometric changes, or the addition of blur and noise. These changes in the input are often accounted for during training in the form of d… ▽ More Deep neural networks are often not robust to semantically-irrelevant changes in the input. In this work we address the issue of robustness of state-of-the-art deep convolutional neural networks (CNNs) against commonly occurring distortions in the input such as photometric changes, or the addition of blur and noise. These changes in the input are often accounted for during training in the form of data augmentation. We have two major contributions: First, we propose a new regularization loss called feature-map augmentation (FMA) loss which can be used during finetuning to make a model robust to several distortions in the input. Second, we propose a new combined augmentations (CA) finetuning strategy, that results in a single model that is robust to several augmentation types at the same time in a data-efficient manner. We use the CA strategy to improve an existing state-of-the-art method called stability training (ST). Using CA, on an image classification task with distorted images, we achieve an accuracy improvement of on average 8.94% with FMA and 8.86% with ST absolute on CIFAR-10 and 8.04% with FMA and 8.27% with ST absolute on ImageNet, compared to 1.98% and 2.12%, respectively, with the well known data augmentation method, while keeping the clean baseline performance. △ Less

Submitted 2 December, 2020; originally announced December 2020.

Comments: Accepted at ACM CSCS 2020 (8 pages, 4 figures)

arXiv:2011.08008 [pdf, other]

Towards Map-Based Validation of Semantic Segmentation Masks

Authors: Laura von Rueden, Tim Wirtz, Fabian Hueger, Jan David Schneider, Christian Bauckhage

Abstract: Artificial intelligence for autonomous driving must meet strict requirements on safety and robustness. We propose to validate machine learning models for self-driving vehicles not only with given ground truth labels, but also with additional a-priori knowledge. In particular, we suggest to validate the drivable area in semantic segmentation masks using given street map data. We present first resul… ▽ More Artificial intelligence for autonomous driving must meet strict requirements on safety and robustness. We propose to validate machine learning models for self-driving vehicles not only with given ground truth labels, but also with additional a-priori knowledge. In particular, we suggest to validate the drivable area in semantic segmentation masks using given street map data. We present first results, which indicate that prediction errors can be uncovered by map-based validation. △ Less

Submitted 26 November, 2020; v1 submitted 3 November, 2020; originally announced November 2020.

arXiv:2011.04328 [pdf, other]

Risk Assessment for Machine Learning Models

Authors: Paul Schwerdtner, Florens Greßner, Nikhil Kapoor, Felix Assion, René Sass, Wiebke Günther, Fabian Hüger, Peter Schlicht

Abstract: In this paper we propose a framework for assessing the risk associated with deploying a machine learning model in a specified environment. For that we carry over the risk definition from decision theory to machine learning. We develop and implement a method that allows to define deployment scenarios, test the machine learning model under the conditions specified in each scenario, and estimate the… ▽ More In this paper we propose a framework for assessing the risk associated with deploying a machine learning model in a specified environment. For that we carry over the risk definition from decision theory to machine learning. We develop and implement a method that allows to define deployment scenarios, test the machine learning model under the conditions specified in each scenario, and estimate the damage associated with the output of the machine learning model under test. Using the likelihood of each scenario together with the estimated damage we define \emph{key risk indicators} of a machine learning model. The definition of scenarios and weighting by their likelihood allows for standardized risk assessment in machine learning throughout multiple domains of application. In particular, in our framework, the robustness of a machine learning model to random input corruptions, distributional shifts caused by a changing environment, and adversarial perturbations can be assessed. △ Less

Submitted 9 November, 2020; originally announced November 2020.

Comments: 8 pages, 5 figures, conference workshop

arXiv:2006.08613 [pdf, other]

Self-Supervised Domain Mismatch Estimation for Autonomous Perception

Authors: Jonas Löhdefink, Justin Fehrling, Marvin Klingner, Fabian Hüger, Peter Schlicht, Nico M. Schmidt, Tim Fingscheidt

Abstract: Autonomous driving requires self awareness of its perception functions. Technically spoken, this can be realized by observers, which monitor the performance indicators of various perception modules. In this work we choose, exemplarily, a semantic segmentation to be monitored, and propose an autoencoder, trained in a self-supervised fashion on the very same training data as the semantic segmentatio… ▽ More Autonomous driving requires self awareness of its perception functions. Technically spoken, this can be realized by observers, which monitor the performance indicators of various perception modules. In this work we choose, exemplarily, a semantic segmentation to be monitored, and propose an autoencoder, trained in a self-supervised fashion on the very same training data as the semantic segmentation to be monitored. While the autoencoder's image reconstruction performance (PSNR) during online inference shows already a good predictive power w.r.t. semantic segmentation performance, we propose a novel domain mismatch metric DM as the earth mover's distance between a pre-stored PSNR distribution on training (source) data, and an online-acquired PSNR distribution on any inference (target) data. We are able to show by experiments that the DM metric has a strong rank order correlation with the semantic segmentation within its functional scope. We also propose a training domain-dependent threshold for the DM metric to define this functional scope. △ Less

Submitted 15 June, 2020; originally announced June 2020.

Comments: Proc. of CVPR - Workshops

arXiv:2002.08935 [pdf, other]

Strategy to Increase the Safety of a DNN-based Perception for HAD Systems

Authors: Timo Sämann, Peter Schlicht, Fabian Hüger

Abstract: Safety is one of the most important development goals for highly automated driving (HAD) systems. This applies in particular to the perception function driven by Deep Neural Networks (DNNs). For these, large parts of the traditional safety processes and requirements are not fully applicable or sufficient. The aim of this paper is to present a framework for the description and mitigation of DNN ins… ▽ More Safety is one of the most important development goals for highly automated driving (HAD) systems. This applies in particular to the perception function driven by Deep Neural Networks (DNNs). For these, large parts of the traditional safety processes and requirements are not fully applicable or sufficient. The aim of this paper is to present a framework for the description and mitigation of DNN insufficiencies and the derivation of relevant safety mechanisms to increase the safety of DNNs. To assess the effectiveness of these safety mechanisms, we present a categorization scheme for evaluation metrics. △ Less

Submitted 20 February, 2020; originally announced February 2020.

arXiv:1912.07420 [pdf, other]

MetaFusion: Controlled False-Negative Reduction of Minority Classes in Semantic Segmentation

Authors: Robin Chan, Matthias Rottmann, Fabian Hüger, Peter Schlicht, Hanno Gottschalk

Abstract: In semantic segmentation datasets, classes of high importance are oftentimes underrepresented, e.g., humans in street scenes. Neural networks are usually trained to reduce the overall number of errors, attaching identical loss to errors of all kinds. However, this is not necessarily aligned with human intuition. For instance, an overlooked pedestrian seems more severe than an incorrectly detected… ▽ More In semantic segmentation datasets, classes of high importance are oftentimes underrepresented, e.g., humans in street scenes. Neural networks are usually trained to reduce the overall number of errors, attaching identical loss to errors of all kinds. However, this is not necessarily aligned with human intuition. For instance, an overlooked pedestrian seems more severe than an incorrectly detected one. One possible remedy is to deploy different decision rules by introducing class priors which assigns larger weight to underrepresented classes. While reducing the false-negatives of the underrepresented class, at the same time this leads to a considerable increase of false-positive indications. In this work, we combine decision rules with methods for false-positive detection. We therefore fuse false-negative detection with uncertainty based false-positive meta classification. We present proof-of-concept results for CIFAR-10, and prove the efficiency of our method for the semantic segmentation of street scenes on the Cityscapes dataset based on predicted instances of the 'human' class. In the latter we employ an advanced false-positive detection method using uncertainty measures aggregated over instances. We thereby achieve improved trade-offs between false-negative and false-positive samples of the underrepresented classes. △ Less

Submitted 16 December, 2019; originally announced December 2019.

arXiv:1912.03673 [pdf, other]

Detection of False Positive and False Negative Samples in Semantic Segmentation

Authors: Matthias Rottmann, Kira Maag, Robin Chan, Fabian Hüger, Peter Schlicht, Hanno Gottschalk

Abstract: In recent years, deep learning methods have outperformed other methods in image recognition. This has fostered imagination of potential application of deep learning technology including safety relevant applications like the interpretation of medical images or autonomous driving. The passage from assistance of a human decision maker to ever more automated systems however increases the need to prope… ▽ More In recent years, deep learning methods have outperformed other methods in image recognition. This has fostered imagination of potential application of deep learning technology including safety relevant applications like the interpretation of medical images or autonomous driving. The passage from assistance of a human decision maker to ever more automated systems however increases the need to properly handle the failure modes of deep learning modules. In this contribution, we review a set of techniques for the self-monitoring of machine-learning algorithms based on uncertainty quantification. In particular, we apply this to the task of semantic segmentation, where the machine learning algorithm decomposes an image according to semantic categories. We discuss false positive and false negative error modes at instance-level and review techniques for the detection of such errors that have been recently proposed by the authors. We also give an outlook on future research directions. △ Less

Submitted 8 December, 2019; originally announced December 2019.

MSC Class: 68T45; 62-07

arXiv:1907.01342 [pdf, other]

The Ethical Dilemma when (not) Setting up Cost-based Decision Rules in Semantic Segmentation

Authors: Robin Chan, Matthias Rottmann, Radin Dardashti, Fabian Hüger, Peter Schlicht, Hanno Gottschalk

Abstract: Neural networks for semantic segmentation can be seen as statistical models that provide for each pixel of one image a probability distribution on predefined classes. The predicted class is then usually obtained by the maximum a-posteriori probability (MAP) which is known as Bayes rule in decision theory. From decision theory we also know that the Bayes rule is optimal regarding the simple symmetr… ▽ More Neural networks for semantic segmentation can be seen as statistical models that provide for each pixel of one image a probability distribution on predefined classes. The predicted class is then usually obtained by the maximum a-posteriori probability (MAP) which is known as Bayes rule in decision theory. From decision theory we also know that the Bayes rule is optimal regarding the simple symmetric cost function. Therefore, it weights each type of confusion between two different classes equally, e.g., given images of urban street scenes there is no distinction in the cost function if the network confuses a person with a street or a building with a tree. Intuitively, there might be confusions of classes that are more important to avoid than others. In this work, we want to raise awareness of the possibility of explicitly defining confusion costs and the associated ethical difficulties if it comes down to providing numbers. We define two cost functions from different extreme perspectives, an egoistic and an altruistic one, and show how safety relevant quantities like precision / recall and (segment-wise) false positive / negative rate change when interpolating between MAP, egoistic and altruistic decision rules. △ Less

Submitted 2 July, 2019; originally announced July 2019.

MSC Class: 68T45; 62-07; 62C05

Journal ref: CVPR-W SAIAD 2019

arXiv:1906.07077 [pdf, other]

The Attack Generator: A Systematic Approach Towards Constructing Adversarial Attacks

Authors: Felix Assion, Peter Schlicht, Florens Greßner, Wiebke Günther, Fabian Hüger, Nico Schmidt, Umair Rasheed

Abstract: Most state-of-the-art machine learning (ML) classification systems are vulnerable to adversarial perturbations. As a consequence, adversarial robustness poses a significant challenge for the deployment of ML-based systems in safety- and security-critical environments like autonomous driving, disease detection or unmanned aerial vehicles. In the past years we have seen an impressive amount of publi… ▽ More Most state-of-the-art machine learning (ML) classification systems are vulnerable to adversarial perturbations. As a consequence, adversarial robustness poses a significant challenge for the deployment of ML-based systems in safety- and security-critical environments like autonomous driving, disease detection or unmanned aerial vehicles. In the past years we have seen an impressive amount of publications presenting more and more new adversarial attacks. However, the attack research seems to be rather unstructured and new attacks often appear to be random selections from the unlimited set of possible adversarial attacks. With this publication, we present a structured analysis of the adversarial attack creation process. By detecting different building blocks of adversarial attacks, we outline the road to new sets of adversarial attacks. We call this the "attack generator". In the pursuit of this objective, we summarize and extend existing adversarial perturbation taxonomies. The resulting taxonomy is then linked to the application context of computer vision systems for autonomous vehicles, i.e. semantic segmentation and object detection. Finally, in order to prove the usefulness of the attack generator, we investigate existing semantic segmentation attacks with respect to the detected defining components of adversarial attacks. △ Less

Submitted 17 June, 2019; originally announced June 2019.

Comments: CVPR SAIAD - Workshop 2019

arXiv:1902.04311 [pdf, other]

GAN- vs. JPEG2000 Image Compression for Distributed Automotive Perception: Higher Peak SNR Does Not Mean Better Semantic Segmentation

Authors: Jonas Löhdefink, Andreas Bär, Nico M. Schmidt, Fabian Hüger, Peter Schlicht, Tim Fingscheidt

Abstract: The high amount of sensors required for autonomous driving poses enormous challenges on the capacity of automotive bus systems. There is a need to understand tradeoffs between bitrate and perception performance. In this paper, we compare the image compression standards JPEG, JPEG2000, and WebP to a modern encoder/decoder image compression approach based on generative adversarial networks (GANs). W… ▽ More The high amount of sensors required for autonomous driving poses enormous challenges on the capacity of automotive bus systems. There is a need to understand tradeoffs between bitrate and perception performance. In this paper, we compare the image compression standards JPEG, JPEG2000, and WebP to a modern encoder/decoder image compression approach based on generative adversarial networks (GANs). We evaluate both the pure compression performance using typical metrics such as peak signal-to-noise ratio (PSNR), structural similarity (SSIM) and others, but also the performance of a subsequent perception function, namely a semantic segmentation (characterized by the mean intersection over union (mIoU) measure). Not surprisingly, for all investigated compression methods, a higher bitrate means better results in all investigated quality metrics. Interestingly, however, we show that the semantic segmentation mIoU of the GAN autoencoder in the highly relevant low-bitrate regime (at 0.0625 bit/pixel) is better by 3.9% absolute than JPEG2000, although the latter still is considerably better in terms of PSNR (5.91 dB difference). This effect can greatly be enlarged by training the semantic segmentation model with images originating from the decoder, so that the mIoU using the segmentation model trained by GAN reconstructions exceeds the use of the model trained with original images by almost 20% absolute. We conclude that distributed perception in future autonomous driving will most probably not provide a solution to the automotive bus capacity bottleneck by using standard compression schemes such as JPEG2000, but requires modern coding approaches, with the GAN encoder/decoder method being a promising candidate. △ Less

Submitted 12 February, 2019; originally announced February 2019.

arXiv:1901.08394 [pdf, other]

Application of Decision Rules for Handling Class Imbalance in Semantic Segmentation

Authors: Robin Chan, Matthias Rottmann, Fabian Hüger, Peter Schlicht, Hanno Gottschalk

Abstract: As part of autonomous car driving systems, semantic segmentation is an essential component to obtain a full understanding of the car's environment. One difficulty, that occurs while training neural networks for this purpose, is class imbalance of training data. Consequently, a neural network trained on unbalanced data in combination with maximum a-posteriori classification may easily ignore classe… ▽ More As part of autonomous car driving systems, semantic segmentation is an essential component to obtain a full understanding of the car's environment. One difficulty, that occurs while training neural networks for this purpose, is class imbalance of training data. Consequently, a neural network trained on unbalanced data in combination with maximum a-posteriori classification may easily ignore classes that are rare in terms of their frequency in the dataset. However, these classes are often of highest interest. We approach such potential misclassifications by weighting the posterior class probabilities with the prior class probabilities which in our case are the inverse frequencies of the corresponding classes in the training dataset. More precisely, we adopt a localized method by computing the priors pixel-wise such that the impact can be analyzed at pixel level as well. In our experiments, we train one network from scratch using a proprietary dataset containing 20,000 annotated frames of video sequences recorded from street scenes. The evaluation on our test set shows an increase of average recall with regard to instances of pedestrians and info signs by $25\%$ and $23.4\%$, respectively. In addition, we significantly reduce the non-detection rate for instances of the same classes by $61\%$ and $38\%$. △ Less

Submitted 24 January, 2019; originally announced January 2019.

Comments: 11 pages, 21 figures

MSC Class: 68T45; 62-07; 62C05

arXiv:1811.00648 [pdf, other]

Prediction Error Meta Classification in Semantic Segmentation: Detection via Aggregated Dispersion Measures of Softmax Probabilities

Authors: Matthias Rottmann, Pascal Colling, Thomas-Paul Hack, Robin Chan, Fabian Hüger, Peter Schlicht, Hanno Gottschalk

Abstract: We present a method that "meta" classifies whether seg-ments predicted by a semantic segmentation neural networkintersect with the ground truth. For this purpose, we employ measures of dispersion for predicted pixel-wise class probability distributions, like classification entropy, that yield heat maps of the input scene's size. We aggregate these dispersion measures segment-wise and derive metric… ▽ More We present a method that "meta" classifies whether seg-ments predicted by a semantic segmentation neural networkintersect with the ground truth. For this purpose, we employ measures of dispersion for predicted pixel-wise class probability distributions, like classification entropy, that yield heat maps of the input scene's size. We aggregate these dispersion measures segment-wise and derive metrics that are well-correlated with the segment-wise IoU of prediction and ground truth. This procedure yields an almost plug and play post-processing tool to rate the prediction quality of semantic segmentation networks on segment level. This is especially relevant for monitoring neural networks in online applications like automated driving or medical imaging where reliability is of utmost importance. In our tests, we use publicly available state-of-the-art networks trained on the Cityscapes dataset and the BraTS2017 dataset and analyze the predictive power of different metrics as well as different sets of metrics. To this end, we compute logistic LASSO regression fits for the task of classifying IoU=0 vs. IoU>0 per segment and obtain AUROC values of up to 91.55%. We complement these tests with linear regression fits to predict the segment-wise IoU and obtain prediction standard deviations of down to 0.130 as well as $R^2$ values of up to 84.15%. We show that these results clearly outperform standard approaches. △ Less

Submitted 2 October, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

MSC Class: 68T45; 62-07

arXiv:1807.03210 [pdf, other]

Efficient Decentralized Deep Learning by Dynamic Model Averaging

Authors: Michael Kamp, Linara Adilova, Joachim Sicking, Fabian Hüger, Peter Schlicht, Tim Wirtz, Stefan Wrobel

Abstract: We propose an efficient protocol for decentralized training of deep neural networks from distributed data sources. The proposed protocol allows to handle different phases of model training equally well and to quickly adapt to concept drifts. This leads to a reduction of communication by an order of magnitude compared to periodically communicating state-of-the-art approaches. Moreover, we derive a… ▽ More We propose an efficient protocol for decentralized training of deep neural networks from distributed data sources. The proposed protocol allows to handle different phases of model training equally well and to quickly adapt to concept drifts. This leads to a reduction of communication by an order of magnitude compared to periodically communicating state-of-the-art approaches. Moreover, we derive a communication bound that scales well with the hardness of the serialized learning problem. The reduction in communication comes at almost no cost, as the predictive performance remains virtually unchanged. Indeed, the proposed protocol retains loss bounds of periodically averaging schemes. An extensive empirical evaluation validates major improvement of the trade-off between model performance and communication which could be beneficial for numerous decentralized learning applications, such as autonomous driving, or voice recognition and image classification on mobile phones. △ Less

Submitted 13 November, 2018; v1 submitted 9 July, 2018; originally announced July 2018.

Showing 1–22 of 22 results for author: Hüger, F