Attention Modules Improve Modern Image-Level Anomaly Detection: A DifferNet Case Study

Authors: André Luiz B. Vieira e Silva, Francisco Simões, Danny Kowerko, Tobias Schlosser, Felipe Battisti, Veronica Teichrieb

Abstract: Within (semi-)automated visual inspection, learning-based approaches for assessing visual defects, including deep neural networks, enable the processing of otherwise small defect patterns in pixel size on high-resolution imagery. The emergence of these often rarely occurring defect patterns explains the general need for labeled data corpora. To not only alleviate this issue but to furthermore advance the current state of the art in unsupervised visual inspection, this contribution proposes a DifferNet-based solution enhanced with attention modules utilizing SENet and CBAM as backbone - AttentDifferNet - to improve the detection and classification capabilities on three different visual inspection and anomaly detection datasets: MVTec AD, InsPLAD-fault, and Semiconductor Wafer. In comparison to the current state of the art, it is shown that AttentDifferNet achieves improved results, which are, in turn, highlighted throughout our quantitative as well as qualitative evaluation, indicated by a general improvement in AUC of 94.34 vs. 92.46, 96.67 vs. 94.69, and 90.20 vs. 88.74%. As our variants to AttentDifferNet show great prospects in the context of currently investigated approaches, a baseline is formulated, emphasizing the importance of attention for anomaly detection.

Submitted 12 January, 2024; originally announced January 2024.

Comments: Accepted to CVPRW 2023: VISION'23 - 1st workshop on Vision-based InduStrial InspectiON (Extended Abstract). arXiv admin note: substantial text overlap with arXiv:2311.02747

arXiv:2311.02747

Attention Modules Improve Image-Level Anomaly Detection for Industrial Inspection: A DifferNet Case Study

Authors: André Luiz Buarque Vieira e Silva, Francisco Simões, Danny Kowerko, Tobias Schlosser, Felipe Battisti, Veronica Teichrieb

Abstract: Within (semi-)automated visual industrial inspection, learning-based approaches for assessing visual defects, including deep neural networks, enable the processing of otherwise small defect patterns in pixel size on high-resolution imagery. The emergence of these often rarely occurring defect patterns explains the general need for labeled data corpora. To alleviate this issue and advance the current state of the art in unsupervised visual inspection, this work proposes a DifferNet-based solution enhanced with attention modules: AttentDifferNet. It improves image-level detection and classification capabilities on three visual anomaly detection datasets for industrial inspection: InsPLAD-fault, MVTec AD, and Semiconductor Wafer. In comparison to the state of the art, AttentDifferNet achieves improved results, which are, in turn, highlighted throughout our quali-quantitative study. Our quantitative evaluation shows an average improvement - compared to DifferNet - of 1.77 +/- 0.25 percentage points in overall AUROC considering all three datasets, reaching SOTA results in InsPLAD-fault, an industrial inspection in-the-wild dataset. As our variants to AttentDifferNet show great prospects in the context of currently investigated approaches, a baseline is formulated, emphasizing the importance of attention for industrial anomaly detection both in the wild and in controlled environments.

Submitted 7 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

Comments: Accepted at WACV 2024
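
The two AttentDifferNet abstracts above quote their gains differently: the extended abstract lists three AUC pairs, while the WACV version reports an average improvement of 1.77 percentage points. The following is a minimal sketch of how the two can be related, assuming (the abstracts do not say so explicitly) that the first value in each pair is AttentDifferNet and the second is the DifferNet baseline. The variable and function names are illustrative only, and the reported spread of 0.25 is not reproduced because the abstract does not state how it was computed.

```python
# AUC pairs in percent, as quoted in the extended abstract.
# Assumption: (AttentDifferNet, DifferNet baseline) for each of the three datasets.
auc_pairs = [(94.34, 92.46), (96.67, 94.69), (90.20, 88.74)]


def improvements(pairs):
    """Per-dataset gain in percentage points, rounded to two decimals."""
    return [round(attent - baseline, 2) for attent, baseline in pairs]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


gains = improvements(auc_pairs)
mean_gain = mean(gains)

print("per-dataset gains (pp):", gains)
print("mean gain (pp):", round(mean_gain, 2))
```

Under that assumption, the mean of the three differences rounds to the 1.77 percentage points quoted in the WACV abstract.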

arXiv:2204.11970

doi 10.1038/s41598-024-54482-2

Visual Acuity Prediction on Real-Life Patient Data Using a Machine Learning Based Multistage System

Authors: Tobias Schlosser, Frederik Beuth, Trixy Meyer, Arunodhayan Sampath Kumar, Gabriel Stolze, Olga Furashova, Katrin Engelmann, Danny Kowerko

Abstract: In ophthalmology, intravitreal operative medication therapy (IVOM) is a widespread treatment for diseases related to the age-related macular degeneration (AMD), the diabetic macular edema (DME), as well as the retinal vein occlusion (RVO). However, in real-world settings, patients often suffer from loss of vision on time scales of years despite therapy, whereas the prediction of the visual acuity (VA) and the earliest possible detection of deterioration under real-life conditions is challenging due to heterogeneous and incomplete data. In this contribution, we present a workflow for the development of a research-compatible data corpus fusing different IT systems of the department of ophthalmology of a German maximum care hospital. The extensive data corpus allows predictive statements of the expected progression of a patient and his or her VA in each of the three diseases. For the disease AMD, we found a significant deterioration of the visual acuity over time. Within our proposed multistage system, we subsequently classify the VA progression into the three groups of therapy "winners", "stabilizers", and "losers" (WSL classification scheme). Our OCT biomarker classification using an ensemble of deep neural networks results in a classification accuracy (F1-score) of over 98 %, enabling us to complete incomplete OCT documentations while allowing us to exploit them for a more precise VA modeling process.
Our VA prediction requires at least four VA examinations and optionally OCT biomarkers from the same time period to predict the VA progression within a forecasted time frame, whereas our prediction is currently restricted to IVOM / no therapy. We achieve a final prediction accuracy of 69 % in macro average F1-score, while being in the same range as the ophthalmologists with 57.8 and 50 ± 10.7 % F1-score.

Submitted 7 June, 2024; v1 submitted 25 April, 2022; originally announced April 2022.

Comments: Accepted for: Scientific Reports

arXiv:2102.06955

Improving Automated Visual Fault Detection by Combining a Biologically Plausible Model of Visual Attention with Deep Learning

Authors: Frederik Beuth, Tobias Schlosser, Michael Friedrich, Danny Kowerko

Abstract: It is a long-term goal to transfer biological processing principles as well as the power of human recognition into machine vision and engineering systems. One such principle is visual attention, a smart human concept which focuses processing on a part of a scene. In this contribution, we utilize attention to improve the automatic detection of defect patterns for wafers within the domain of semiconductor manufacturing. Previous works in the domain have often utilized classical machine learning approaches such as KNNs, SVMs, or MLPs, while a few have already used modern approaches like deep neural networks (DNNs). However, one problem in the domain is that the faults are often very small and have to be detected within a larger size of the chip or even the wafer. Therefore, small structures in the size of pixels have to be detected in a vast amount of image data. One interesting principle of the human brain for solving this problem is visual attention. Hence, we employ here a biologically plausible model of visual attention for automatic visual inspection. We propose a hybrid system of visual attention and a deep neural network. As demonstrated, our system achieves among other decisive advantages an improvement in accuracy from 81% to 92%, and an increase in accuracy for detecting faults from 67% to 88%. Hence, the error rates are reduced from 19% to 8%, and notably from 33% to 12% for detecting a fault in a chip. These results show that attention can greatly improve the performance of visual inspection systems.
Furthermore, we conduct a broad evaluation, identifying specific advantages of the biological attention model in this application, and benchmark standard deep learning approaches as an alternative with and without attention. This work is an extended arXiv version of the original conference article published in "IECON 2020", which has been extended regarding visual attention.

Submitted 13 February, 2021; originally announced February 2021.

Comments: This work is an extended arXiv version of the original conference article published in "IECON 2020": https://ieeexplore.ieee.org/abstract/document/9255234 . The work has been extended regarding visual attention
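
The error rates quoted in the abstract above follow from the accuracies as 100 minus the accuracy in percent. The short sketch below reproduces that arithmetic and additionally derives the relative error reduction, which is a computed figure and not one stated in the abstract. The function names and the dictionary labels are illustrative assumptions.

```python
def error_rate(accuracy_pct):
    """Error rate in percent, given an accuracy in percent."""
    return 100 - accuracy_pct


def relative_error_reduction(acc_before, acc_after):
    """Fraction of the original error removed when accuracy rises."""
    before = error_rate(acc_before)
    after = error_rate(acc_after)
    return (before - after) / before


# (accuracy without attention, accuracy with attention), as quoted in the abstract.
results = {"overall": (81, 92), "fault detection": (67, 88)}

for name, (acc_before, acc_after) in results.items():
    reduction = relative_error_reduction(acc_before, acc_after)
    print(
        f"{name}: error {error_rate(acc_before)}% -> {error_rate(acc_after)}%, "
        f"relative reduction {reduction:.1%}"
    )
```

The absolute error rates match those in the abstract (19% to 8% and 33% to 12%); the relative reductions are only a derived way of reading the same numbers.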

arXiv:2101.00337

doi 10.1109/ICIP40778.2020.9190995

Biologically Inspired Hexagonal Deep Learning for Hexagonal Image Generation

Authors: Tobias Schlosser, Frederik Beuth, Danny Kowerko

Abstract: Whereas conventional state-of-the-art image processing systems of recording and output devices almost exclusively utilize square arranged methods, biological models, however, suggest an alternative, evolutionarily-based structure. Inspired by the human visual perception system, hexagonal image processing in the context of machine learning offers a number of key advantages that can benefit both researchers and users alike. The hexagonal deep learning framework Hexnet leveraged in this contribution serves therefore the generation of hexagonal images by utilizing hexagonal deep neural networks (H-DNN). As the results of our created test environment show, the proposed models can surpass current approaches of conventional image generation. While resulting in a reduction of the models' complexity in the form of trainable parameters, they furthermore allow an increase of test rates in comparison to their square counterparts.

Submitted 7 June, 2024; v1 submitted 1 January, 2021; originally announced January 2021.

Comments: Accepted for: 2020 27th IEEE International Conference on Image Processing (ICIP). arXiv admin note: text overlap with arXiv:1911.11251

arXiv:1911.11251

doi 10.1109/ICMLA.2019.00300

Hexagonal Image Processing in the Context of Machine Learning: Conception of a Biologically Inspired Hexagonal Deep Learning Framework

Authors: Tobias Schlosser, Michael Friedrich, Danny Kowerko

Abstract: Inspired by the human visual perception system, hexagonal image processing in the context of machine learning deals with the development of image processing systems that combine the advantages of evolutionary motivated structures based on biological models. While conventional state-of-the-art image processing systems of recording and output devices almost exclusively utilize square arranged methods, their hexagonal counterparts offer a number of key advantages that can benefit both researchers and users. This contribution serves as a general application-oriented approach the synthesis of the therefore designed hexagonal image processing framework, called Hexnet, the processing steps of hexagonal image transformation, and dependent methods. The results of our created test environment show that the realized framework surpasses current approaches of hexagonal image processing systems, while hexagonal artificial neural networks can benefit from the implemented hexagonal architecture. As hexagonal lattice format based deep neural networks, also called H-DNN, can be compared to their square counterparts by transforming classical square lattice based data sets into their hexagonal representation, they can also result in a reduction of trainable parameters as well as result in increased training and test rates.

Submitted 7 June, 2024; v1 submitted 25 November, 2019; originally announced November 2019.

Comments: Accepted for: 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA)

arXiv:1911.11250

doi 10.1109/ETFA.2019.8869311

A Novel Visual Fault Detection and Classification System for Semiconductor Manufacturing Using Stacked Hybrid Convolutional Neural Networks

Authors: Tobias Schlosser, Frederik Beuth, Michael Friedrich, Danny Kowerko

Abstract: Automated visual inspection in the semiconductor industry aims to detect and classify manufacturing defects utilizing modern image processing techniques. While an earliest possible detection of defect patterns allows quality control and automation of manufacturing chains, manufacturers benefit from an increased yield and reduced manufacturing costs. Since classical image processing systems are limited in their ability to detect novel defect patterns, and machine learning approaches often involve a tremendous amount of computational effort, this contribution introduces a novel deep neural network based hybrid approach. Unlike classical deep neural networks, a multi-stage system allows the detection and classification of the finest structures in pixel size within high-resolution imagery. Consisting of stacked hybrid convolutional neural networks (SH-CNN) and inspired by current approaches of visual attention, the realized system draws the focus over the level of detail from its structures to more task-relevant areas of interest. The results of our test environment show that the SH-CNN outperforms current approaches of learning-based automated visual inspection, whereas a distinction depending on the level of detail enables the elimination of defect patterns in earlier stages of the manufacturing process.

Submitted 7 June, 2024; v1 submitted 25 November, 2019; originally announced November 2019.

Comments: Accepted for: 2019 IEEE 24th International Conference on Emerging Technologies and Factory Automation (ETFA)

arXiv:1804.07177

Recognizing Birds from Sound - The 2018 BirdCLEF Baseline System

Authors: Stefan Kahl, Thomas Wilhelm-Stein, Holger Klinck, Danny Kowerko, Maximilian Eibl

Abstract: Reliable identification of bird species in recorded audio files would be a transformative tool for researchers, conservation biologists, and birders. In recent years, artificial neural networks have greatly improved the detection quality of machine learning systems for bird species recognition. We present a baseline system using convolutional neural networks. We publish our code base as reference for participants in the 2018 LifeCLEF bird identification task and discuss our experiments and potential improvements.

Submitted 19 April, 2018; originally announced April 2018.

Comments: The repository and a continuative tutorial can be found here: https://github.com/kahst/BirdCLEF-Baseline

Showing 1–8 of 8 results for author: Kowerko, D