Showing 1–50 of 81 results for author: Rosenhahn, B

  1. arXiv:2405.19569  [pdf, other]

    cs.CV

    Improved Convex Decomposition with Ensembling and Boolean Primitives

    Authors: Vaibhav Vavilala, Florian Kluger, Seemandhar Jain, Bodo Rosenhahn, David Forsyth

    Abstract: Describing a scene in terms of primitives -- geometrically simple shapes that offer a parsimonious but accurate abstraction of structure -- is an established vision problem. This is a good model of a difficult fitting problem: different scenes require different numbers of primitives and primitives interact strongly, but any proposed solution can be evaluated at inference time. The state of the art…

    Submitted 9 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: 18 pages, 9 figures, 7 tables

  2. arXiv:2404.06832  [pdf, other]

    cs.CV cs.LG

    SplatPose & Detect: Pose-Agnostic 3D Anomaly Detection

    Authors: Mathis Kruse, Marco Rudolph, Dominik Woiwode, Bodo Rosenhahn

    Abstract: Detecting anomalies in images has become a well-explored problem in both academia and industry. State-of-the-art algorithms are able to detect defects in increasingly difficult settings and data modalities. However, most current methods are not suited to address 3D objects captured from differing poses. While solutions using Neural Radiance Fields (NeRFs) have been proposed, they suffer from exces…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Visual Anomaly and Novelty Detection 2.0 Workshop at CVPR 2024

  3. arXiv:2403.15011  [pdf, other]

    cs.CV

    Cell Tracking according to Biological Needs -- Strong Mitosis-aware Random-finite Sets Tracker with Aleatoric Uncertainty

    Authors: Timo Kaiser, Maximilian Schier, Bodo Rosenhahn

    Abstract: Cell tracking and segmentation assist biologists in extracting insights from large-scale microscopy time-lapse data. Driven by local accuracy metrics, current tracking approaches often suffer from a lack of long-term consistency. To address this issue, we introduce an uncertainty estimation technique for neural tracking-by-regression frameworks and incorporate it into our novel extended Poisson mu…

    Submitted 25 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 23 pages, 10 figures, 5 tables

  4. Personalized 3D Human Pose and Shape Refinement

    Authors: Tom Wehrbein, Bodo Rosenhahn, Iain Matthews, Carsten Stoll

    Abstract: Recently, regression-based methods have dominated the field of 3D human pose and shape estimation. Despite their promising results, a common issue is the misalignment between predictions and image observations, often caused by minor joint rotation errors that accumulate along the kinematic chain. To address this issue, we propose to construct dense correspondences between initial human model estim…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted to 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

    Journal ref: 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)

  5. arXiv:2403.10780  [pdf, other]

    cs.CV cs.AI

    Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation

    Authors: Mariia Khan, Yue Qiu, Yuren Cong, Jumana Abu-Khalaf, David Suter, Bodo Rosenhahn

    Abstract: Multi-class multi-instance segmentation is the task of identifying masks for multiple object classes and multiple instances of the same class within an image. The foundational Segment Anything Model (SAM) is designed for promptable multi-class multi-instance segmentation but tends to output part or sub-part masks in the "everything" mode for various real-world applications. Whole object segmentati…

    Submitted 15 March, 2024; originally announced March 2024.

  6. arXiv:2403.10452  [pdf, other]

    cs.CV

    Robust Shape Fitting for 3D Scene Abstraction

    Authors: Florian Kluger, Eric Brachmann, Michael Ying Yang, Bodo Rosenhahn

    Abstract: Humans perceive and construct the world as an arrangement of simple parametric models. In particular, we can often describe man-made environments using volumetric primitives such as cuboids or cylinders. Inferring these primitives is important for attaining high-level, abstract scene descriptions. Previous approaches for primitive-based abstraction estimate shape parameters directly and are only a…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted for publication in Transactions on Pattern Analysis and Machine Intelligence (PAMI). arXiv admin note: substantial text overlap with arXiv:2105.02047

  7. arXiv:2402.03136  [pdf, other]

    cs.AI cs.GT

    Mastering Zero-Shot Interactions in Cooperative and Competitive Simultaneous Games

    Authors: Yannik Mahlau, Frederik Schubert, Bodo Rosenhahn

    Abstract: The combination of self-play and planning has achieved great successes in sequential games, for instance in Chess and Go. However, adapting algorithms such as AlphaZero to simultaneous games poses a new challenge. In these games, missing information about concurrent actions of other agents is a limiting factor as they may select different Nash equilibria or do not play optimally at all. Thus, it i…

    Submitted 11 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  8. arXiv:2402.02866  [pdf, other]

    quant-ph cs.LG

    Quantum Normalizing Flows for Anomaly Detection

    Authors: Bodo Rosenhahn, Christoph Hirche

    Abstract: A Normalizing Flow computes a bijective mapping from an arbitrary distribution to a predefined (e.g. normal) distribution. Such a flow can be used to address different tasks, e.g. anomaly detection, once such a mapping has been learned. In this work we introduce Normalizing Flows for Quantum architectures, describe how to model and optimize such a flow and evaluate our method on example datasets.…

    Submitted 19 April, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: v2: 13 pages, 7 figures

  9. arXiv:2401.14919  [pdf, other]

    cs.CV

    PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus

    Authors: Florian Kluger, Bodo Rosenhahn

    Abstract: We present a real-time method for robust estimation of multiple instances of geometric models from noisy data. Geometric models such as vanishing points, planar homographies or fundamental matrices are essential for 3D scene analysis. Previous approaches discover distinct model instances in an iterative manner, thus limiting their potential for speedup via parallel computation. In contrast, our me…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: AAAI 2024

  10. arXiv:2312.13839  [pdf, other]

    cs.CV cs.LG

    Q-SENN: Quantized Self-Explaining Neural Networks

    Authors: Thomas Norrenbrock, Marco Rudolph, Bodo Rosenhahn

    Abstract: Explanations in Computer Vision are often desired, but most Deep Neural Networks can only provide saliency maps with questionable faithfulness. Self-Explaining Neural Networks (SENN) extract interpretable concepts with fidelity, diversity, and grounding to combine them linearly for decision-making. While they can explain what was recognized, initial realizations lack accuracy and general applicabi…

    Submitted 16 February, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024, SRRAI

  11. arXiv:2311.04765  [pdf, other]

    cs.RO cs.AI cs.LG

    The voraus-AD Dataset for Anomaly Detection in Robot Applications

    Authors: Jan Thieß Brockmann, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt

    Abstract: During the operation of industrial robots, unusual events may endanger the safety of humans and the quality of production. When collecting data to detect such cases, it is not ensured that data from all potentially occurring errors is included as unforeseeable events may happen over time. Therefore, anomaly detection (AD) delivers a practical solution, using only normal data to learn to detect unu…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 14 pages, 14 figures, accepted to Transactions on Robotics

  12. arXiv:2310.05922  [pdf, other]

    cs.CV

    FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing

    Authors: Yuren Cong, Mengmeng Xu, Christian Simon, Shoufa Chen, Jiawei Ren, Yanping Xie, Juan-Manuel Perez-Rua, Bodo Rosenhahn, Tao Xiang, Sen He

    Abstract: Text-to-video editing aims to edit the visual appearance of a source video conditional on textual prompts. A major challenge in this task is to ensure that all frames in the edited video are visually consistent. Most recent works apply advanced text-to-image diffusion models to this task by inflating 2D spatial attention in the U-Net into spatio-temporal attention. Although temporal context can be…

    Submitted 29 February, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted by ICLR2024. Project page: https://flatten-video-editing.github.io/

  13. arXiv:2308.07163  [pdf, other]

    cs.CV

    HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization

    Authors: Patrick Glandorf, Timo Kaiser, Bodo Rosenhahn

    Abstract: Sparse neural networks are a key factor in developing resource-efficient machine learning applications. We propose the novel and powerful sparse learning method Adaptive Regularized Training (ART) to compress dense into sparse networks. Instead of the commonly used binary mask during training to reduce the number of model weights, we inherently shrink weights close to zero in an iterative manner w…

    Submitted 16 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: ICCV'23 Workshops

  14. Markerless human pose estimation for biomedical applications: a survey

    Authors: Andrea Avogaro, Federico Cunico, Bodo Rosenhahn, Francesco Setti

    Abstract: Markerless Human Pose Estimation (HPE) proved its potential to support decision making and assessment in many fields of application. HPE is often preferred to traditional marker-based Motion Capture systems due to the ease of setup, portability, and affordable cost of the technology. However, the exploitation of HPE in biomedical applications is still under investigation. This review aims to provi…

    Submitted 1 August, 2023; originally announced August 2023.

    Journal ref: Frontiers in Computer Science 5, (2023): 1153160

  15. arXiv:2307.07353  [pdf, other]

    quant-ph

    Monte Carlo Graph Search for Quantum Circuit Optimization

    Authors: Bodo Rosenhahn, Tobias J. Osborne

    Abstract: The building blocks of quantum algorithms and software are quantum gates, with the appropriate combination of quantum gates leading to a desired quantum circuit. Deep expert knowledge is necessary to discover effective combinations of quantum gates to achieve a desired quantum algorithm for solving a specific task. This is especially challenging for quantum machine learning and signal processing.…

    Submitted 14 July, 2023; originally announced July 2023.

    Comments: 14 pages, 9 figures

  16. arXiv:2306.02954  [pdf, other]

    cs.CV

    Color-aware Deep Temporal Backdrop Duplex Matting System

    Authors: Hendrik Hachmann, Bodo Rosenhahn

    Abstract: Deep learning-based alpha matting showed tremendous improvements in recent years, yet, feature film production studios still rely on classical chroma keying including costly post-production steps. This perceived discrepancy can be explained by some missing links necessary for production which are currently not adequately addressed in the alpha matting community, in particular foreground color esti…

    Submitted 5 June, 2023; originally announced June 2023.

  17. arXiv:2306.02930  [pdf, other]

    cs.CV

    Human Spine Motion Capture using Perforated Kinesiology Tape

    Authors: Hendrik Hachmann, Bodo Rosenhahn

    Abstract: In this work, we present a marker-based multi-view spine tracking method that is specifically adjusted to the requirements for movements in sports. A maximal focus is on the accurate detection of markers and fast usage of the system. For this task, we take advantage of the prior knowledge of the arrangement of dots in perforated kinesiology tape. We detect the tape and its dots using a Mask R-CNN…

    Submitted 5 June, 2023; originally announced June 2023.

  18. arXiv:2304.13428  [pdf, other]

    cs.CV

    Compensation Learning in Semantic Segmentation

    Authors: Timo Kaiser, Christoph Reinders, Bodo Rosenhahn

    Abstract: Label noise and ambiguities between similar classes are challenging problems in developing new models and annotating new data for semantic segmentation. In this paper, we propose Compensation Learning in Semantic Segmentation, a framework to identify and compensate ambiguities as well as label noise. More specifically, we add a ground truth depending and globally learned bias to the classification…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: 8 pages, 6 figures, 4 tables, Vision Datasets Understanding Workshop on CVPR23

  19. arXiv:2304.00590  [pdf, other]

    cs.CV

    SPAN: Learning Similarity between Scene Graphs and Images with Transformers

    Authors: Yuren Cong, Wentong Liao, Bodo Rosenhahn, Michael Ying Yang

    Abstract: Learning similarity between scene graphs and images aims to estimate a similarity score given a scene graph and an image. There is currently no research dedicated to this task, although it is critical for scene graph generation and downstream applications. Scene graph generation is conventionally evaluated by Recall$@K$ and mean Recall$@K$, which measure the ratio of predicted triplets that appear…

    Submitted 20 May, 2024; v1 submitted 2 April, 2023; originally announced April 2023.

  20. arXiv:2303.13166  [pdf, other]

    cs.CV cs.LG

    Take 5: Interpretable Image Classification with a Handful of Features

    Authors: Thomas Norrenbrock, Marco Rudolph, Bodo Rosenhahn

    Abstract: Deep Neural Networks use thousands of mostly incomprehensible features to identify a single class, a decision no human can follow. We propose an interpretable sparse and low dimensional final decision layer in a deep neural network with measurable aspects of interpretability and demonstrate it on fine-grained image classification. We argue that a human can only understand the decision of a machine…

    Submitted 5 August, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    ACM Class: I.2.4

    Journal ref: Progress and Challenges in Building Trustworthy Embodied AI @NeurIPS, December 2022

  21. arXiv:2301.01413  [pdf, other]

    cs.CV

    Attribute-Centric Compositional Text-to-Image Generation

    Authors: Yuren Cong, Martin Renqiang Min, Li Erran Li, Bodo Rosenhahn, Michael Ying Yang

    Abstract: Despite the recent impressive breakthroughs in text-to-image generation, generative models have difficulty in capturing the data distribution of underrepresented attribute compositions while over-memorizing overrepresented attribute compositions, which raises public concerns about their robustness and fairness. To tackle this challenge, we propose ACTIG, an attribute-centric compositional text-to-…

    Submitted 3 January, 2023; originally announced January 2023.

  22. arXiv:2211.11607  [pdf]

    cs.CV cs.LG

    Semantic Segmentation for Fully Automated Macrofouling Analysis on Coatings after Field Exposure

    Authors: Lutz M. K. Krause, Emily Manderfeld, Patricia Gnutt, Louisa Vogler, Ann Wassick, Kailey Richard, Marco Rudolph, Kelli Z. Hunsucker, Geoffrey W. Swain, Bodo Rosenhahn, Axel Rosenhahn

    Abstract: Biofouling is a major challenge for sustainable shipping, filter membranes, heat exchangers, and medical devices. The development of fouling-resistant coatings requires the evaluation of their effectiveness. Such an evaluation is usually based on the assessment of fouling progression after different exposure times to the target medium (e.g., salt water). The manual assessment of macrofouling requi…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: 33 pages, 10 figures

  23. arXiv:2211.11355  [pdf, other]

    cs.CV

    Blind Knowledge Distillation for Robust Image Classification

    Authors: Timo Kaiser, Lukas Ehmann, Christoph Reinders, Bodo Rosenhahn

    Abstract: Optimizing neural networks with noisy labels is a challenging task, especially if the label set contains real-world noise. Networks tend to generalize to reasonable patterns in the early training stages and overfit to specific details of noisy samples in the latter ones. We introduce Blind Knowledge Distillation - a novel teacher-student approach for learning with noisy labels by masking the groun…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: 8 pages, 4 figures, 2 tables. Submitted to the 1st Learning and Mining with Noisy Labels Challenge on IJCAI22, see http://competition.noisylabels.com/. Code is available https://github.com/TimoK93/blind_knowledge_distillation

  24. arXiv:2211.06119  [pdf, other]

    cs.CV

    SSGVS: Semantic Scene Graph-to-Video Synthesis

    Authors: Yuren Cong, Jinhui Yi, Bodo Rosenhahn, Michael Ying Yang

    Abstract: As a natural extension of the image synthesis task, video synthesis has attracted a lot of interest recently. Many image synthesis works utilize class labels or text as guidance. However, neither labels nor text can provide explicit temporal guidance, such as when an action starts or ends. To overcome this limitation, we introduce semantic video scene graphs as input for video synthesis, as they r…

    Submitted 17 November, 2022; v1 submitted 11 November, 2022; originally announced November 2022.

  25. arXiv:2210.07829  [pdf, other]

    cs.LG cs.AI cs.CV

    Asymmetric Student-Teacher Networks for Industrial Anomaly Detection

    Authors: Marco Rudolph, Tom Wehrbein, Bodo Rosenhahn, Bastian Wandt

    Abstract: Industrial defect detection is commonly addressed with anomaly detection (AD) methods where no or only incomplete data of potentially occurring defects is available. This work discovers previously unknown problems of student-teacher approaches for AD and proposes a solution, where two neural networks are trained to produce the same output for the defect-free training examples. The core assumption…

    Submitted 18 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: accepted to WACV 2023

  26. arXiv:2209.08309  [pdf, other]

    cs.LG cs.AI

    AdaCC: Cumulative Cost-Sensitive Boosting for Imbalanced Classification

    Authors: Vasileios Iosifidis, Symeon Papadopoulos, Bodo Rosenhahn, Eirini Ntoutsi

    Abstract: Class imbalance poses a major challenge for machine learning as most supervised learning models might exhibit bias towards the majority class and under-perform in the minority class. Cost-sensitive learning tackles this problem by treating the classes differently, formulated typically via a user-defined fixed misclassification cost matrix provided as input to the learner. Such parameter tuning is…

    Submitted 17 September, 2022; originally announced September 2022.

    Comments: 30 pages

  27. arXiv:2205.11357  [pdf, other]

    cs.LG cs.RO

    POLTER: Policy Trajectory Ensemble Regularization for Unsupervised Reinforcement Learning

    Authors: Frederik Schubert, Carolin Benjamins, Sebastian Döhler, Bodo Rosenhahn, Marius Lindauer

    Abstract: The goal of Unsupervised Reinforcement Learning (URL) is to find a reward-agnostic prior policy on a task domain, such that the sample-efficiency on supervised downstream tasks is improved. Although agents initialized with such a prior policy can achieve a significantly higher reward with fewer samples when finetuned on the downstream task, it is still an open question how an optimal pretrained pr…

    Submitted 15 December, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

    Journal ref: Transactions on Machine Learning Research, 2023

  28. arXiv:2202.11616  [pdf, other]

    cs.CV

    ChimeraMix: Image Classification on Small Datasets via Masked Feature Mixing

    Authors: Christoph Reinders, Frederik Schubert, Bodo Rosenhahn

    Abstract: Deep convolutional neural networks require large amounts of labeled data samples. For many real-world applications, this is a major limitation which is commonly treated by augmentation methods. In this work, we address the problem of learning deep neural networks on small datasets. Our proposed architecture called ChimeraMix learns a data augmentation by generating compositions of instances. The g…

    Submitted 29 July, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

    Comments: Published at IJCAI-22

  29. arXiv:2202.04500  [pdf, other]

    cs.LG

    Contextualize Me -- The Case for Context in Reinforcement Learning

    Authors: Carolin Benjamins, Theresa Eimer, Frederik Schubert, Aditya Mohan, Sebastian Döhler, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, Marius Lindauer

    Abstract: While Reinforcement Learning (RL) has made great strides towards solving increasingly complicated problems, many algorithms are still brittle to even slight environmental changes. Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner, thereby enabling flexible, precise and interpretable task specification and generation. Our goal is to show how…

    Submitted 2 June, 2023; v1 submitted 9 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2110.02102

  30. arXiv:2201.11460  [pdf, other]

    cs.CV

    RelTR: Relation Transformer for Scene Graph Generation

    Authors: Yuren Cong, Michael Ying Yang, Bodo Rosenhahn

    Abstract: Different objects in the same scene are more or less related to each other, but only a limited number of these relationships are noteworthy. Inspired by DETR, which excels in object detection, we view scene graph generation as a set prediction problem and propose an end-to-end scene graph generation model RelTR which has an encoder-decoder architecture. The encoder reasons about the visual feature…

    Submitted 14 April, 2023; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence

  31. arXiv:2111.11892  [pdf, other]

    cs.CV

    LMGP: Lifted Multicut Meets Geometry Projections for Multi-Camera Multi-Object Tracking

    Authors: Duy M. H. Nguyen, Roberto Henschel, Bodo Rosenhahn, Daniel Sonntag, Paul Swoboda

    Abstract: Multi-Camera Multi-Object Tracking is currently drawing attention in the computer vision field due to its superior performance in real-world applications such as video surveillance in crowded scenes or in wide spaces. In this work, we propose a mathematically elegant multi-camera multiple object tracking approach based on a spatial-temporal lifted multicut formulation. Our model utilizes state-of-…

    Submitted 3 May, 2022; v1 submitted 23 November, 2021; originally announced November 2021.

    Comments: Official version for CVPR 2022

  32. arXiv:2110.02855  [pdf, other]

    cs.CV

    Fully Convolutional Cross-Scale-Flows for Image-based Defect Detection

    Authors: Marco Rudolph, Tom Wehrbein, Bodo Rosenhahn, Bastian Wandt

    Abstract: In industrial manufacturing processes, errors frequently occur at unpredictable times and in unknown manifestations. We tackle the problem of automatic defect detection without requiring any image samples of defective parts. Recent works model the distribution of defect-free image data, using either strong statistical priors or overly simplified data representations. In contrast, our approach hand…

    Submitted 6 October, 2021; originally announced October 2021.

  33. arXiv:2110.02102  [pdf, other]

    cs.LG

    CARL: A Benchmark for Contextual and Adaptive Reinforcement Learning

    Authors: Carolin Benjamins, Theresa Eimer, Frederik Schubert, André Biedenkapp, Bodo Rosenhahn, Frank Hutter, Marius Lindauer

    Abstract: While Reinforcement Learning has made great strides towards solving ever more complicated tasks, many algorithms are still brittle to even slight changes in their environment. This is a limiting factor for real-world applications of RL. Although the research community continuously aims at improving both robustness and generalization of RL algorithms, unfortunately it still lacks an open-source set…

    Submitted 11 October, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

    Journal ref: Workshop on Ecological Theory of Reinforcement Learning, NeurIPS 2021

  34. arXiv:2108.10606  [pdf, other]

    cs.CV cs.DM

    Making Higher Order MOT Scalable: An Efficient Approximate Solver for Lifted Disjoint Paths

    Authors: Andrea Hornakova, Timo Kaiser, Paul Swoboda, Michal Rolinek, Bodo Rosenhahn, Roberto Henschel

    Abstract: We present an efficient approximate message passing solver for the lifted disjoint paths problem (LDP), a natural but NP-hard model for multiple object tracking (MOT). Our tracker scales to very large instances that come from long and crowded MOT sequences. Our approximate solver enables us to process the MOT15/16/17 benchmarks without sacrificing solution quality and allows for solving MOT20, whi…

    Submitted 24 August, 2021; originally announced August 2021.

    Comments: ICCV 2021. Short version published at CVPR 2021 RVSU workshop https://omnomnom.vision.rwth-aachen.de/data/RobMOTS/workshop/papers/9/CameraReady/paper_V3.pdf . Implementation available at https://github.com/LPMP/LPMP and https://github.com/TimoK93/ApLift

  35. arXiv:2108.02874  [pdf, other]

    cs.CV

    Disentangled Lifespan Face Synthesis

    Authors: Sen He, Wentong Liao, Michael Ying Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

    Abstract: A lifespan face synthesis (LFS) model aims to generate a set of photo-realistic face images of a person's whole life, given only one snapshot as reference. The generated face image given a target age code is expected to be age-sensitive reflected by bio-plausible transformations of shape and texture, while being identity preserving. This is extremely challenging because the shape and texture chara…

    Submitted 13 August, 2021; v1 submitted 5 August, 2021; originally announced August 2021.

    Comments: To appear in ICCV 2021

  36. arXiv:2107.13788  [pdf, other]

    cs.CV

    Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows

    Authors: Tom Wehrbein, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt

    Abstract: 3D human pose estimation from monocular images is a highly ill-posed problem due to depth ambiguities and occlusions. Nonetheless, most existing works ignore these ambiguities and only estimate a single solution. In contrast, we generate a diverse set of hypotheses that represents the full posterior distribution of feasible 3D poses. To this end, we propose a normalizing flow based method that exp…

    Submitted 2 August, 2021; v1 submitted 29 July, 2021; originally announced July 2021.

    Comments: Accepted to ICCV 2021

  37. arXiv:2107.12309  [pdf, other]

    cs.CV

    Spatial-Temporal Transformer for Dynamic Scene Graph Generation

    Authors: Yuren Cong, Wentong Liao, Hanno Ackermann, Bodo Rosenhahn, Michael Ying Yang

    Abstract: Dynamic scene graph generation aims at generating a scene graph of the given video. Compared to the task of scene graph generation from images, it is more challenging because of the dynamic relationships between objects and the temporal dependencies between frames allowing for a richer semantic interpretation. In this paper, we propose Spatial-temporal Transformer (STTran), a neural network that c…

    Submitted 8 August, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: accepted by ICCV 2021

  38. arXiv:2106.10155  [pdf, other]

    cs.LG cs.CV cs.NE

    World-GAN: a Generative Model for Minecraft Worlds

    Authors: Maren Awiszus, Frederik Schubert, Bodo Rosenhahn

    Abstract: This work introduces World-GAN, the first method to perform data-driven Procedural Content Generation via Machine Learning in Minecraft from a single example. Based on a 3D Generative Adversarial Network (GAN) architecture, we are able to create arbitrarily sized world snippets from a given sample. We evaluate our approach on creations from the community as well as structures generated with the Mi…

    Submitted 18 June, 2021; originally announced June 2021.

    Comments: 8 pages, 8 figures, IEEE Conference on Games (CoG) 2021

  39. arXiv:2106.06317  [pdf, other]

    cs.LG

    Automatic Risk Adaptation in Distributional Reinforcement Learning

    Authors: Frederik Schubert, Theresa Eimer, Bodo Rosenhahn, Marius Lindauer

    Abstract: The use of Reinforcement Learning (RL) agents in practical applications requires the consideration of suboptimal outcomes, depending on the familiarity of the agent with its environment. This is especially important in safety-critical environments, where errors can lead to high costs or damage. In distributional RL, the risk-sensitivity can be controlled via different distortion measures of the es…

    Submitted 11 June, 2021; originally announced June 2021.

    Journal ref: Reinforcement Learning for Real Life Workshop, ICML 2021

  40. arXiv:2105.02047  [pdf, other]

    cs.CV

    Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images

    Authors: Florian Kluger, Hanno Ackermann, Eric Brachmann, Michael Ying Yang, Bodo Rosenhahn

    Abstract: Humans perceive and construct the surrounding world as an arrangement of simple parametric models. In particular, man-made environments commonly consist of volumetric primitives such as cuboids or cylinders. Inferring these primitives is an important step to attain high-level, abstract scene descriptions. Previous approaches directly estimate shape parameters from a 2D or 3D input, and are only ab…

    Submitted 5 May, 2021; originally announced May 2021.

    Comments: CVPR 2021

  41. arXiv:2104.00567  [pdf, other]

    cs.CV cs.LG

    Text to Image Generation with Semantic-Spatial Aware GAN

    Authors: Kai Hu, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn

    Abstract: Text-to-image synthesis (T2I) aims to generate photo-realistic images which are semantically consistent with the text descriptions. Existing methods are usually built upon conditional generative adversarial networks (GANs) and initialize an image from noise with sentence embedding, and then refine the features with fine-grained word embedding iteratively. A close inspection of their generated imag…

    Submitted 24 March, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: arXiv admin note: text overlap with arXiv:1711.10485 by other authors

  42. arXiv:2103.11897  [pdf, other]

    cs.CV

    Context-Aware Layout to Image Generation with Enhanced Object Appearance

    Authors: Sen He, Wentong Liao, Michael Ying Yang, Yongxin Yang, Yi-Zhe Song, Bodo Rosenhahn, Tao Xiang

    Abstract: A layout to image (L2I) generation model aims to generate a complicated image containing multiple objects (things) against natural background (stuff), conditioned on a given layout. Built upon the recent advances in generative adversarial networks (GANs), existing L2I models have made great progress. However, a close inspection of their generated images reveals two major limitations: (1) the objec…

    Submitted 22 March, 2021; originally announced March 2021.

    Comments: CVPR 2021

  43. arXiv:2103.10434  [pdf, other]

    cs.CV eess.IV q-bio.QM

    Localization of Cochlear Implant Electrodes from Cone Beam Computed Tomography using Particle Belief Propagation

    Authors: Hendrik Hachmann, Benjamin Krüger, Bodo Rosenhahn, Waldo Nogueira

    Abstract: Cochlear implants (CIs) are implantable medical devices that can restore the hearing sense of people suffering from profound hearing loss. The CI uses a set of electrode contacts placed inside the cochlea to stimulate the auditory nerve with current pulses. The exact location of these electrodes may be an important parameter to improve and predict the performance with these devices. Currently the…

    Submitted 18 March, 2021; originally announced March 2021.

    ACM Class: I.4.0

  44. arXiv:2011.14679  [pdf, other]

    cs.CV

    CanonPose: Self-Supervised Monocular 3D Human Pose Estimation in the Wild

    Authors: Bastian Wandt, Marco Rudolph, Petrissa Zell, Helge Rhodin, Bodo Rosenhahn

    Abstract: Human pose estimation from single images is a challenging problem in computer vision that requires large amounts of labeled training data to be solved accurately. Unfortunately, for many human activities (e.g. outdoor sports) such training data does not exist and is hard or even impossible to acquire with traditional motion capture systems. We propose a self-supervised approach that learns a single…

    Submitted 30 November, 2020; originally announced November 2020.

  45. arXiv:2010.16267  [pdf, other]

    cs.CV cs.MA

    Exploring Dynamic Context for Multi-path Trajectory Prediction

    Authors: Hao Cheng, Wentong Liao, Xuejiao Tang, Michael Ying Yang, Monika Sester, Bodo Rosenhahn

    Abstract: Accurately predicting the future positions of different agents in traffic scenarios is crucial for safely deploying intelligent autonomous systems in real-world environments. However, it remains a challenge because the behavior of a target agent is dynamically affected by other agents and there is more than one socially possible path the agent could take. In this paper, we propose a novel fra…

    Submitted 24 March, 2021; v1 submitted 30 October, 2020; originally announced October 2020.

    Comments: accepted by ICRA 2021, code available

  46. arXiv:2008.12577  [pdf, other]

    cs.CV cs.LG eess.IV

    Same Same But DifferNet: Semi-Supervised Defect Detection with Normalizing Flows

    Authors: Marco Rudolph, Bastian Wandt, Bodo Rosenhahn

    Abstract: The detection of manufacturing errors is crucial in fabrication processes to ensure product quality and safety standards. Since many defects occur very rarely and their characteristics are mostly unknown a priori, their detection is still an open research question. To this end, we propose DifferNet: It leverages the descriptiveness of features extracted by convolutional neural networks to estimate…

    Submitted 28 August, 2020; originally announced August 2020.

  47. arXiv:2008.01531  [pdf, other]

    cs.LG cs.NE stat.ML

    TOAD-GAN: Coherent Style Level Generation from a Single Example

    Authors: Maren Awiszus, Frederik Schubert, Bodo Rosenhahn

    Abstract: In this work, we present TOAD-GAN (Token-based One-shot Arbitrary Dimension Generative Adversarial Network), a novel Procedural Content Generation (PCG) algorithm that generates token-based video game levels. TOAD-GAN follows the SinGAN architecture and can be trained using only one example. We demonstrate its application for Super Mario Bros. levels and are able to generate new levels of similar…

    Submitted 4 August, 2020; originally announced August 2020.

    Comments: 7 pages, 7 figures. AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) 2020

  48. arXiv:2007.08969  [pdf, other]

    cs.CV

    Weakly-supervised Learning of Human Dynamics

    Authors: Petrissa Zell, Bodo Rosenhahn, Bastian Wandt

    Abstract: This paper proposes a weakly-supervised learning framework for dynamics estimation from human motion. Although there are many solutions to capture pure human motion readily available, their data is not sufficient to analyze quality and efficiency of movements. Instead, the forces and moments driving human motion (the dynamics) need to be considered. Since recording dynamics is a laborious task tha…

    Submitted 23 April, 2021; v1 submitted 17 July, 2020; originally announced July 2020.

  49. arXiv:2006.14550  [pdf, other]

    cs.CV cs.DM

    Lifted Disjoint Paths with Application in Multiple Object Tracking

    Authors: Andrea Hornakova, Roberto Henschel, Bodo Rosenhahn, Paul Swoboda

    Abstract: We present an extension to the disjoint paths problem in which additional lifted edges are introduced to provide path connectivity priors. We call the resulting optimization problem the lifted disjoint paths problem. We show that this problem is NP-hard by reduction from integer multicommodity flow and 3-SAT. To enable practical global optimization, we propose several classes of linear ineq…

    Submitted 25 June, 2020; originally announced June 2020.

    Comments: ICML 2020, Codebase available at https://github.com/AndreaHor/LifT_Solver

  50. arXiv:2006.08264  [pdf, other]

    cs.CV

    AMENet: Attentive Maps Encoder Network for Trajectory Prediction

    Authors: Hao Cheng, Wentong Liao, Michael Ying Yang, Bodo Rosenhahn, Monika Sester

    Abstract: Trajectory prediction is critical for applications of planning safe future movements and remains challenging even for the next few seconds in urban mixed traffic. How an agent moves is affected by the various behaviors of its neighboring agents in different environments. To predict movements, we propose an end-to-end generative model named Attentive Maps Encoder Network (AMENet) that encodes the a…

    Submitted 13 January, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: Accepted by ISPRS Journal of Photogrammetry and Remote Sensing