Showing 1–28 of 28 results for author: Possegger, H

Searching in archive cs.
  1. arXiv:2404.10534

    cs.CV cs.AI

    Into the Fog: Evaluating Multiple Object Tracking Robustness

    Authors: Nadezda Kirillova, M. Jehanzeb Mirza, Horst Possegger, Horst Bischof

    Abstract: State-of-the-art (SOTA) trackers have shown remarkable Multiple Object Tracking (MOT) performance when trained and evaluated on current benchmarks. However, these benchmarks primarily consist of clear scenarios, overlooking adverse atmospheric conditions such as fog, haze, smoke and dust. As a result, the robustness of SOTA trackers remains underexplored. To address these limitations, we propose a…

    Submitted 12 April, 2024; originally announced April 2024.

  2. arXiv:2403.14497

    cs.CV

    MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection

    Authors: Jakub Micorek, Horst Possegger, Dominik Narnhofer, Horst Bischof, Mateusz Kozinski

    Abstract: We propose a novel approach to video anomaly detection: we treat feature vectors extracted from videos as realizations of a random variable with a fixed distribution and model this distribution with a neural network. This lets us estimate the likelihood of test videos and detect video anomalies by thresholding the likelihood estimates. We train our video anomaly detector using a modification of de…

    Submitted 21 March, 2024; originally announced March 2024.

  3. arXiv:2403.11755

    cs.CV cs.AI cs.LG

    Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

    Authors: M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Sivan Doveh, Jakub Micorek, Mateusz Kozinski, Hilde Kuehne, Horst Possegger

    Abstract: Prompt ensembling of Large Language Model (LLM) generated category-specific prompts has emerged as an effective method to enhance zero-shot recognition ability of Vision-Language Models (VLMs). To obtain these category-specific prompts, the present methods rely on hand-crafting the prompts to the LLMs for generating VLM prompts for the downstream tasks. However, this requires manually composing th…

    Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Project Page (Code and Data): https://jmiemirza.github.io/Meta-Prompting/

  4. arXiv:2402.02533

    cs.RO

    Identifying and Extracting Pedestrian Behavior in Critical Traffic Situations

    Authors: Martin Schachner, Bernd Schneider, Fabian Weissenbacher, Nadezda Kirillova, Horst Possegger, Horst Bischof, Corina Klug

    Abstract: A better understanding of interactive pedestrian behavior in critical traffic situations is essential for the development of enhanced pedestrian safety systems. Real-world traffic observations play a decisive role in this, since they represent behavior in an unbiased way. In this work, we present an approach of how a subset of very considerable pedestrian-vehicle interactions can be derived from a…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 7 pages, 8 figures, ITSC 2023 accepted

  5. arXiv:2401.08863

    cs.LG cs.AI cs.CR

    Robust Localization of Key Fob Using Channel Impulse Response of Ultra Wide Band Sensors for Keyless Entry Systems

    Authors: Abhiram Kolli, Filippo Casamassima, Horst Possegger, Horst Bischof

    Abstract: Using neural networks for localization of key fob within and surrounding a car as a security feature for keyless entry is fast emerging. In this paper we study: 1) the performance of pre-computed features of neural networks based UWB (ultra wide band) localization classification forming the baseline of our experiments. 2) Investigate the inherent robustness of various neural networks; therefore, w…

    Submitted 16 January, 2024; originally announced January 2024.

  6. arXiv:2310.20319

    cs.CV cs.LG

    GACE: Geometry Aware Confidence Enhancement for Black-Box 3D Object Detectors on LiDAR-Data

    Authors: David Schinagl, Georg Krispel, Christian Fruhwirth-Reisinger, Horst Possegger, Horst Bischof

    Abstract: Widely-used LiDAR-based 3D object detectors often neglect fundamental geometric information readily available from the object proposals in their confidence estimation. This is mostly due to architectural design choices, which were often adopted from the 2D image domain, where geometric context is rarely available. In 3D, however, considering the object properties and its surroundings in a holistic…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: ICCV 2023, code is available at https://github.com/dschinagl/gace

  7. arXiv:2309.06809

    cs.CV

    TAP: Targeted Prompting for Task Adaptive Generation of Textual Training Instances for Visual Classification

    Authors: M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Horst Possegger, Rogerio Feris, Horst Bischof

    Abstract: Vision and Language Models (VLMs), such as CLIP, have enabled visual recognition of a potentially unlimited set of categories described by text prompts. However, for the best visual recognition performance, these models still require tuning to better fit the data distributions of the downstream tasks, in order to overcome the domain shift from the web-based pre-training data. Recently, it has been…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: Code is available at: https://github.com/jmiemirza/TAP

  8. arXiv:2305.18953

    cs.CV

    Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions

    Authors: Stefan Leitner, M. Jehanzeb Mirza, Wei Lin, Jakub Micorek, Marc Masana, Mateusz Kozinski, Horst Possegger, Horst Bischof

    Abstract: In autonomous driving scenarios, current object detection models show strong performance when tested in clear weather. However, their performance deteriorates significantly when tested in degrading weather conditions. In addition, even when adapted to perform robustly in a sequence of different weather conditions, they are often unable to perform well in all of them and suffer from catastrophic fo…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Intelligent Vehicle Conference (oral presentation)

  9. arXiv:2305.18287

    cs.CV cs.CL

    LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

    Authors: M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Mateusz Kozinski, Horst Possegger, Rogerio Feris, Horst Bischof

    Abstract: Recently, large-scale pre-trained Vision and Language (VL) models have set a new state-of-the-art (SOTA) in zero-shot visual classification enabling open-vocabulary recognition of a potentially unlimited set of categories defined as simple language prompts. However, despite these great advances, the performance of these zero-shot classifiers still falls short of the results of dedicated (closed categ…

    Submitted 23 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 (Camera Ready) - Project Page: https://jmiemirza.github.io/LaFTer/

  10. arXiv:2303.08914

    cs.CV

    MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge

    Authors: Wei Lin, Leonid Karlinsky, Nina Shvetsova, Horst Possegger, Mateusz Kozinski, Rameswar Panda, Rogerio Feris, Hilde Kuehne, Horst Bischof

    Abstract: Large scale Vision-Language (VL) models have shown tremendous success in aligning representations between visual and text modalities. This enables remarkable progress in zero-shot recognition, image generation & editing, and many other exciting tasks. However, VL models tend to over-represent objects while paying much less attention to verbs, and require additional tuning on video data for best ze…

    Submitted 22 July, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: Accepted at ICCV 2023

  11. arXiv:2303.05166

    cs.CV

    TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering

    Authors: Wei Lin, Anna Kukleva, Horst Possegger, Hilde Kuehne, Horst Bischof

    Abstract: Temporal action segmentation in untrimmed videos has gained increased attention recently. However, annotating action classes and frame-wise boundaries is extremely time consuming and cost intensive, especially on large-scale datasets. To address this issue, we propose an unsupervised approach for learning action classes from untrimmed video sequences. In particular, we propose a temporal embedding…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: Computer Vision Winter Workshop 2023

  12. arXiv:2212.07207

    cs.CV

    MAELi: Masked Autoencoder for Large-Scale LiDAR Point Clouds

    Authors: Georg Krispel, David Schinagl, Christian Fruhwirth-Reisinger, Horst Possegger, Horst Bischof

    Abstract: The sensing process of large-scale LiDAR point clouds inevitably causes large blind spots, i.e. regions not visible to the sensor. We demonstrate how these inherent sampling properties can be effectively utilized for self-supervised representation learning by designing a highly effective pre-training framework that considerably reduces the need for tedious 3D annotations to train state-of-the-art…

    Submitted 7 December, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: Accepted to WACV 2024, 16 pages

  13. arXiv:2212.02992

    cs.CV

    Sparse Message Passing Network with Feature Integration for Online Multiple Object Tracking

    Authors: Bisheng Wang, Horst Possegger, Horst Bischof, Guo Cao

    Abstract: Existing Multiple Object Tracking (MOT) methods design complex architectures for better tracking performance. However, without a proper organization of input information, they still fail to perform tracking robustly and suffer from frequent identity switches. In this paper, we propose two novel methods together with a simple online Message Passing Network (MPN) to address these limitations. First,…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: 8 pages, 2 figures

  14. arXiv:2211.15393

    cs.CV

    Video Test-Time Adaptation for Action Recognition

    Authors: Wei Lin, Muhammad Jehanzeb Mirza, Mateusz Kozinski, Horst Possegger, Hilde Kuehne, Horst Bischof

    Abstract: Although action recognition systems can achieve top performance when evaluated on in-distribution test points, they are vulnerable to unanticipated distribution shifts in test data. However, test-time adaptation of video action recognition models against common distribution shifts has so far not been demonstrated. We propose to address this problem with an approach tailored to spatio-temporal mode…

    Submitted 20 March, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: Accepted at CVPR 2023

  15. arXiv:2211.12870

    cs.CV

    ActMAD: Activation Matching to Align Distributions for Test-Time-Training

    Authors: Muhammad Jehanzeb Mirza, Pol Jané Soneira, Wei Lin, Mateusz Kozinski, Horst Possegger, Horst Bischof

    Abstract: Test-Time-Training (TTT) is an approach to cope with out-of-distribution (OOD) data by adapting a trained model to distribution shifts occurring at test-time. We propose to perform this adaptation via Activation Matching (ActMAD): We analyze activations of the model and align activation statistics of the OOD test data to those of the training data. In contrast to existing methods, which model the…

    Submitted 23 March, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: CVPR 2023 - Project Page: https://jmiemirza.github.io/ActMAD/

  16. arXiv:2211.11432

    cs.CV

    MATE: Masked Autoencoders are Online 3D Test-Time Learners

    Authors: M. Jehanzeb Mirza, Inkyu Shin, Wei Lin, Andreas Schriebl, Kunyang Sun, Jaesung Choe, Horst Possegger, Mateusz Kozinski, In So Kweon, Kun-Jin Yoon, Horst Bischof

    Abstract: Our MATE is the first Test-Time-Training (TTT) method designed for 3D data, which makes deep networks trained for point cloud classification robust to distribution shifts occurring in test data. Like existing TTT methods from the 2D image domain, MATE also leverages test data for adaptation. Its test-time objective is that of a Masked Autoencoder: a large portion of each test point cloud is remove…

    Submitted 20 March, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

    Comments: Code is available at this repository: https://github.com/jmiemirza/MATE

  17. arXiv:2211.05854

    cs.LG cs.AI

    Test-time adversarial detection and robustness for localizing humans using ultra wide band channel impulse responses

    Authors: Abhiram Kolli, Muhammad Jehanzeb Mirza, Horst Possegger, Horst Bischof

    Abstract: Keyless entry systems in cars are adopting neural networks for localizing their operators. Using test-time adversarial defences equips such systems with the ability to defend against adversarial attacks without prior training on adversarial samples. We propose a test-time adversarial example detector which detects the input adversarial example through quantifying the localized intermediate responses…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: 5 pages, 4 figures, ICASSP Conference

  18. arXiv:2210.07811

    cs.CV

    SAILOR: Scaling Anchors via Insights into Latent Object Representation

    Authors: Dušan Malić, Christian Fruhwirth-Reisinger, Horst Possegger, Horst Bischof

    Abstract: LiDAR 3D object detection models are inevitably biased towards their training dataset. The detector clearly exhibits this bias when employed on a target dataset, particularly towards object sizes. However, object sizes vary heavily between domains due to, for instance, different labeling policies or geographical locations. State-of-the-art unsupervised domain adaptation approaches outsource method…

    Submitted 17 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: WACV 2023; code is available at https://github.com/malicd/sailor

  19. arXiv:2204.08817

    cs.CV

    An Efficient Domain-Incremental Learning Approach to Drive in All Weather Conditions

    Authors: M. Jehanzeb Mirza, Marc Masana, Horst Possegger, Horst Bischof

    Abstract: Although deep neural networks enable impressive visual perception performance for autonomous driving, their robustness to varying weather conditions still requires attention. When adapting these models for changed environments, such as different weather conditions, they are prone to forgetting previously learned information. This catastrophic forgetting is typically addressed via incremental learn…

    Submitted 21 April, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

    Comments: Accepted to CVPR Workshops - Camera Ready Version

  20. arXiv:2204.06577

    cs.CV

    OccAM's Laser: Occlusion-based Attribution Maps for 3D Object Detectors on LiDAR Data

    Authors: David Schinagl, Georg Krispel, Horst Possegger, Peter M. Roth, Horst Bischof

    Abstract: While 3D object detection in LiDAR point clouds is well-established in academia and industry, the explainability of these models is a largely unexplored field. In this paper, we propose a method to generate attribution maps for the detected objects in order to better understand the behavior of such models. These maps indicate the importance of each 3D point in predicting the specific objects. Our…

    Submitted 13 April, 2022; originally announced April 2022.

    Comments: CVPR 2022, code is available at https://github.com/dschinagl/occam

  21. arXiv:2203.16244

    cs.CV

    CycDA: Unsupervised Cycle Domain Adaptation from Image to Video

    Authors: Wei Lin, Anna Kukleva, Kunyang Sun, Horst Possegger, Hilde Kuehne, Horst Bischof

    Abstract: Although action recognition has achieved impressive results over recent years, both collection and annotation of video training data are still time-consuming and cost intensive. Therefore, image-to-video adaptation has been proposed to exploit labeling-free web image source for adapting on unlabeled target videos. This poses two major challenges: (1) spatial domain shift between web images and vid…

    Submitted 22 March, 2023; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: Accepted at ECCV 2022. Supplementary included

  22. arXiv:2203.10554

    cs.CV

    3D Human Pose Estimation Using Möbius Graph Convolutional Networks

    Authors: Niloofar Azizi, Horst Possegger, Emanuele Rodolà, Horst Bischof

    Abstract: 3D human pose estimation is fundamental to understanding human behavior. Recently, promising results have been achieved by graph convolutional networks (GCNs), which achieve state-of-the-art performance and provide rather light-weight architectures. However, a major limitation of GCNs is their inability to encode all the transformations between joints explicitly. To address this issue, we propose…

    Submitted 20 March, 2022; originally announced March 2022.

  23. arXiv:2112.00463

    cs.CV

    The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization

    Authors: M. Jehanzeb Mirza, Jakub Micorek, Horst Possegger, Horst Bischof

    Abstract: Domain adaptation is crucial to adapt a learned model to new scenarios, such as domain shifts or changing data distributions. Current approaches usually require a large amount of labeled or unlabeled data from the shifted domain. This can be a hurdle in fields which require continuous dynamic adaptation or suffer from scarcity of data, e.g. autonomous driving in challenging weather conditions. To…

    Submitted 4 April, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: Accepted to CVPR 2022 - Camera Ready Version - Code: https://github.com/jmiemirza/DUA

  24. arXiv:2110.09355

    cs.CV cs.RO

    FAST3D: Flow-Aware Self-Training for 3D Object Detectors

    Authors: Christian Fruhwirth-Reisinger, Michael Opitz, Horst Possegger, Horst Bischof

    Abstract: In the field of autonomous driving, self-training is widely applied to mitigate distribution shifts in LiDAR-based 3D object detectors. This eliminates the need for expensive, high-quality labels whenever the environment changes (e.g., geographic location, sensor setup, weather condition). State-of-the-art self-training approaches, however, mostly ignore the temporal nature of autonomous driving d…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: Accepted to BMVC 2021

  25. arXiv:1912.08487

    cs.CV cs.RO

    FuseSeg: LiDAR Point Cloud Segmentation Fusing Multi-Modal Data

    Authors: Georg Krispel, Michael Opitz, Georg Waltner, Horst Possegger, Horst Bischof

    Abstract: We introduce a simple yet effective fusion method of LiDAR and RGB data to segment LiDAR point clouds. Utilizing the dense native range representation of a LiDAR sensor and the setup calibration, we establish point correspondences between the two input modalities. Subsequently, we are able to warp and fuse the features from one domain into the other. Therefore, we can jointly exploit information f…

    Submitted 19 December, 2019; v1 submitted 18 December, 2019; originally announced December 2019.

    Comments: Accepted for publication in WACV 2020

    ACM Class: I.4.6; I.4.8; I.2.9; I.2.10

  26. arXiv:1805.03511

    cs.CV

    Deep 2.5D Vehicle Classification with Sparse SfM Depth Prior for Automated Toll Systems

    Authors: Georg Waltner, Michael Maurer, Thomas Holzmann, Patrick Ruprecht, Michael Opitz, Horst Possegger, Friedrich Fraundorfer, Horst Bischof

    Abstract: Automated toll systems rely on proper classification of the passing vehicles. This is especially difficult when the images used for classification only cover parts of the vehicle. To obtain information about the whole vehicle, we reconstruct the vehicle as 3D object and exploit this additional information within a Convolutional Neural Network (CNN). However, when using deep networks for 3D object…

    Submitted 11 May, 2018; v1 submitted 9 May, 2018; originally announced May 2018.

    Comments: Submitted to the IEEE International Conference on Intelligent Transportation Systems 2018 (ITSC), 6 pages, 4 figures; changed format in compliance with adapted IEEE template

  27. arXiv:1801.04815

    cs.CV

    Deep Metric Learning with BIER: Boosting Independent Embeddings Robustly

    Authors: Michael Opitz, Georg Waltner, Horst Possegger, Horst Bischof

    Abstract: Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate training this ensemble as an online gradient…

    Submitted 15 January, 2018; originally announced January 2018.

    Comments: Extension to our paper BIER: Boosting Independent Embeddings Robustly (ICCV 2017 oral) - submitted to PAMI

  28. arXiv:1609.00129

    cs.CV

    Grid Loss: Detecting Occluded Faces

    Authors: Michael Opitz, Georg Waltner, Georg Poier, Horst Possegger, Horst Bischof

    Abstract: Detection of partially occluded objects is a challenging computer vision problem. Standard Convolutional Neural Network (CNN) detectors fail if parts of the detection window are occluded, since not every sub-part of the window is discriminative on its own. To address this issue, we propose a novel loss layer for CNNs, named grid loss, which minimizes the error rate on sub-blocks of a convolution l…

    Submitted 1 September, 2016; originally announced September 2016.

    Comments: Accepted to ECCV 2016