
Showing 1–50 of 55 results for author: Leutenegger, S

Searching in archive cs.
  1. arXiv:2403.17622

    cs.RO

    Online Tree Reconstruction and Forest Inventory on a Mobile Robotic System

    Authors: Leonard Freißmuth, Matias Mattamala, Nived Chebrolu, Simon Schaefer, Stefan Leutenegger, Maurice Fallon

    Abstract: Terrestrial laser scanning (TLS) is the standard technique used to create accurate point clouds for digital forest inventories. However, the measurement process is demanding, requiring up to two days per hectare for data collection, significant data storage, as well as resource-heavy post-processing of 3D data. In this work, we present a real-time mapping and analysis system that enables online ge…

    Submitted 26 March, 2024; originally announced March 2024.

  2. arXiv:2403.11370

    cs.CV cs.RO

    DynamicGlue: Epipolar and Time-Informed Data Association in Dynamic Environments using Graph Neural Networks

    Authors: Theresa Huber, Simon Schaefer, Stefan Leutenegger

    Abstract: The assumption of a static environment is common in many geometric computer vision tasks like SLAM but limits their applicability in highly dynamic scenes. Since these tasks rely on identifying point correspondences between input images within the static part of the environment, we propose a graph neural network-based sparse feature matching network designed to perform robust matching under challe…

    Submitted 1 July, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

  3. arXiv:2403.09596

    cs.RO

    Scalable Autonomous Drone Flight in the Forest with Visual-Inertial SLAM and Dense Submaps Built without LiDAR

    Authors: Sebastián Barbas Laina, Simon Boche, Sotiris Papatheodorou, Dimos Tzoumanikas, Simon Schaefer, Hanzhi Chen, Stefan Leutenegger

    Abstract: Forestry constitutes a key element for a sustainable future, while it is supremely challenging to introduce digital processes to improve efficiency. The main limitation is the difficulty of obtaining accurate maps at high temporal and spatial resolution as a basis for informed forestry decision-making, due to the vast area forests extend over and the sheer number of trees. To address this challeng…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 8 pages, 7 figures

  4. arXiv:2403.04331

    cs.RO

    Control-Barrier-Aided Teleoperation with Visual-Inertial SLAM for Safe MAV Navigation in Complex Environments

    Authors: Siqi Zhou, Sotiris Papatheodorou, Stefan Leutenegger, Angela P. Schoellig

    Abstract: In this paper, we consider a Micro Aerial Vehicle (MAV) system teleoperated by a non-expert and introduce a perceptive safety filter that leverages Control Barrier Functions (CBFs) in conjunction with Visual-Inertial Simultaneous Localization and Mapping (VI-SLAM) and dense 3D occupancy mapping to guarantee safe navigation in complex and unstructured environments. Our system relies solely on onboa…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2024, 7 pages, 7 figures, supplementary video is available at https://youtu.be/rCxbWY4PIfQ?si=DC-9mg7g1WooNdaV

  5. arXiv:2403.02280

    cs.RO

    Tightly-Coupled LiDAR-Visual-Inertial SLAM and Large-Scale Volumetric Occupancy Mapping

    Authors: Simon Boche, Sebastián Barbas Laina, Stefan Leutenegger

    Abstract: Autonomous navigation is one of the key requirements for every potential application of mobile robots in the real world. Besides high-accuracy state estimation, a suitable and globally consistent representation of the 3D environment is indispensable. We present a fully tightly-coupled LiDAR-Visual-Inertial SLAM system and 3D mapping framework applying local submapping strategies to achieve scalabi…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: IEEE International Conference on Robotics and Automation (ICRA) 2024

  6. arXiv:2402.05644

    cs.RO cs.CV

    FuncGrasp: Learning Object-Centric Neural Grasp Functions from Single Annotated Example Object

    Authors: Hanzhi Chen, Binbin Xu, Stefan Leutenegger

    Abstract: We present FuncGrasp, a framework that can infer dense yet reliable grasp configurations for unseen objects using one annotated object and single-view RGB-D observation via categorical priors. Unlike previous works that only transfer a set of grasp poses, FuncGrasp aims to transfer infinite configurations parameterized by an object-centric continuous grasp function across varying instances. To eas…

    Submitted 22 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted to ICRA 2024

  7. arXiv:2312.13471

    cs.CV

    NeRF-VO: Real-Time Sparse Visual Odometry with Neural Radiance Fields

    Authors: Jens Naumann, Binbin Xu, Stefan Leutenegger, Xingxing Zuo

    Abstract: We introduce a novel monocular visual odometry (VO) system, NeRF-VO, that integrates learning-based sparse visual odometry for low-latency camera tracking and a neural radiance scene representation for fine-detailed dense reconstruction and novel view synthesis. Our system initializes camera poses using sparse visual odometry and obtains view-dependent dense geometry priors from a monocular predic…

    Submitted 16 July, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Project page: https://xingxingzuo.github.io/nerfvo/

    Journal ref: IEEE Robotics and Automation Letters (RA-L), 2024

  8. arXiv:2312.05247

    cs.CV

    Dynamic LiDAR Re-simulation using Compositional Neural Fields

    Authors: Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger, Or Litany, Konrad Schindler, Shengyu Huang

    Abstract: We introduce DyNFL, a novel neural field-based approach for high-fidelity re-simulation of LiDAR scans in dynamic driving scenes. DyNFL processes LiDAR measurements from dynamic environments, accompanied by bounding boxes of moving objects, to construct an editable neural field. This field, comprising separately reconstructed static background and dynamic objects, allows users to modify viewpoints…

    Submitted 3 April, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: Project page: https://shengyuh.github.io/dynfl

  9. arXiv:2311.18610

    cs.CV

    DiffCAD: Weakly-Supervised Probabilistic CAD Model Retrieval and Alignment from an RGB Image

    Authors: Daoyi Gao, Dávid Rozenberszki, Stefan Leutenegger, Angela Dai

    Abstract: Perceiving 3D structures from RGB images based on CAD model primitives can enable an effective, efficient 3D object-based representation of scenes. However, current approaches rely on supervision from expensive annotations of CAD models associated with real images, and encounter challenges due to the inherent ambiguities in the task -- both in depth-scale ambiguity in monocular perception, as well…

    Submitted 6 June, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: SIGGRAPH 2024, Project page: https://daoyig.github.io/DiffCAD/

  10. arXiv:2311.02510

    cs.RO cs.CV

    Anthropomorphic Grasping with Neural Object Shape Completion

    Authors: Diego Hidalgo-Carvajal, Hanzhi Chen, Gemma C. Bettelani, Jaesug Jung, Melissa Zavaglia, Laura Busse, Abdeldjallil Naceri, Stefan Leutenegger, Sami Haddadin

    Abstract: The progressive prevalence of robots in human-suited environments has given rise to a myriad of object manipulation techniques, in which dexterity plays a paramount role. It is well-established that humans exhibit extraordinary dexterity when handling objects. Such dexterity seems to derive from a robust understanding of object properties (such as weight, size, and shape), as well as a remarkable…

    Submitted 9 November, 2023; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: Accepted to RA-L 2023

  11. arXiv:2310.09982

    cs.CV

    AP$n$P: A Less-constrained P$n$P Solver for Pose Estimation with Unknown Anisotropic Scaling or Focal Lengths

    Authors: Jiaxin Wei, Stefan Leutenegger, Laurent Kneip

    Abstract: Perspective-$n$-Point (P$n$P) stands as a fundamental algorithm for pose estimation in various applications. In this paper, we present a new approach to the P$n$P problem with relaxed constraints, eliminating the need for precise 3D coordinates or complete calibration data. We refer to it as AP$n$P due to its ability to handle unknown anisotropic scaling factors of 3D coordinates or alternatively…

    Submitted 9 November, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

  12. arXiv:2309.14514

    cs.CV

    Accurate and Interactive Visual-Inertial Sensor Calibration with Next-Best-View and Next-Best-Trajectory Suggestion

    Authors: Christopher L. Choi, Binbin Xu, Stefan Leutenegger

    Abstract: Visual-Inertial (VI) sensors are popular in robotics, self-driving vehicles, and augmented and virtual reality applications. In order to use them for any computer vision or state-estimation task, a good calibration is essential. However, collecting informative calibration data in order to render the calibration parameters observable is not trivial for a non-expert. In this work, we introduce a nov…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: 8 pages, 11 figures, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023)

  13. arXiv:2309.10369

    cs.CV cs.RO

    GloPro: Globally-Consistent Uncertainty-Aware 3D Human Pose Estimation & Tracking in the Wild

    Authors: Simon Schaefer, Dorian F. Henning, Stefan Leutenegger

    Abstract: An accurate and uncertainty-aware 3D human body pose estimation is key to enabling truly safe but efficient human-robot interactions. Current uncertainty-aware methods in 3D human pose estimation are limited to predicting the uncertainty of the body posture, while effectively neglecting the body shape and root pose. In this work, we present GloPro, which is, to the best of our knowledge, the first fram…

    Submitted 20 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: IEEE International Conference on Intelligent Robots and Systems (IROS) 2023

  14. arXiv:2309.01236

    cs.CV cs.RO

    BodySLAM++: Fast and Tightly-Coupled Visual-Inertial Camera and Human Motion Tracking

    Authors: Dorian F. Henning, Christopher Choi, Simon Schaefer, Stefan Leutenegger

    Abstract: Robust, fast, and accurate human state - 6D pose and posture - estimation remains a challenging problem. For real-world applications, the ability to estimate the human state in real-time is highly desirable. In this paper, we present BodySLAM++, a fast, efficient, and accurate human and camera state estimation framework relying on visual-inertial data. BodySLAM++ extends an existing visual-inertia…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: IROS 2023. Video: https://youtu.be/UcutiHQwbGk

  15. arXiv:2306.11483

    cs.LG

    Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning

    Authors: Anna Penzkofer, Simon Schaefer, Florian Strohm, Mihai Bâce, Stefan Leutenegger, Andreas Bulling

    Abstract: While deep reinforcement learning (RL) agents outperform humans on an increasing number of tasks, training them requires data equivalent to decades of human gameplay. Recent hierarchical RL methods have increased sample efficiency by incorporating information inherent to the structure of the decision problem but at the cost of having to discover or use human-annotated sub-goals that guide the lear…

    Submitted 20 June, 2023; originally announced June 2023.

  16. arXiv:2306.08648

    cs.CV cs.RO

    SimpleMapping: Real-Time Visual-Inertial Dense Mapping with Deep Multi-View Stereo

    Authors: Yingye Xin, Xingxing Zuo, Dongyue Lu, Stefan Leutenegger

    Abstract: We present a real-time visual-inertial dense mapping method capable of performing incremental 3D mesh reconstruction with high quality using only sequential monocular images and inertial measurement unit (IMU) readings. 6-DoF camera poses are estimated by a robust feature-based visual-inertial odometry (VIO), which also generates noisy sparse 3D map points as a by-product. We propose a sparse poin…

    Submitted 27 August, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

  17. Incremental Dense Reconstruction from Monocular Video with Guided Sparse Feature Volume Fusion

    Authors: Xingxing Zuo, Nan Yang, Nathaniel Merrill, Binbin Xu, Stefan Leutenegger

    Abstract: Incrementally recovering 3D dense structures from monocular videos is of paramount importance since it enables various robotics and AR applications. Feature volumes have recently been shown to enable efficient and accurate incremental dense reconstruction without the need to first estimate depth, but they are not able to achieve as high a resolution as depth-based methods due to the large memor…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 8 pages, 5 figures, RA-L 2023

  18. Finding Things in the Unknown: Semantic Object-Centric Exploration with an MAV

    Authors: Sotiris Papatheodorou, Nils Funk, Dimos Tzoumanikas, Christopher Choi, Binbin Xu, Stefan Leutenegger

    Abstract: Exploration of unknown space with an autonomous mobile robot is a well-studied problem. In this work we broaden the scope of exploration, moving beyond the pure geometric goal of uncovering as much free space as possible. We believe that for many practical applications, exploration should be contextualised with semantic and object-level understanding of the environment for task-specific exploratio…

    Submitted 3 March, 2023; v1 submitted 28 February, 2023; originally announced February 2023.

    Comments: 7 pages, 9 figures, accepted in ICRA 2023

  19. arXiv:2210.06270

    cs.CV

    Event-based Non-Rigid Reconstruction from Contours

    Authors: Yuxuan Xue, Haolong Li, Stefan Leutenegger, Jörg Stückler

    Abstract: Visual reconstruction of fast non-rigid object deformations over time is a challenge for conventional frame-based cameras. In this paper, we propose a novel approach for reconstructing such deformations using measurements from event-based cameras. Under the assumption of a static background, where all events are generated by the motion, our approach estimates the deformation of objects from events…

    Submitted 13 November, 2022; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted for BMVC2022

  20. arXiv:2208.05067

    cs.CV cs.RO

    Learning to Complete Object Shapes for Object-level Mapping in Dynamic Scenes

    Authors: Binbin Xu, Andrew J. Davison, Stefan Leutenegger

    Abstract: In this paper, we propose a novel object-level mapping system that can simultaneously segment, track, and reconstruct objects in dynamic scenes. It can further predict and complete their full geometries by conditioning on reconstructions from depth inputs and a category-level shape prior with the aim that completed object geometry leads to better object reconstruction and tracking accuracy. For ea…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: International Conference on Intelligent Robots and Systems (IROS) 2022

  21. arXiv:2208.04274

    cs.RO cs.CV

    Visual-Inertial Multi-Instance Dynamic SLAM with Object-level Relocalisation

    Authors: Yifei Ren, Binbin Xu, Christopher L. Choi, Stefan Leutenegger

    Abstract: In this paper, we present a tightly-coupled visual-inertial object-level multi-instance dynamic SLAM system. Even in extremely dynamic scenes, it can robustly optimise for the camera pose, velocity, IMU biases and build a dense 3D reconstruction object-level map of the environment. Our system can robustly track and reconstruct the geometries of arbitrary objects, their semantics and motion by incr…

    Submitted 8 August, 2022; originally announced August 2022.

    Comments: International Conference on Intelligent Robots and Systems (IROS) 2022

  22. arXiv:2208.00709

    cs.RO

    Visual-Inertial SLAM with Tightly-Coupled Dropout-Tolerant GPS Fusion

    Authors: Simon Boche, Xingxing Zuo, Simon Schaefer, Stefan Leutenegger

    Abstract: Robotic applications are continuously striving towards higher levels of autonomy. To achieve that goal, a highly robust and accurate state estimation is indispensable. Combining visual and inertial sensor modalities has proven to yield accurate and locally consistent results in short-term applications. Unfortunately, visual-inertial state estimators suffer from the accumulation of drift for long-t…

    Submitted 1 August, 2022; originally announced August 2022.

    Comments: International Conference on Intelligent Robots and Systems (IROS) 2022

  23. arXiv:2207.13464

    cs.CV cs.RO

    Towards the Probabilistic Fusion of Learned Priors into Standard Pipelines for 3D Reconstruction

    Authors: Tristan Laidlow, Jan Czarnowski, Andrea Nicastro, Ronald Clark, Stefan Leutenegger

    Abstract: The best way to combine the results of deep learning with standard 3D reconstruction pipelines remains an open problem. While systems that pass the output of traditional multi-view stereo approaches to a network for regularisation or refinement currently seem to get the best results, it may be preferable to treat deep neural networks as separate components whose results can be probabilistically fu…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: Accepted at ICRA 2020

  24. arXiv:2207.12244

    cs.CV cs.RO

    DeepFusion: Real-Time Dense 3D Reconstruction for Monocular SLAM using Single-View Depth and Gradient Predictions

    Authors: Tristan Laidlow, Jan Czarnowski, Stefan Leutenegger

    Abstract: While the keypoint-based maps created by sparse monocular simultaneous localisation and mapping (SLAM) systems are useful for camera tracking, dense 3D reconstructions may be desired for many robotic tasks. Solutions involving depth cameras are limited in range and to indoor spaces, and dense reconstruction systems based on minimising the photometric error between frames are typically poorly const…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted at ICRA 2019

  25. arXiv:2207.10940

    cs.RO cs.CV

    Dense RGB-D-Inertial SLAM with Map Deformations

    Authors: Tristan Laidlow, Michael Bloesch, Wenbin Li, Stefan Leutenegger

    Abstract: While dense visual SLAM methods are capable of estimating dense reconstructions of the environment, they suffer from a lack of robustness in their tracking step, especially when the optimisation is poorly initialised. Sparse visual SLAM systems have attained high levels of accuracy and robustness through the inclusion of inertial measurements in a tightly-coupled fusion. Inspired by this performan…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: Accepted at IROS 2017; supplementary video available at https://youtu.be/-gUdQ0cxDh0

  26. arXiv:2205.02301

    cs.CV cs.RO

    BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking

    Authors: Dorian F. Henning, Tristan Laidlow, Stefan Leutenegger

    Abstract: Estimating human motion from video is an active research area due to its many potential applications. Most state-of-the-art methods predict human shape and posture estimates for individual images and do not leverage the temporal information available in video. Many "in the wild" sequences of human motion are captured by a moving camera, which adds the complication of conflated camera and human mot…

    Submitted 24 July, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: ECCV 2022. Video: https://youtu.be/0-SL3VeWEvU

  27. arXiv:2205.01823

    cs.RO

    Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation

    Authors: Nathaniel Merrill, Yuliang Guo, Xingxing Zuo, Xinyu Huang, Stefan Leutenegger, Xi Peng, Liu Ren, Guoquan Huang

    Abstract: We propose a keypoint-based object-level SLAM framework that can provide globally consistent 6DoF pose estimates for symmetric and asymmetric objects alike. To the best of our knowledge, our system is among the first to utilize the camera pose information from SLAM to provide prior knowledge for tracking keypoints on symmetric objects -- ensuring that new measurements are consistent with the curre…

    Submitted 13 July, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted to CVPR2022

  28. arXiv:2103.16442

    cs.CV

    SIMstack: A Generative Shape and Instance Model for Unordered Object Stacks

    Authors: Zoe Landgraf, Raluca Scona, Tristan Laidlow, Stephen James, Stefan Leutenegger, Andrew J. Davison

    Abstract: By estimating 3D shape and instances from a single view, we can capture information about an environment quickly, without the need for comprehensive scanning and multi-view fusion. Solving this task for composite scenes (such as object stacks) is challenging: occluded areas are not only ambiguous in shape but also in instance segmentation; multiple decompositions could be valid. We observe that ph…

    Submitted 26 September, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Journal ref: ICCV 2021

  29. arXiv:2103.15875

    cs.CV

    In-Place Scene Labelling and Understanding with Implicit Scene Representation

    Authors: Shuaifeng Zhi, Tristan Laidlow, Stefan Leutenegger, Andrew J. Davison

    Abstract: Semantic labelling is highly correlated with geometry and radiance reconstruction, as scene entities with similar shape and appearance are more likely to come from similar classes. Recent implicit neural reconstruction techniques are appealing as they do not require prior training data, but the same fully self-supervised approach is not possible for semantics because labels are human-defined prope…

    Submitted 21 August, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: Camera ready version. To be published in Proceedings of IEEE International Conference on Computer Vision (ICCV 2021) as Oral Presentation. Project page with more videos: https://shuaifengzhi.com/Semantic-NeRF/

  30. Volumetric Occupancy Mapping With Probabilistic Depth Completion for Robotic Navigation

    Authors: Marija Popovic, Florian Thomas, Sotiris Papatheodorou, Nils Funk, Teresa Vidal-Calleja, Stefan Leutenegger

    Abstract: In robotic applications, a key requirement for safe and efficient motion planning is the ability to map obstacle-free space in unknown, cluttered 3D environments. However, commodity-grade RGB-D cameras commonly used for sensing fail to register valid depth values on shiny, glossy, bright, or distant surfaces, leading to missing data in the map. To address this issue, we propose a framework leverag…

    Submitted 22 March, 2021; v1 submitted 5 December, 2020; originally announced December 2020.

    Comments: 8 pages, 10 figures, submission to IEEE Robotics and Automation Letters (revised)

  31. Elastic and Efficient LiDAR Reconstruction for Large-Scale Exploration Tasks

    Authors: Yiduo Wang, Nils Funk, Milad Ramezani, Sotiris Papatheodorou, Marija Popovic, Marco Camurri, Stefan Leutenegger, Maurice Fallon

    Abstract: We present an efficient, elastic 3D LiDAR reconstruction framework which can reconstruct up to maximum LiDAR ranges (60 m) at multiple frames per second, thus enabling robot exploration in large-scale environments. Our approach only requires a CPU. We focus on three main challenges of large-scale reconstruction: integration of long-range LiDAR scans at high frequency, the capacity to deform the re…

    Submitted 9 April, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 7 pages, 7 figures

  32. Multi-Resolution 3D Mapping with Explicit Free Space Representation for Fast and Accurate Mobile Robot Motion Planning

    Authors: Nils Funk, Juan Tarrio, Sotiris Papatheodorou, Marija Popovic, Pablo F. Alcantarilla, Stefan Leutenegger

    Abstract: With the aim of bridging the gap between high quality reconstruction and mobile robot motion planning, we propose an efficient system that leverages the concept of adaptive-resolution volumetric mapping, which naturally integrates with the hierarchical decomposition of space in an octree data structure. Instead of a Truncated Signed Distance Function (TSDF), we adopt mapping of occupancy probabili…

    Submitted 30 January, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

    Comments: 8 pages, 9 figures, 5 tables

  33. arXiv:2008.13504

    cs.CV cs.RO

    Deep Probabilistic Feature-metric Tracking

    Authors: Binbin Xu, Andrew J. Davison, Stefan Leutenegger

    Abstract: Dense image alignment from RGB-D images remains a critical issue for real-world applications, especially under challenging lighting conditions and in a wide baseline setting. In this paper, we propose a new framework to learn a pixel-wise deep feature map and a deep feature-metric uncertainty map predicted by a Convolutional Neural Network (CNN), which together formulate a deep probabilistic featu…

    Submitted 25 November, 2020; v1 submitted 31 August, 2020; originally announced August 2020.

    Comments: RAL 2020. 8 pages, 9 figures, video link: https://youtu.be/6pMosl6ZAPE

  34. arXiv:2006.02116

    cs.RO

    Aerial Manipulation Using Hybrid Force and Position NMPC Applied to Aerial Writing

    Authors: Dimos Tzoumanikas, Felix Graule, Qingyue Yan, Dhruv Shah, Marija Popovic, Stefan Leutenegger

    Abstract: Aerial manipulation aims at combining the manoeuvrability of aerial vehicles with the manipulation capabilities of robotic arms. This, however, comes at the cost of the additional control complexity due to the coupling of the dynamics of the two systems. In this paper we present an NMPC specifically designed for MAVs equipped with a robotic arm. We formulate a hybrid control model for the combined…

    Submitted 3 June, 2020; originally announced June 2020.

    Comments: Accepted for publication in Robotics: Science and Systems (RSS) 2020. Video: https://youtu.be/iE--MO0YF0o

  35. arXiv:2003.03134

    cs.CV cs.DC

    Bundle Adjustment on a Graph Processor

    Authors: Joseph Ortiz, Mark Pupilli, Stefan Leutenegger, Andrew J. Davison

    Abstract: Graph processors such as Graphcore's Intelligence Processing Unit (IPU) are part of the major new wave of novel computer architecture for AI, and have a general design with massively parallel computation, distributed on-chip memory and very high inter-core communication bandwidth which allows breakthrough performance for message passing algorithms on arbitrary graphs. We show for the first time th…

    Submitted 30 March, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: Published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2020). Video: https://www.youtube.com/watch?v=TqeN8aQNgd0

  36. arXiv:2002.10342

    cs.CV

    Comparing View-Based and Map-Based Semantic Labelling in Real-Time SLAM

    Authors: Zoe Landgraf, Fabian Falck, Michael Bloesch, Stefan Leutenegger, Andrew Davison

    Abstract: Generally capable Spatial AI systems must build persistent scene representations where geometric models are combined with meaningful semantic labels. The many approaches to labelling scenes can be divided into two clear groups: view-based which estimate labels from the input view-wise data and then incrementally fuse them into the scene model as it is built; and map-based which label the generated…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: ICRA 2020

  37. arXiv:2002.07705

    cs.CV cs.RO

    Towards Bounding-Box Free Panoptic Segmentation

    Authors: Ujwal Bonde, Pablo F. Alcantarilla, Stefan Leutenegger

    Abstract: In this work we introduce a new Bounding-Box Free Network (BBFNet) for panoptic segmentation. Panoptic segmentation is an ideal problem for proposal-free methods as it already requires per-pixel semantic class labels. We use this observation to exploit class boundaries from off-the-shelf semantic segmentation networks and refine them to predict instance labels. Towards this goal BBFNet predicts co…

    Submitted 27 July, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: 15 pages, 6 figures

  38. arXiv:2002.06598

    cs.RO

    Nonlinear MPC with Motor Failure Identification and Recovery for Safe and Aggressive Multicopter Flight

    Authors: Dimos Tzoumanikas, Qingyue Yan, Stefan Leutenegger

    Abstract: Safe and precise reference tracking is a crucial characteristic of MAVs that have to operate under the influence of external disturbances in cluttered environments. In this paper, we present an NMPC that exploits the fully physics-based non-linear dynamics of the system. We furthermore show how the moment and thrust control inputs can be transformed into feasible actuator commands. In order to guar…

    Submitted 16 February, 2020; originally announced February 2020.

    Comments: Accepted in the International Conference on Robotics and Automation (ICRA) 2020. 7 (6 + 1) pages. Video link: https://youtu.be/cAQeSZ3tIqY

  39. Fast Frontier-based Information-driven Autonomous Exploration with an MAV

    Authors: Anna Dai, Sotiris Papatheodorou, Nils Funk, Dimos Tzoumanikas, Stefan Leutenegger

    Abstract: Exploration and collision-free navigation through an unknown environment is a fundamental task for autonomous robots. In this paper, a novel exploration strategy for Micro Aerial Vehicles (MAVs) is presented. The goal of the exploration strategy is the reduction of map entropy regarding occupancy probabilities, which is reflected in a utility function to be maximised. We achieve fast and efficient…

    Submitted 13 February, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

    Comments: Accepted in the International Conference on Robotics and Automation (ICRA) 2020, 7 pages, 8 figures, for the accompanying video see https://youtu.be/tH2VkVony38

  40. arXiv:1904.08405

    cs.CV cs.AI cs.LG cs.RO

    Event-based Vision: A Survey

    Authors: Guillermo Gallego, Tobi Delbruck, Garrick Orchard, Chiara Bartolozzi, Brian Taba, Andrea Censi, Stefan Leutenegger, Andrew Davison, Joerg Conradt, Kostas Daniilidis, Davide Scaramuzza

    Abstract: Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of…

    Submitted 8 August, 2020; v1 submitted 17 April, 2019; originally announced April 2019.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020

  41. arXiv:1903.06482  [pdf, other]

    cs.CV cs.LG

    SceneCode: Monocular Dense Semantic Reconstruction using Learned Encoded Scene Representations

    Authors: Shuaifeng Zhi, Michael Bloesch, Stefan Leutenegger, Andrew J. Davison

    Abstract: Systems which incrementally create 3D semantic maps from image sequences must store and update representations of both geometry and semantic entities. However, while there has been much work on the correct formulation for geometrical estimation, state-of-the-art systems usually rely on simple semantic representations which store and update independent label estimates for each surface element (dept…

    Submitted 18 March, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

    Comments: To be published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019)

  42. arXiv:1903.00987  [pdf, other]

    cs.CV

    X-Section: Cross-Section Prediction for Enhanced RGBD Fusion

    Authors: Andrea Nicastro, Ronald Clark, Stefan Leutenegger

    Abstract: Detailed 3D reconstruction is an important challenge with application to robotics, augmented and virtual reality, which has seen impressive progress throughout the past years. Advancements were driven by the availability of depth cameras (RGB-D), as well as increased compute power, e.g. in the form of GPUs -- but also thanks to inclusion of machine learning in the process. Here, we propose X-Sect…

    Submitted 12 August, 2019; v1 submitted 3 March, 2019; originally announced March 2019.

  43. arXiv:1812.07976  [pdf, other]

    cs.RO cs.CV

    MID-Fusion: Octree-based Object-Level Multi-Instance Dynamic SLAM

    Authors: Binbin Xu, Wenbin Li, Dimos Tzoumanikas, Michael Bloesch, Andrew Davison, Stefan Leutenegger

    Abstract: We propose a new multi-instance dynamic RGB-D SLAM system using an object-level octree-based volumetric representation. It can provide robust camera tracking in dynamic environments and at the same time, continuously estimate geometric, semantic, and motion properties for arbitrary objects in the scene. For each incoming frame, we perform instance segmentation to detect objects and refine mask bou…

    Submitted 21 March, 2019; v1 submitted 19 December, 2018; originally announced December 2018.

    Comments: Accepted to International Conference on Robotics and Automation (ICRA) 2019. 7 (6 + 1) pages. Please also see video Link: https://youtu.be/gturboNl9gg

  44. arXiv:1809.02966  [pdf, other]

    cs.CV

    LS-Net: Learning to Solve Nonlinear Least Squares for Monocular Stereo

    Authors: Ronald Clark, Michael Bloesch, Jan Czarnowski, Stefan Leutenegger, Andrew J. Davison

    Abstract: Sum-of-squares objective functions are very popular in computer vision algorithms. However, these objective functions are not always easy to optimize. The underlying assumptions made by solvers are often not satisfied and many problems are inherently ill-posed. In this paper, we propose LS-Net, a neural nonlinear least squares optimization algorithm which learns to effectively optimize these cost…

    Submitted 9 September, 2018; originally announced September 2018.

    Comments: ECCV 2018. Video: https://youtu.be/5bZbMm8UqbA

  45. arXiv:1809.00716  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset

    Authors: Wenbin Li, Sajad Saeedi, John McCormac, Ronald Clark, Dimos Tzoumanikas, Qing Ye, Yuzhong Huang, Rui Tang, Stefan Leutenegger

    Abstract: Datasets have gained an enormous amount of popularity in the computer vision community, from training and evaluation of Deep Learning-based methods to benchmarking Simultaneous Localization and Mapping (SLAM). Without a doubt, synthetic imagery bears a vast potential due to scalability in terms of amounts of data obtainable without tedious manual ground truth annotations or measurements. Here, we…

    Submitted 3 September, 2018; originally announced September 2018.

    Comments: British Machine Vision Conference (BMVC) 2018

  46. arXiv:1808.08378  [pdf, other]

    cs.CV

    Fusion++: Volumetric Object-Level SLAM

    Authors: John McCormac, Ronald Clark, Michael Bloesch, Andrew J. Davison, Stefan Leutenegger

    Abstract: We propose an online object-level SLAM system which builds a persistent and accurate 3D graph map of arbitrary reconstructed objects. As an RGB-D camera browses a cluttered indoor scene, Mask-RCNN instance segmentations are used to initialise compact per-object Truncated Signed Distance Function (TSDF) reconstructions with object size-dependent resolutions and a novel 3D foreground mask. Reconstru…

    Submitted 28 August, 2018; v1 submitted 25 August, 2018; originally announced August 2018.

  47. arXiv:1807.10561  [pdf, other]

    cs.CV cs.HC cs.RO

    Towards an Embodied Semantic Fovea: Semantic 3D scene reconstruction from ego-centric eye-tracker videos

    Authors: Mickey Li, Noyan Songur, Pavel Orlov, Stefan Leutenegger, A Aldo Faisal

    Abstract: Incorporating the physical environment is essential for a complete understanding of human behavior in unconstrained every-day tasks. This is especially important in ego-centric tasks where obtaining 3 dimensional information is both limiting and challenging with the current 2D video analysis methods proving insufficient. Here we demonstrate a proof-of-concept system which provides real-time 3D map…

    Submitted 27 July, 2018; originally announced July 2018.

  48. arXiv:1804.00874  [pdf, other]

    cs.CV cs.LG

    CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM

    Authors: Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison

    Abstract: The representation of geometry in real-time 3D perception systems continues to be a critical research issue. Dense maps capture complete surface shape and can be augmented with semantic labels, but their high dimensionality makes them computationally costly to store and process, and unsuitable for rigorous probabilistic inference. Sparse feature-based representations avoid these problems, but capt…

    Submitted 14 April, 2019; v1 submitted 3 April, 2018; originally announced April 2018.

    Comments: Published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018)

  49. arXiv:1708.08844  [pdf, other]

    cs.CV

    Semantic Texture for Robust Dense Tracking

    Authors: Jan Czarnowski, Stefan Leutenegger, Andrew Davison

    Abstract: We argue that robust dense SLAM systems can make valuable use of the layers of features coming from a standard CNN as a pyramid of 'semantic texture' which is suitable for dense alignment while being much more robust to nuisance factors such as lighting than raw RGB values. We use a straightforward Lucas-Kanade formulation of image alignment, with a schedule of iterations over the coarse-to-fine l…

    Submitted 29 August, 2017; originally announced August 2017.

  50. arXiv:1612.05079  [pdf, other]

    cs.CV

    SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth

    Authors: John McCormac, Ankur Handa, Stefan Leutenegger, Andrew J. Davison

    Abstract: We introduce SceneNet RGB-D, expanding the previous work of SceneNet to enable large scale photorealistic rendering of indoor scene trajectories. It provides pixel-perfect ground truth for scene understanding problems such as semantic segmentation, instance segmentation, and object detection, and also for geometric computer vision problems such as optical flow, depth estimation, camera pose estima…

    Submitted 30 January, 2017; v1 submitted 15 December, 2016; originally announced December 2016.