Darius Burschka received his PhD degree in Electrical and Computer Engineering in 1998 from the Technische Universität München in the field of vision-based navigation and map generation with binocular stereo systems. In 1999, he was a Postdoctoral Associate at Yale University, Connecticut, where he worked on laser-based map generation and landmark selection from video images for vision-based navigation systems. From 1999 to 2003, he was an Associate Research Scientist at the Johns Hopkins University, Baltimore, Maryland, and from 2003 to 2005 an Assistant Research Professor in Computer Science there. Currently, he is an Associate Professor in Computer Science at the Technische Universität München, Germany, where he heads the Machine Vision and Perception group. He was an area coordinator in the DFG Cluster of Excellence "Cognition in Technical Systems" and is currently a co-chair of the IEEE RAS Technical Committee on Computer and Robot Vision.
His areas of research are sensor systems for mobile and medical robots and human-computer interfaces. The focus of his research is on vision-based navigation and three-dimensional reconstruction from sensor data. He is a Senior Member of the IEEE. Phone: +49 89 289 17638
Autonomous vehicles need to perceive both the presence and the motion of objects in the surrounding environment to navigate in the real world. In this work, we propose to solve the tasks of identifying objects and estimating their motion by viewing them as a single unified task known as instance flow. Instance flow provides the pixel-wise instance mask of an object and the dense optical flow within it. To achieve this, we extended a state-of-the-art object detection model with a dense optical flow estimator that estimates the optical flow for each region of interest only, instead of the entire image. We tested the approach through experiments on publicly available datasets for autonomous driving research: VKITTI, KITTI, and HD1K. Furthermore, we introduced a new instance flow quality metric to evaluate the instance flow estimation.
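The combination described above can be sketched as follows. This is a minimal illustration of the instance-flow output format, not the paper's model: it assumes a binary instance mask and a dense flow field are already available and simply restricts the flow to the instance. The function name and shapes are hypothetical.

```python
import numpy as np

def instance_flow(mask: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Restrict a dense optical-flow field to one object instance.

    mask: (H, W) boolean instance mask.
    flow: (H, W, 2) dense flow field with (u, v) components per pixel.
    Returns flow zeroed outside the instance mask.
    """
    return flow * mask[..., None]  # broadcast mask over the (u, v) channels

mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True              # a 2x2 object instance
flow = np.ones((4, 4, 2))          # uniform flow of (1, 1) px/frame
masked = instance_flow(mask, flow)
print(masked[1, 1], masked[0, 0])  # flow inside vs. outside the mask
```

The pairing of a per-instance mask with the flow restricted to it is exactly the data structure the abstract calls "instance flow".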
Self-driving vehicles require detailed information about the surrounding environment to drive autonomously in complex urban scenarios, especially when crossing traffic junctions. Currently, most self-driving and driver-assistance systems depend strongly on GPS and a backend high-definition map for information about traffic junctions. In this work, we investigate the possibility of identifying traffic junctions using onboard cameras by formulating it as a segmentation task. To tackle this, we first analyzed how a junction should be defined in image space, and then used this definition to extend the Cityscapes dataset with a new Junction class label. We trained segmentation models on the extended dataset to segment traffic junctions within an image. The models achieved an overall mean Intersection-over-Union (mIoU) of 73.8% for multi-class semantic segmentation and an Intersection-over-Union (IoU) of 58.7% for the Junction class. This has the potential to improve self-driving vehicles that depend strongly on a high-definition map by providing an alternative source of information for navigation. Finally, we introduced an algorithm operating in sensor space that determines, based on the segmentation results, how strongly the vehicle should decelerate in order to stop before the traffic junction.
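Two quantities mentioned above can be made concrete. The IoU metric used to evaluate the Junction class is standard; the deceleration below uses the textbook kinematic relation a = v²/(2d) for stopping within a given distance, which is an assumption on my part, not the paper's sensor-space algorithm.

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-Union between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / union if union else 0.0

def required_deceleration(speed_mps: float, distance_m: float) -> float:
    """Constant deceleration needed to stop within distance_m.

    Standard kinematics (v^2 = 2*a*d); illustrative only, not the
    paper's algorithm.
    """
    return speed_mps ** 2 / (2.0 * distance_m)

pred = np.array([[1, 1], [0, 0]], dtype=bool)
gt   = np.array([[1, 0], [1, 0]], dtype=bool)
print(round(iou(pred, gt), 3))            # 1 px overlap / 3 px union
print(required_deceleration(10.0, 25.0))  # 2.0 m/s^2 to stop in 25 m at 10 m/s
```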
2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
We present a novel representation of dynamic scenes perceived by moving agents. In such environments, conventional geometric representations require constant updates of the map information due to the changing relative position of dynamic objects to the static scene. We show that the changed representation of the environment increases the robustness and accuracy of scene perception and significantly simplifies the processing and complexity of the perception module. At the same time, the changed representation allows a better prioritization of attention to the moving objects around the robot by taking into account not the Euclidean distance but the time-to-interaction (TTI), which is the core contribution of this approach. We present the mathematical framework behind the proposed representation and show examples of how this framework simplifies and makes more robust the processing in the perception modules of a moving agent.
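The TTI-based prioritization idea can be sketched with a simplified assumption: TTI is the current range divided by the closing speed along the line of sight (infinite if the object is receding). This is an illustrative reading of the abstract, not the paper's mathematical framework.

```python
import numpy as np

def time_to_interaction(rel_pos: np.ndarray, rel_vel: np.ndarray) -> float:
    """Time until an object reaches the agent, under constant velocity.

    rel_pos: object position relative to the agent.
    rel_vel: object velocity relative to the agent.
    Simplified model: TTI = range / closing speed along the line of sight.
    """
    rng = np.linalg.norm(rel_pos)
    closing = -np.dot(rel_pos, rel_vel) / rng  # > 0 when approaching
    return rng / closing if closing > 0 else float("inf")

# A near but slow object vs. a far but fast one:
near_slow = time_to_interaction(np.array([2.0, 0.0]), np.array([-0.1, 0.0]))
far_fast  = time_to_interaction(np.array([10.0, 0.0]), np.array([-5.0, 0.0]))
print(near_slow, far_fast)  # the farther object is the more urgent one
```

This illustrates why Euclidean distance alone is a poor attention criterion: the object at 10 m reaches the agent five times sooner than the object at 2 m.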
2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
The need for comprehensive telemedicine solutions is becoming increasingly relevant due to challenges associated with the ageing population, the increasing shortage of health-care providers, and, more recently, the global pandemic. Existing solutions primarily focus on, e.g., electronic medical records, audiovisual connections, and, in some cases, robotic systems with very basic capabilities. Here we present a fundamentally new, holistic approach to a remote doctor visit that enables transparent remote examination, anomaly detection, diagnosis, and rehabilitation. Our dual doctor-patient twin paradigm involves two robotic systems: one representing the doctor to the patient ("GARMI") and one representing the patient to the doctor ("MUCKI"). Through bidirectional telepresence control, this system enables transparent, natural, remote haptic interaction between doctor and patient. The control, interaction, and knowledge transfer to the doctor are enhanced by AI-based visual motion and facial expression analysis as well as a digital twin of the patient. Thus, each stage of a doctor visit can be replicated in the context of telemedicine and shared autonomy: from first assessment to observation-based and remote physical examination, to a better-informed doctor diagnosis and robot-assisted telerehabilitation.
2017 IEEE International Conference on Advanced Intelligent Mechatronics (AIM), 2017
This paper proposes a novel velocity estimation approach for single image points while both the camera and the observed object may be in motion. To obtain absolute velocities, a flow-stereo setup is utilized, but no explicit depth reconstruction is carried out. Omitting the calculation in Cartesian space increases the robustness of the estimation, which is shown in a comparison between the error propagation of the proposed method and that of a disparity-based approach. An important factor is the maximum distance at which the velocity can be estimated. Simulations and tests on real images demonstrate that the estimation range can be increased by the proposed method. Possible applications are motion planning and obstacle avoidance for ground-based mobile robots in dynamic environments.
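The disparity-based baseline the paper compares against suffers from a well-known error-propagation problem, which can be worked out from the standard stereo relation Z = f·b/d. First-order propagation gives σ_Z ≈ (Z²/(f·b))·σ_d, i.e. depth uncertainty grows quadratically with range. This is the textbook result, not the paper's own derivation; the parameter values below are illustrative.

```python
def depth_sigma(f_px: float, baseline_m: float, disparity_px: float,
                sigma_d_px: float = 0.5) -> float:
    """First-order depth uncertainty of disparity-based stereo.

    Z = f*b/d  =>  |dZ/dd| = f*b/d^2 = Z^2/(f*b),
    so sigma_Z ~ (Z^2 / (f*b)) * sigma_d.
    """
    z = f_px * baseline_m / disparity_px
    return z * z / (f_px * baseline_m) * sigma_d_px

# Same camera (f = 700 px, b = 0.3 m), two ranges:
print(depth_sigma(700.0, 0.3, 21.0))  # object at Z = 10 m
print(depth_sigma(700.0, 0.3, 4.2))   # object at Z = 50 m
```

At 50 m the depth uncertainty is 25 times larger than at 10 m, which is why velocities derived from reconstructed depth degrade quickly with distance and why avoiding explicit depth reconstruction can extend the usable estimation range.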
2019 IEEE Intelligent Transportation Systems Conference (ITSC), 2019
We propose an approach to classify possible interaction types between autonomous vehicles and pedestrians based on the idea of resource competition in shared spaces. Autonomous vehicles are particularly challenged in urban traffic scenarios, where many uncertainties influence the current world model and the environment imposes very few constraints on the motion of pedestrians. This creates the demand for an approach that determines the intention of each pedestrian as far ahead as possible and reacts to changes early. A motion model based on goal-driven pedestrian movement yields a set of most likely planned trajectories. These are analyzed for overlapping occupation times in road segments, i.e., for interactions with the vehicle. The output is an early estimation of the most probable interaction types and places. From this estimation, the pedestrian's current trajectory is used to refine the prediction of the most probable interaction place and type. In the end, the algorithm combines topological and behavioral input to infer and validate the long-term intention of the interaction type before the interaction can actually be inferred from the current dynamics. As a proof of concept, the applicability of the approach is validated on real-world scenarios from the Cityscapes dataset.
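The "overlapping occupation times" check at the core of the approach can be sketched as an interval-overlap test per road segment. The data layout and names below are illustrative assumptions, not the paper's implementation: each agent is reduced to a map from segment id to a predicted (enter, exit) time interval.

```python
def intervals_overlap(a: tuple, b: tuple) -> bool:
    """True if half-open time intervals (start, end) overlap."""
    return a[0] < b[1] and b[0] < a[1]

def interacting_segments(vehicle: dict, pedestrian: dict) -> list:
    """Segments where both agents' predicted occupation times overlap.

    Each dict maps segment id -> (t_enter, t_exit) in seconds.
    """
    shared = vehicle.keys() & pedestrian.keys()  # segments both will use
    return sorted(seg for seg in shared
                  if intervals_overlap(vehicle[seg], pedestrian[seg]))

veh = {"crossing_A": (4.0, 6.0), "lane_2": (0.0, 4.0)}
ped = {"crossing_A": (5.0, 8.0), "sidewalk": (0.0, 5.0)}
print(interacting_segments(veh, ped))  # only crossing_A is contested
```

A flagged segment marks a potential resource conflict; in the paper's terms, it is a candidate interaction place whose type is then refined from the pedestrian's observed trajectory.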
Most current perception systems are designed to represent the static geometry of the environment and to monitor the execution of their tasks in 3D Cartesian representations. While this representation allows a human-readable definition of tasks in robotic systems and provides direct references to the static environment representation, it does not correspond to the native data format of many passive sensor systems. Additional calibration parameters are necessary to transform the sensor data into Cartesian space; they decrease the robustness of the perception system, making it sensitive to changes and errors. An example of an alternative coupling strategy for perception modules is the shift from look-then-move to visual servoing in grasping, where 3D task planning is replaced by a task definition directly in image space. Errors and goals are represented here directly in sensor space. In addition, the spatial ordering of the information based on Cartesian data ma...