Research article | Open access

Learning to fly: computational controller design for hybrid UAVs with reinforcement learning

Published: 12 July 2019

Abstract

    Hybrid unmanned aerial vehicles (UAVs) combine the advantages of multicopters and fixed-wing planes: vertical take-off and landing, and low energy use. However, hybrid UAVs are rarely used because controller design is challenging due to their complex, mixed dynamics. In this paper, we propose a method to automate this design process by training a mode-free, model-agnostic neural network controller for hybrid UAVs. We present a neural network controller design with a novel error convolution input, trained by reinforcement learning. Our controller exhibits two key features. First, it does not distinguish among flying modes, and the same controller structure can be used for copters with various dynamics. Second, our controller works on real models without any additional parameter tuning, closing the gap between virtual simulation and real fabrication. We demonstrate the efficacy of the proposed controller both in simulation and on our custom-built hybrid UAVs (Figures 1 and 8). Flight tests show that the controller is robust enough to exploit the complex dynamics that arise when both rotors and wings are active.
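    The abstract's "error convolution input" can be pictured as a policy whose observation is a short history of tracking errors, passed through a temporal convolution before a small network maps it to actuator commands. The sketch below is purely illustrative and not the authors' implementation: the dimensions, kernel, network sizes, and all names are assumptions, and the random weights stand in for parameters that, in the paper's setting, would be produced by reinforcement learning.

```python
import numpy as np

class ErrorConvPolicy:
    """Illustrative sketch (not the paper's code): a policy that convolves a
    window of recent state errors along the time axis, then maps the result
    through a small MLP to actuator commands. All sizes are assumptions."""

    def __init__(self, err_dim=6, history=8, kernel_size=4, hidden=32, n_act=4, seed=0):
        rng = np.random.default_rng(seed)
        # Temporal kernel applied per error channel; learned in a real setup.
        self.kernel = rng.normal(scale=0.1, size=(kernel_size,))
        feat_dim = err_dim * (history - kernel_size + 1)  # 'valid' conv length
        self.W1 = rng.normal(scale=0.1, size=(hidden, feat_dim))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(scale=0.1, size=(n_act, hidden))
        self.b2 = np.zeros(n_act)

    def act(self, err_history):
        # err_history: (history, err_dim) array of recent tracking errors.
        # 1-D 'valid' convolution along time, independently per error channel.
        conv = np.stack(
            [np.convolve(err_history[:, d], self.kernel, mode="valid")
             for d in range(err_history.shape[1])],
            axis=1,
        )
        h = np.tanh(self.W1 @ conv.ravel() + self.b1)
        return np.tanh(self.W2 @ h + self.b2)  # commands squashed to [-1, 1]

policy = ErrorConvPolicy()
errors = np.zeros((8, 6))   # perfectly on target: zero error history
print(policy.act(errors))   # zero error input yields zero commands here
```

    The convolved error window gives the policy access to error trends (a loosely derivative/integral-like signal) rather than only the instantaneous error, which is one plausible reading of why a single controller structure can cover multiple flight regimes.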



    Published In

    ACM Transactions on Graphics, Volume 38, Issue 4
    August 2019, 1480 pages
    ISSN: 0730-0301
    EISSN: 1557-7368
    DOI: 10.1145/3306346
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. hybrid UAVs
    2. neural network controllers


    Funding Sources

    • Air Force Research Laboratory's sponsorship of Julia: A Fresh Approach to Technical Computing and Data Processing


    Article Metrics

    • Downloads (Last 12 months)631
    • Downloads (Last 6 weeks)84


    Cited By

    • (2024) Bistable Aerial Transformer: A Quadrotor Fixed-Wing Hybrid That Morphs Dynamically Via Passive Soft Mechanism. Journal of Mechanisms and Robotics 16, 7. DOI: 10.1115/1.4065159. Published 24 Apr 2024.
    • (2024) Learning to Fly in Seconds. IEEE Robotics and Automation Letters 9, 7, 6336-6343. DOI: 10.1109/LRA.2024.3396025. Published Jul 2024.
    • (2024) Towards Jumping Skill Learning by Target-guided Policy Optimization for Quadruped Robots. Machine Intelligence Research. DOI: 10.1007/s11633-023-1429-5. Published 22 Feb 2024.
    • (2024) Deep learning in computational mechanics: a review. Computational Mechanics. DOI: 10.1007/s00466-023-02434-4. Published 13 Jan 2024.
    • (2023) Uniaxial attitude control of uncrewed aerial vehicle with thrust vectoring under model variations by deep reinforcement learning and domain randomization. ROBOMECH Journal 10, 1. DOI: 10.1186/s40648-023-00260-0. Published 7 Aug 2023.
    • (2023) ReShader: View-Dependent Highlights for Single Image View-Synthesis. ACM Transactions on Graphics 42, 6, 1-9. DOI: 10.1145/3618393. Published 5 Dec 2023.
    • (2023) A Velocity Controller for Quadrotors Based on Reinforcement Learning. Artificial Intelligence, 411-421. DOI: 10.1007/978-981-99-9119-8_37. Published 22 Jul 2023.
    • (2023) Nezha-IV: A hybrid aerial underwater vehicle in real ocean environments. Journal of Field Robotics 41, 2, 420-442. DOI: 10.1002/rob.22274. Published 27 Nov 2023.
    • (2022) Closed-loop control of direct ink writing via reinforcement learning. ACM Transactions on Graphics 41, 4, 1-10. DOI: 10.1145/3528223.3530144. Published 22 Jul 2022.
    • (2022) Automatic Co-Design of Aerial Robots Using a Graph Grammar. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 11260-11267. DOI: 10.1109/IROS47612.2022.9982013. Published 23 Oct 2022.
