Abstract
Synthetic data is increasingly used to train deep neural networks in computer vision. Different strategies, such as domain randomization and domain adaptation, exist to bridge the gap between synthetic training data and the real application. Despite recent progress in this area, a central question remains: how much adjustment to reality is required, and which degree of randomization is useful, for transferring precise object detectors to real use cases? In this paper, we present a detailed study with more than 100 datasets and 2,700 trained convolutional neural networks (CNNs), comparing the influence of different degrees of manual optimization (scene engineering) and domain randomization techniques. To distinguish precision from robustness, the trained object detectors are evaluated under different domain shifts with respect to scene environment and object appearance. Using the example of robot-based industrial item picking, we show that scene context and structure, as well as realistic textures, are crucial for simulation-to-reality transfer. Combining these with well-chosen randomization parameters, especially lighting and distractor objects, improves the robustness of the CNNs under larger domain shifts.
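To make the idea of "well-chosen randomization parameters" concrete, the following is a minimal, hypothetical sketch (not the authors' pipeline) of how per-image domain randomization parameters might be sampled for a synthetic picking scene. The parameter names, ranges, and the `sample_randomization` helper are illustrative assumptions; the paper identifies lighting and distractor objects as especially influential, so those appear alongside texture choice and camera jitter.

```python
import random

def sample_randomization(rng, max_distractors=8):
    """Draw one illustrative set of scene randomization parameters
    for rendering a single synthetic training image."""
    return {
        # Lighting: intensity and color temperature varied per image.
        "light_intensity": rng.uniform(0.3, 3.0),
        "light_color_temp_k": rng.uniform(3000, 8000),
        # Distractors: number of task-irrelevant objects added to the bin.
        "num_distractors": rng.randint(0, max_distractors),
        # Object appearance: realistic scanned texture vs. random flat color.
        "texture": rng.choice(["realistic_scan", "random_color"]),
        # Small camera pose jitter around the nominal picking viewpoint.
        "camera_jitter_deg": rng.uniform(-5.0, 5.0),
    }

# Sample parameters for a batch of 100 synthetic images.
rng = random.Random(42)
params = [sample_randomization(rng) for _ in range(100)]
```

In a real pipeline, each parameter set would be passed to a renderer; the study's "scene engineering" side would instead fix context, structure, and textures to match the target environment rather than randomizing them.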
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Weisenböhler, M., Hein, B., Wurll, C. (2023). On Scene Engineering and Domain Randomization: Synthetic Data for Industrial Item Picking. In: Petrovic, I., Menegatti, E., Marković, I. (eds) Intelligent Autonomous Systems 17. IAS 2022. Lecture Notes in Networks and Systems, vol 577. Springer, Cham. https://doi.org/10.1007/978-3-031-22216-0_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22215-3
Online ISBN: 978-3-031-22216-0
eBook Packages: Intelligent Technologies and Robotics