
On Scene Engineering and Domain Randomization: Synthetic Data for Industrial Item Picking

  • Conference paper
  • In: Intelligent Autonomous Systems 17 (IAS 2022)
  • Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 577)

Abstract

Synthetic data for training deep neural networks is increasingly used in computer vision. Different strategies, such as domain randomization or domain adaptation, exist to bridge the gap between synthetic training data and real applications. Despite recent progress in this area, one question remains: How much adjustment to reality is required, and which degree of randomization is useful, for transferring precise object detectors to real use cases? In this paper, we present a detailed study with more than 100 datasets and 2,700 trained convolutional neural networks (CNNs), comparing the influence of different degrees of manual optimization (scene engineering) and domain randomization techniques. To distinguish precision from robustness, the trained object detectors are evaluated on different domain shifts with respect to scene environment and object appearance. Using the example of robot-based industrial item picking, we show that scene context and structure, as well as realistic textures, are crucial for simulation-to-reality transfer. Combining these with well-chosen randomization parameters, especially lighting and distractor objects, improves the robustness of the CNNs under larger domain shifts.
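The randomization factors the abstract names (lighting, distractor objects, textures) are typically sampled per synthetic render before handing the scene to a renderer. The sketch below illustrates that sampling step only; all parameter names and value ranges are illustrative assumptions, not the paper's actual configuration.

```python
import random


def sample_randomization_params(seed=None):
    """Sample one set of domain-randomization parameters for a single
    synthetic render. Parameter names and ranges are hypothetical,
    chosen to mirror the factors varied in the study (lighting,
    distractors, textures, camera pose)."""
    rng = random.Random(seed)
    return {
        # lighting: intensity and colour-temperature jitter
        "light_intensity": rng.uniform(0.3, 1.5),
        "light_temperature_k": rng.uniform(3000.0, 7500.0),
        # number of distractor objects added to the picking scene
        "num_distractors": rng.randint(0, 8),
        # realistic item textures vs. fully randomized ones
        "texture_mode": rng.choice(["realistic", "random"]),
        # camera pose jitter around the nominal top-down view (degrees)
        "camera_jitter_deg": rng.uniform(-10.0, 10.0),
    }


params = sample_randomization_params(seed=42)
```

In a full pipeline, one such parameter set would be drawn per image and passed to the scene generator, so that every training sample sees a different lighting setup, distractor layout, and texture assignment.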



Author information

Correspondence to Moritz Weisenböhler.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Weisenböhler, M., Hein, B., Wurll, C. (2023). On Scene Engineering and Domain Randomization: Synthetic Data for Industrial Item Picking. In: Petrovic, I., Menegatti, E., Marković, I. (eds) Intelligent Autonomous Systems 17. IAS 2022. Lecture Notes in Networks and Systems, vol 577. Springer, Cham. https://doi.org/10.1007/978-3-031-22216-0_43
