
Adapting Deep Visuomotor Representations with Weak Pairwise Constraints

Chapter in Algorithmic Foundations of Robotics XII

Abstract

Real-world robotics problems often occur in domains that differ significantly from the robot's prior training environment. For many robotic control tasks, real-world experience is expensive to obtain, but data is easy to collect in either an instrumented environment or in simulation. We propose a novel domain adaptation approach for robot perception that adapts visual representations learned on a large, easy-to-obtain source dataset (e.g., synthetic images) to a target real-world domain, without requiring expensive manual annotation of real-world data before policy search. Supervised domain adaptation methods minimize cross-domain differences using pairs of aligned images that contain the same object or scene in both the source and target domains, thus learning a domain-invariant representation; however, they require manual alignment of such image pairs. Fully unsupervised adaptation methods instead minimize the discrepancy between the feature distributions across domains. We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains. Focusing on adaptation from simulation to real-world data on a PR2 robot, we evaluate our approach on a manipulation task and show that, by using weakly paired images, our method compensates for domain shift more effectively than previous techniques, enabling better robot performance in the real world.
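The chapter's exact objectives are not reproduced on this page, but the combination the abstract describes — a pairwise alignment term over weakly aligned source/target image pairs plus a distribution-alignment term — can be sketched as a loss over pre-extracted feature vectors. The following minimal pure-Python sketch is illustrative only: the function names are assumptions, and the linear-kernel mean-matching discrepancy is a crude stand-in for the richer distribution losses used in this line of work.

```python
def pairwise_alignment_loss(f_src, f_tgt):
    """Mean squared distance between features of weakly aligned pairs:
    row i of f_src is assumed to show roughly the same scene as row i of f_tgt."""
    n = len(f_src)
    return sum(sum((a - b) ** 2 for a, b in zip(s, t))
               for s, t in zip(f_src, f_tgt)) / n

def distribution_loss(f_src, f_tgt):
    """Squared distance between the two domains' mean feature vectors
    (a linear-kernel MMD; the simplest possible distribution-alignment term)."""
    d = len(f_src[0])
    mu_s = [sum(f[j] for f in f_src) / len(f_src) for j in range(d)]
    mu_t = [sum(f[j] for f in f_tgt) / len(f_tgt) for j in range(d)]
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

def combined_adaptation_loss(f_src, f_tgt, lam=1.0):
    """Weighted combination of pairwise and distribution alignment."""
    return pairwise_alignment_loss(f_src, f_tgt) + lam * distribution_loss(f_src, f_tgt)

# Tiny demo with 2-D toy features: perfectly aligned pairs give zero loss.
src = [[1.0, 2.0], [3.0, 4.0]]
print(combined_adaptation_loss(src, src))  # -> 0.0
```

In practice such a loss would be applied to deep network features and minimized by backpropagation alongside the task loss; the point of the weak pairing is that the pairwise term tolerates imperfect correspondences, so no manual image alignment is needed.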

Eric Tzeng and Coline Devin contributed equally.



Author information

Correspondence to Pieter Abbeel.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Tzeng, E. et al. (2020). Adapting Deep Visuomotor Representations with Weak Pairwise Constraints. In: Goldberg, K., Abbeel, P., Bekris, K., Miller, L. (eds) Algorithmic Foundations of Robotics XII. Springer Proceedings in Advanced Robotics, vol 13. Springer, Cham. https://doi.org/10.1007/978-3-030-43089-4_44
