
Adapting Deep Visuomotor Representations with Weak Pairwise Constraints

Chapter in Algorithmic Foundations of Robotics XII

Abstract

Real-world robotics problems often occur in domains that differ significantly from the robot's prior training environment. For many robotic control tasks, real-world experience is expensive to obtain, but data is easy to collect in either an instrumented environment or in simulation. We propose a novel domain adaptation approach for robot perception that adapts visual representations learned on a large, easy-to-obtain source dataset (e.g., synthetic images) to a target real-world domain, without requiring expensive manual annotation of real-world data before policy search. Supervised domain adaptation methods minimize cross-domain differences using pairs of aligned images that contain the same object or scene in both the source and target domains, thus learning a domain-invariant representation; however, they require manual alignment of such image pairs. Fully unsupervised adaptation methods instead minimize the discrepancy between the feature distributions across domains. We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains. Focusing on adaptation from simulation to real-world data on a PR2 robot, we evaluate our approach on a manipulation task and show that, by using weakly paired images, our method compensates for domain shift more effectively than previous techniques, enabling better robot performance in the real world.
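The chapter's exact objectives are not reproduced on this page, but the combination the abstract describes — a pairwise alignment term over weakly aligned source/target image pairs plus a distribution-alignment term — can be sketched as a loss over pre-extracted feature vectors. The following minimal pure-Python sketch is illustrative only: the function names are assumptions, and the linear-kernel mean-matching discrepancy is a crude stand-in for the richer distribution losses used in this line of work.

```python
def pairwise_alignment_loss(f_src, f_tgt):
    """Mean squared distance between features of weakly aligned pairs:
    row i of f_src is assumed to show roughly the same scene as row i of f_tgt."""
    n = len(f_src)
    return sum(sum((a - b) ** 2 for a, b in zip(s, t))
               for s, t in zip(f_src, f_tgt)) / n

def distribution_loss(f_src, f_tgt):
    """Squared distance between the two domains' mean feature vectors
    (a linear-kernel MMD; the simplest possible distribution-alignment term)."""
    d = len(f_src[0])
    mu_s = [sum(f[j] for f in f_src) / len(f_src) for j in range(d)]
    mu_t = [sum(f[j] for f in f_tgt) / len(f_tgt) for j in range(d)]
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))

def combined_adaptation_loss(f_src, f_tgt, lam=1.0):
    """Weighted combination of pairwise and distribution alignment."""
    return pairwise_alignment_loss(f_src, f_tgt) + lam * distribution_loss(f_src, f_tgt)

# Tiny demo with 2-D toy features: perfectly aligned pairs give zero loss.
src = [[1.0, 2.0], [3.0, 4.0]]
print(combined_adaptation_loss(src, src))  # -> 0.0
```

In practice such a loss would be applied to deep network features and minimized by backpropagation alongside the task loss; the point of the weak pairing is that the pairwise term tolerates imperfect correspondences, so no manual image alignment is needed.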

Eric Tzeng and Coline Devin contributed equally.



Author information

Correspondence to Pieter Abbeel.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Tzeng, E. et al. (2020). Adapting Deep Visuomotor Representations with Weak Pairwise Constraints. In: Goldberg, K., Abbeel, P., Bekris, K., Miller, L. (eds) Algorithmic Foundations of Robotics XII. Springer Proceedings in Advanced Robotics, vol 13. Springer, Cham. https://doi.org/10.1007/978-3-030-43089-4_44
