Abstract
Some 3D computer vision techniques, such as structure from motion (SFM) and augmented reality (AR), depend on a perspective-n-point (PnP) algorithm to estimate the absolute camera pose. However, existing PnP algorithms struggle to balance accuracy and efficiency, and most of them do not make full use of internal camera information such as the focal length. To address these drawbacks, we propose a fast and robust PnP (FRPnP) method to calculate the absolute camera pose for 3D computer vision. In the proposed FRPnP method, we first formulate the PnP problem as an optimization problem in the null space, which avoids the effects of the depth of each 3D point. Second, we obtain the solution directly using singular value decomposition. Finally, an accurate camera pose is obtained by an optimization strategy. We evaluate the proposed FRPnP algorithm in four ways: on synthetic data, on real images, and by applying it in AR and SFM systems. Experimental results show that the proposed FRPnP method obtains the best balance between computational cost and precision, and clearly outperforms the state-of-the-art PnP methods.
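The null-space idea mentioned in the abstract can be illustrated with a generic linear PnP solve. The sketch below is not the FRPnP method itself; it is the classic DLT-style formulation, assuming a known focal length, at least six non-coplanar points, and noise-free or low-noise correspondences. The function name `pnp_null_space` and its parameters are hypothetical and chosen only for this illustration. Each correspondence contributes two linear equations from which the point depth has been eliminated, and the pose is read off the right singular vector with the smallest singular value.

```python
import numpy as np


def pnp_null_space(points_3d, pixels, focal, principal_point=(0.0, 0.0)):
    """Linear (DLT-style) pose estimate with known focal length.

    Illustrative sketch only: assumes >= 6 non-coplanar points.
    Returns (R, t) such that X_cam = R @ X_world + t.
    """
    X = np.asarray(points_3d, dtype=float)
    # Known intrinsics let us work in normalized image coordinates.
    uv = (np.asarray(pixels, dtype=float) - np.asarray(principal_point)) / focal
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])  # homogeneous 3D points

    # Two depth-free equations per point:
    #   [r1 t1].Xh - u * [r3 t3].Xh = 0
    #   [r2 t2].Xh - v * [r3 t3].Xh = 0
    M = np.zeros((2 * n, 12))
    M[0::2, 0:4] = Xh
    M[0::2, 8:12] = -uv[:, [0]] * Xh
    M[1::2, 4:8] = Xh
    M[1::2, 8:12] = -uv[:, [1]] * Xh

    # The null vector of M is the last right singular vector.
    _, _, Vt = np.linalg.svd(M)
    P = Vt[-1].reshape(3, 4)

    # The null vector has an arbitrary sign; pick the one that puts
    # the points in front of the camera (positive depth).
    if np.sum(Xh @ P[2]) < 0:
        P = -P

    # Project the 3x3 block onto the nearest rotation and undo the scale.
    U, S, Wt = np.linalg.svd(P[:, :3])
    R = U @ Wt
    if np.linalg.det(R) < 0:
        R = U @ np.diag([1.0, 1.0, -1.0]) @ Wt
    t = P[:, 3] / S.mean()
    return R, t
```

In practice, a linear estimate of this kind is typically used as the starting point for a nonlinear refinement of the reprojection error, which corresponds to the final optimization step the abstract mentions.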
Change history
22 May 2024
This article has been retracted. Please see the Retraction Notice for more detail: https://doi.org/10.1007/s00521-024-10032-5
Acknowledgements
This work is supported by grants from the National Science Foundation of China (Nos. 61370167, 61673157, 61402018, and 61305093), the National Key Research and Development Plan under Grant No. 2016YFC0800100, and the Natural Science Foundation of Anhui Province (Nos. KJ2014ZD27, JZ2015AKZR0664, and 1604e0302001). The authors would like to thank the anonymous reviewers for their helpful and constructive comments, which greatly improved the paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Cao, M.W., Jia, W., Zhao, Y. et al. RETRACTED ARTICLE: Fast and robust absolute camera pose estimation with known focal length. Neural Comput & Applic 29, 1383–1398 (2018). https://doi.org/10.1007/s00521-017-3032-6
DOI: https://doi.org/10.1007/s00521-017-3032-6