Abstract
We introduce an efficient and practical integrated system for human body model personalization with articulation. Starting with a 3D personalized model of the individual in a standard pose, the model is updated to accommodate changes in articulation captured in a video clip with \(N\) frames. The personalized model is segmented into parts using anthropometric control points on the silhouette boundary of the frontal projection of the 3D model. These control points are the endpoints of 2D segments that correspond to the projections of independently moving parts in 3D. They can either be selected manually or predicted by a pre-trained convolutional neural network (NN) point model. Model evolution consists of finding a set of 3D transformations, applied independently to the parts of the 3D model, so that the projections of the 3D model ‘match’ those observed in the video sequence at the corresponding frames. This is done by minimizing, for each independently moving part, the error between the frontally projected body region points and the target region points in the image. Our articulation recovery method achieves a sub-resolution average vertex error (about 4.77 mm, compared to 17.82 mm, the resolution cell of the body model). This is a substantial improvement over the SMPL NN approach, which under the same error metric yields an error of about 10 times the resolution cell. The virtually reconstructed articulated 3D model is fitted with a 3D garment model to create a virtual fitting room that allows an individual to virtually assess how well the garment fits.
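The per-part fitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an orthographic frontal projection (depth is dropped), known point correspondences between the projected part and the target region, and a rigid motion restricted to the image plane. The least-squares fit uses the standard Kabsch (SVD) solution, and all function names here are hypothetical.

```python
import numpy as np


def project_frontal(points_3d):
    """Orthographic frontal projection (assumption): keep (x, y), drop depth z."""
    return points_3d[:, :2]


def fit_inplane_rigid(source_2d, target_2d):
    """Least-squares rotation R and translation t with target ~ R @ source + t.

    Uses the Kabsch (SVD) solution and assumes row i of source_2d
    corresponds to row i of target_2d.
    """
    src_mean = source_2d.mean(axis=0)
    tgt_mean = target_2d.mean(axis=0)
    src_c = source_2d - src_mean
    tgt_c = target_2d - tgt_mean
    H = src_c.T @ tgt_c                      # 2x2 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against a reflection
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t


def apply_to_part(points_3d, R, t):
    """Lift the 2D transform to 3D: rotate about the depth axis, keep z unchanged."""
    R3 = np.eye(3)
    R3[:2, :2] = R
    t3 = np.array([t[0], t[1], 0.0])
    return points_3d @ R3.T + t3


def mean_vertex_error(a, b):
    """Average Euclidean distance between corresponding vertices."""
    return float(np.linalg.norm(a - b, axis=1).mean())
```

In use, each body part would be fitted separately per frame: project the part's points with `project_frontal`, estimate `R` and `t` against that part's target region points, and update the 3D part with `apply_to_part`. A full system would also have to handle out-of-plane rotation and keep adjacent parts connected at the joints, which this sketch does not attempt.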
Availability of data and material
Not Applicable.
Code availability
Not Applicable.
Funding
Not Applicable.
Ethics declarations
Conflicts of interest/Competing interests
Not Applicable.
About this article
Cite this article
Li, C., Cohen, F. Virtual reconstruction of 3D articulated human shapes applied to garment try-on in a virtual fitting room. Multimed Tools Appl 81, 11071–11085 (2022). https://doi.org/10.1007/s11042-021-11398-7
DOI: https://doi.org/10.1007/s11042-021-11398-7