Abstract
Recent advances in deep learning make it possible to train complex input-output mappings, provided that a high-quality training set is available. In this paper, we show how to extend an existing dataset of hand depth maps annotated with the corresponding 3D hand poses by fitting a 3D hand model to smart glove-based annotations and generating new hand views. We make our code and the generated data available. Based on this procedure and our previous results, we suggest a pipeline for creating high-quality data.
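The idea of generating new hand views from annotated 3D poses can be illustrated with a small geometric sketch. It is not the paper's method, which fits a full 3D hand model and renders it. As a simplified stand-in, it only rotates annotated 3D joint positions about a vertical axis through the hand centroid and projects them with a pinhole camera. The joint coordinates, the camera intrinsics (`fx`, `fy`, `u0`, `v0`), and all function names are hypothetical and chosen for illustration only.

```python
import math


def centroid(points):
    """Mean of a list of 3D points given as (x, y, z) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def rotate_about_y(points, angle_rad, center):
    """Rotate 3D points about a vertical (y) axis passing through `center`."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    cx, _, cz = center
    rotated = []
    for x, y, z in points:
        dx, dz = x - cx, z - cz
        rotated.append((cx + c * dx + s * dz, y, cz - s * dx + c * dz))
    return rotated


def project(points, fx, fy, u0, v0):
    """Pinhole projection of camera-space points to (u, v, depth)."""
    return [(fx * x / z + u0, fy * y / z + v0, z) for x, y, z in points]


# Hypothetical joint annotations in millimetres, camera looking along +z.
joints = [(0.0, 0.0, 500.0), (30.0, -40.0, 510.0), (-20.0, -60.0, 495.0)]

# Simulate a new viewpoint by turning the hand 20 degrees around its centroid.
new_view = rotate_about_y(joints, math.radians(20.0), centroid(joints))

# Assumed intrinsics; real values depend on the depth sensor used.
pixels = project(new_view, fx=588.0, fy=587.0, u0=320.0, v0=240.0)
```

Rotating about the centroid rather than the camera origin keeps the hand at roughly the same distance from the sensor, so the new annotations stay within the depth range of the original recording.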
Acknowledgments
This work was supported by the EIT Digital grant (Grant No. 16257).
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Bellon, R. et al. (2016). Model Based Augmentation and Testing of an Annotated Hand Pose Dataset. In: Friedrich, G., Helmert, M., Wotawa, F. (eds) KI 2016: Advances in Artificial Intelligence. KI 2016. Lecture Notes in Computer Science, vol 9904. Springer, Cham. https://doi.org/10.1007/978-3-319-46073-4_2
DOI: https://doi.org/10.1007/978-3-319-46073-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46072-7
Online ISBN: 978-3-319-46073-4
eBook Packages: Computer Science (R0)