Automatic recognition of dance gestures is an important research area in computer vision with many potential applications. Bali traditional dance comprises many dance gestures that have remained relatively unchanged over the years. Although previous studies have reported various methods for recognizing gestures, to the best of our knowledge, no method for modeling and classifying the gestures of Bali traditional dance has been reported in the literature. The aim of this paper is to build a robust recognizer, based on a linguistically motivated method, for the dance gestures of Bali traditional dance choreography. The empirical results showed that probabilistic grammar-based classifiers induced using the Alergia algorithm with the Symbolic Aggregate Approximation (SAX) discretization method achieved an average precision of 92% in recognizing a predefined set of dance gestures. The study also showed that the most discriminative features for representing Bali traditional dance gestures are skeleton joint features of...
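As a rough illustration of the SAX discretization step named in the abstract above, the following Python sketch converts a one-dimensional joint-feature series into a short symbolic word. It follows the textbook SAX recipe (z-normalization, piecewise aggregate approximation, equiprobable Gaussian breakpoints). The function name `sax`, the segment count, and the alphabet are assumptions chosen for illustration and are not taken from the paper.

```python
import math
from statistics import NormalDist


def sax(series, n_segments, alphabet="abcd"):
    """Discretize a 1-D series into a SAX word (illustrative sketch).

    Assumes len(series) >= n_segments so that every segment is non-empty.
    """
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    # z-normalize; a constant series is mapped to all zeros
    z = [(x - mean) / std if std > 0 else 0.0 for x in series]

    # Piecewise Aggregate Approximation: mean of each contiguous segment
    paa = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        segment = z[start:end]
        paa.append(sum(segment) / len(segment))

    # Breakpoints that split the standard normal into equiprobable regions
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Each PAA value becomes the symbol of the region it falls into
    word = ""
    for value in paa:
        index = sum(1 for b in breakpoints if value > b)
        word += alphabet[index]
    return word


# Hypothetical joint-angle series that rises steadily over eight frames
print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4))  # prints "abcd"
```

The resulting strings are the kind of input a grammar induction algorithm such as Alergia could consume; the induction step itself is not shown here.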
This paper presents a linguistically motivated approach to dance gesture performance evaluation that uses skeleton tracking to robustly classify an arbitrary dance gesture into one of a set of predefined gesture classes and to provide a performance score relative to the dance master’s gesture. The gesture classes in this study are a set of common gestures of Bali traditional dances. Each dance gesture is represented as a set of skeleton feature descriptors extracted from images captured using a Kinect depth sensor. A set of rules is learned from the training examples, using a grammar inference method, to capture the structure of the gesture motion. The empirical results showed that the elbow and foot of the dance performer are the most discriminative features for representing the gestures of Bali traditional dance. Probabilistic and deterministic grammars achieved average precision of 0.92 and 0.95, respectively, in recognizing the tested dance gestures.
This paper presents a simple and computationally efficient framework for 3D basic dance motion recognition based on syntactic pattern recognition. Intuitively, the cluster center of a class of basic dance motions can be represented as a stochastic regular grammar (SRG) that is built during training using only data from that class. During testing, each test sample of unknown class is passed to the grammar inference stage to compute the probability that it is accepted by the learned grammar. Because the multidimensional data formed by angular skeleton joints are compacted into a one-dimensional string of labels for grammar inference, recognition is considerably faster than with a statistical pattern classifier such as k-nearest neighbor (kNN). A single test using the learned grammar takes only about 5 ms on average, compared with around 20 s using kNN, while the overhead of building all grammars is only about 3.4 s. This compacting process, however, causes information loss, observed as slightly degraded recognition performance for low-articulation motions and considerably larger degradation for high-articulation dance motions. To overcome this, we investigate several feature selection methods, namely Sequential Feature Selection (SFS), Principal Component Analysis (PCA), and Heuristic Sequential Feature Selection (HSFS), and compare them with using the whole feature set. Based on our experiments, HSFS is the most suitable feature selection method for this problem.
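The classification idea described above, one stochastic regular grammar per class with the label string assigned to the grammar that gives it the highest probability, can be sketched in Python as follows. The automata here are small hand-written examples, not grammars learned from data, and the class names, symbols, and probabilities are all hypothetical.

```python
def string_probability(automaton, string):
    """Probability that a deterministic probabilistic automaton generates `string`.

    `automaton` holds a start state, a map (state, symbol) -> (next_state, prob),
    and a map state -> probability of stopping in that state.
    """
    state = automaton["start"]
    prob = 1.0
    for symbol in string:
        edge = automaton["transitions"].get((state, symbol))
        if edge is None:
            return 0.0  # the grammar cannot generate this string
        state, p = edge
        prob *= p
    return prob * automaton["final"].get(state, 0.0)


def classify(grammars, string):
    """Return the class whose grammar gives `string` the highest probability."""
    scores = {label: string_probability(g, string) for label, g in grammars.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical two-class example over the label alphabet {a, b}
grammars = {
    "raise_arm": {  # strings of the form a* b+
        "start": 0,
        "transitions": {(0, "a"): (0, 0.5), (0, "b"): (1, 0.4), (1, "b"): (1, 0.6)},
        "final": {0: 0.1, 1: 0.4},
    },
    "lower_arm": {  # strings of the form b* a+
        "start": 0,
        "transitions": {(0, "b"): (0, 0.5), (0, "a"): (1, 0.4), (1, "a"): (1, 0.6)},
        "final": {0: 0.1, 1: 0.4},
    },
}

label, scores = classify(grammars, "aab")
print(label)  # prints "raise_arm"
```

Evaluating a string costs one table lookup per symbol, which is consistent with the abstract's observation that testing against a learned grammar is fast once the features have been reduced to a label string.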
Bali traditional dance has gained an international reputation thanks to its highly articulated body-part motions, fascinating eye movements, facial expressions, and colorful costumes. Although the motions are viewed as the main aesthetic factor, automatic recognition and verification of their kinesthetic elements by computer is a challenging problem. Numerous studies have been conducted on dance recognition from kinesthetic elements; however, to the best of our knowledge, little is known about automatic annotation, clustering, recognition, and verification of Bali traditional dance elements. This paper presents a skeleton descriptor based on dynamic time warping that enables similarity measurement between two dance sequences that may vary in time and speed. Our experiments showed that a combination of a set of time-series descriptors and the exponential data time warping distance achieved the highest clustering performance among the tested combinations.
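To illustrate how dynamic time warping compares two sequences that differ in timing, here is a minimal Python sketch of the classic DTW distance on one-dimensional series. It uses the absolute difference as the local cost, which is an assumption made for illustration; it is not the exponential variant or the skeleton descriptor evaluated in the paper.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# The second sequence holds the middle value one frame longer,
# yet the warped distance is still zero.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # prints 0.0
```

A pairwise distance matrix computed this way could then be passed to a clustering algorithm; for multi-joint skeleton data the local cost would be replaced by a distance between per-frame feature vectors.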
This paper presents a unified framework for recognizing and scoring dance motion using a two-layer classifier, so that computational complexity is distributed across two layers. This research examines the performance of a sliding window, a hidden Markov model (HMM), and a conditional random field (CRF) as the first-layer classifier, which segments the input video into a sequence of motion primitive labels. The second-layer classifier is a stochastic error-correcting context-free grammar, built from dance master knowledge, that parses the sequence of labels, builds a parse tree, and computes the accumulated dance score. The dataset for this research was captured using one Kinect camera. The training dataset comprises 212 samples of 12 motion primitives and seven videos of Pendet dance performances. In 5-fold cross-validation, the accuracies of the sliding window, HMM, and CRF are 0.63, 0.79, and 0.86, respectively. This result shows that CRF achieves higher performance as a dance motion primitive recognizer than the HMM proposed by [1]. The CRF model achieves an accuracy of 0.88 when the motion features are the angular coordinates of all skeleton joints, as proposed by [2], which increases to 0.93 when only upper-body joint coordinates are used. A stochastic error-correcting context-free grammar is chosen as the dance choreography model. An experiment using synthetic label sequences with cost factor ci=1 and up to 50 percent erroneous labels shows that the grammar can tolerate input label sequence errors of up to 25 percent. The experiment using Pendet dance performances shows that the average dance score is 79.3.
The low dance score is due to several factors, including dance skill variation, unstable repetition of basic gestures, the high cost incurred when deletions and substitutions of local errors are replaced by insertion operations, duration variation due to the absence of timing guidelines for body-part motions, and a training dataset too limited to capture the possible variations of basic gestures. Index Terms: dance motion recognition and scoring.
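The error-correcting idea above relies on insertion, deletion, and substitution operations, each with a cost. As a simplified sketch, the Python function below computes the minimal total cost of turning an observed label sequence into a single reference sequence. This is a deliberate simplification: the paper parses against a stochastic context-free grammar rather than one fixed string, and the label names, function name, and unit costs here are hypothetical. No mapping from cost to dance score is attempted, since the abstract does not give one.

```python
def edit_cost(observed, reference, c_ins=1, c_del=1, c_sub=1):
    """Minimal cost of editing `observed` into `reference` (weighted edit distance)."""
    n, m = len(observed), len(reference)
    # dp[i][j] = minimal cost of turning observed[:i] into reference[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * c_del  # delete every observed label
    for j in range(1, m + 1):
        dp[0][j] = j * c_ins  # insert every reference label
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if observed[i - 1] == reference[j - 1] else c_sub
            dp[i][j] = min(
                dp[i - 1][j] + c_del,    # drop a spurious observed label
                dp[i][j - 1] + c_ins,    # insert a missing reference label
                dp[i - 1][j - 1] + sub,  # keep or substitute a label
            )
    return dp[n][m]


# Hypothetical motion primitive labels: the performer skipped "P2"
reference = ["P1", "P2", "P3"]
observed = ["P1", "P3"]
print(edit_cost(observed, reference))  # prints 1
```

With all costs set to 1, as in the synthetic experiment described above, the result is the ordinary Levenshtein distance between the two label sequences.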
Papers by Yaya Heryadi