This paper provides an overview of the tasks submitted to TRECVID 2013 by ITI-CERTH. ITI-CERTH participated in the Semantic Indexing (SIN), the Event Detection in Internet Multimedia (MED), the Multimedia Event Recounting (MER) and the Instance Search (INS) tasks. In the SIN task, techniques are developed that combine new video representations (video tomographs) with existing well-performing descriptors such as SIFT, Bag-of-Words for shot representation, ensemble construction techniques and a multi-label learning method for score refinement. In the MED task, an efficient method that uses only static visual features as well as limited audio information is evaluated. In the MER sub-task of MED, a discriminant analysis-based feature selection method is combined with a model vector approach for selecting the key semantic entities depicted in the video that best describe the detected event. Finally, the INS task is performed by employing VERGE, an interactive retrieval application that combines retrieval functionalities in various modalities and was previously used to support the Known Item Search (KIS) task.
This paper provides an overview of the tasks submitted to TRECVID 2011 by ITI-CERTH. ITI-CERTH participated in the Known-item search (KIS) as well as in the Semantic Indexing (SIN) and the Event Detection in Internet Multimedia (MED) tasks. In the SIN task, techniques are developed that combine motion information with existing well-performing descriptors such as SURF, Random Forests and Bag-of-Words for shot representation. In the MED task, the trained concept detectors of the SIN task are used to represent video sources with model vector sequences, then a dimensionality reduction method is used to derive a discriminant subspace for recognizing events, and, finally, SVM-based event classifiers are used to detect the underlying video events. The KIS task is performed by employing VERGE, which is an interactive retrieval application combining retrieval functionalities in various modalities and exploiting implicit user feedback.
In this paper, a multi-modal method for human identification is proposed that exploits the discriminant features derived from several movement types performed by the same person. Utilizing an algorithm based on fuzzy vector quantization (FVQ) and linear discriminant analysis (LDA), an unknown movement is first classified, and then the person performing the movement is recognized from a movement-specific person recognition ...
IEEE Transactions on Neural Networks and Learning Systems, 2013
In this paper, a theoretical link between mixture subclass discriminant analysis (MSDA) and a restricted Gaussian model is first presented. Then, two further discriminant analysis (DA) methods, i.e., fractional step MSDA (FSMSDA) and kernel MSDA (KMSDA), are proposed. Linking MSDA to an appropriate Gaussian model allows the derivation of a new DA method under the expectation maximization (EM) framework (EM-MSDA), which simultaneously derives the discriminant subspace and the maximum likelihood estimates. The two other proposed methods generalize MSDA in order to solve problems inherited from conventional DA. FSMSDA solves the subclass separation problem, that is, the situation in which the dimensionality of the discriminant subspace is strictly smaller than the rank of the inter-between-subclass scatter matrix. This is done by an appropriate weighting scheme and the utilization of an iterative algorithm for preserving useful discriminant directions. On the other hand, KMSDA uses the ...
2009 16th IEEE International Conference on Image Processing (ICIP), 2009
... From this database we used low resolution videos (180 × 144 pixels resolution at 25 fps), depicting nine persons, namely, Daria (dar), Denis (den), Ido (ido), Ira (ira), Lena (len), Lyova (lyo), Moshe (mos), Shahar (sha), performing seven movements, i.e., walk (wk), run (rn), skip ...
Papers by Nikolaos Gkalelis