A multiscale video content organization scheme is proposed for fast browsing and efficient transmission of video sequences. The scheme organizes a sequence as a five-layer tree. The root node at layer 0 is connected to all nodes of layer 1, each of which corresponds to a class of shots. Every class of shots is expanded at layer 2, whose nodes represent individual shots. At the next resolution level (layer 3), nodes represent the key-frames of the shots. Finally, layer 4 is the full-resolution level, where nodes correspond to the frames of the sequence. Each node contains a viewing element, and we focus on the extraction of these elements for layers 1, 2 and 3. Viewing elements of layers 1 and 3 are optimally extracted by minimizing a cross-correlation criterion. Viewing elements of layer 2 are selected according to a correlation measure between the mean vector of a shot and each of the frames within that shot. The resulting tree structure enables a user to quickly and easily locate content of interest by selecting the viewing element of his or her choice. Experimental results on real-life video sequences indicate the promising performance of the proposed scheme.
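The layer-2 selection and the upper layers of the tree can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: each frame is described by a numeric feature vector, the correlation measure is the Pearson correlation coefficient, and the viewing element of a shot is the frame most correlated with the shot's mean vector. The names `Node`, `correlation`, `representative_frame` and `build_tree` are hypothetical. The cross-correlation minimization used for layers 1 and 3 is not sketched, so class nodes are left without a viewing element, and layers 3 and 4 are omitted.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


@dataclass
class Node:
    """One tree node; viewing_element is a frame index (assumed representation)."""
    label: str
    viewing_element: Optional[int] = None
    children: List["Node"] = field(default_factory=list)


def mean_vector(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of equally sized feature vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def correlation(a: Vector, b: Vector) -> float:
    """Pearson correlation coefficient; 0.0 if either vector is constant."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def representative_frame(shot: Sequence[Vector]) -> int:
    """Index of the frame most correlated with the shot's mean vector.

    Picking the maximum is an assumption; the abstract only says that a
    correlation measure against the mean vector is used.
    """
    shot_mean = mean_vector(shot)
    return max(range(len(shot)), key=lambda i: correlation(shot[i], shot_mean))


def build_tree(classes: Dict[str, Dict[str, Sequence[Vector]]]) -> Node:
    """Build layers 0 to 2: root -> classes of shots -> shots."""
    root = Node("root")
    for class_name, shots in classes.items():
        class_node = Node(class_name)  # layer-1 viewing element not sketched
        for shot_name, frames in shots.items():
            class_node.children.append(
                Node(shot_name, representative_frame(frames))
            )
        root.children.append(class_node)
    return root
```

For example, `build_tree({"class A": {"shot 1": frames}})` returns a root node whose single child is the class node, which in turn holds one shot node carrying the index of its representative frame. The shot classification and the frame feature vectors are taken as given inputs here.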