Department of Signal Theory and Communications: Ph.D. Dissertation
This work discusses the usefulness of hierarchical region-based representations for image
and video processing. Region-based representations provide a first level of abstraction and
reduce the number of elements to process compared with the classical pixel-based
representation. The two representations that have proved useful for region-based processing,
namely region adjacency graphs and trees, are reviewed, and we discuss why tree-based
representations are better suited to our purpose. Trees represent the image hierarchically,
and processing techniques that are both efficient and complex can be applied to them. Two
major issues are discussed in this work: how the hierarchical representation may be created
from a given image, and how the tree may be manipulated or processed.
Two tree-based representations have been developed: the Max-Tree and the Binary Partition
Tree. The Max-Tree structures, in a compact way, the connected components that arise from
all possible level sets of a gray-level image. It is suitable for implementing anti-extensive
connected operators, ranging from classical ones (for instance, the area filter) to new ones
(such as the motion filter developed in this work). The Binary Partition Tree structures the
set of regions obtained during the execution of a region merging algorithm. Developed to
overcome some of the drawbacks of the Max-Tree, in particular the lack of flexibility in the
tree creation and the lack of self-duality of the tree representation, it has proved useful
for a rather large range of applications, as shown in this work.
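As an illustration of the Max-Tree description above, the following minimal sketch builds the
nesting of connected components of upper level sets by brute force. It assumes that the upper
level set at level h is the set of pixels with f(p) >= h and that pixels are 4-connected; the
function names are hypothetical, and this is not the efficient construction algorithm
developed in the thesis.

```python
from collections import deque


def upper_level_components(image, h):
    """Return the 4-connected components of {p : image[p] >= h} as frozensets."""
    rows, cols = len(image), len(image[0])
    seen = set()
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < h or (r, c) in seen:
                continue
            # Flood-fill one component starting from (r, c).
            comp = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                comp.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and (ny, nx) not in seen and image[ny][nx] >= h:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(frozenset(comp))
    return components


def naive_max_tree(image):
    """Return (node_level, parent) for the distinct upper level set components.

    Assumption: a node is a distinct component; its level is the highest
    threshold at which that exact component still appears.
    """
    levels = sorted({v for row in image for v in row})
    node_level = {}
    for h in levels:
        for comp in upper_level_components(image, h):
            node_level[comp] = h  # a higher h overwrites a lower one
    parent = {}
    for comp in node_level:
        # Components are nested or disjoint, so the supersets form a chain.
        supersets = [other for other in node_level if comp < other]
        parent[comp] = min(supersets, key=len) if supersets else None
    return node_level, parent


image = [[1, 1, 1],
         [3, 1, 2],
         [1, 1, 1]]
node_level, parent = naive_max_tree(image)
```

In this toy image the root is the whole image at level 1, and each of the two bright pixels
forms a leaf attached to the root.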
Processing strategies focus on pruning techniques, which remove some of the branches of the
tree based on an analysis algorithm applied to its nodes. Pruning techniques applied to the
Max-Tree lead to anti-extensive operators, whereas self-dual operators are obtained on the
Binary Partition Tree if the tree is created in a self-dual manner. The pruning techniques
developed in this work are directed to the following applications: filtering, segmentation
and content-based image retrieval.
The filtering (in the context of connected operators) and segmentation applications are
based on the same principle: the nodes of the tree are analyzed according to a fixed criterion,
and the decision to remove or preserve a node usually relies on a threshold applied to the
measured criterion. Pruning is then performed according to this decision. As a result, the
image associated with the pruned tree is a filtered or segmented version of the original
image according to the selected criterion. The criteria discussed in this work are based, for
instance, on area, motion, marker and propagation, or a rate-distortion strategy. The lack of
robustness of classical decision approaches for non-increasing criteria is discussed and
addressed by means of an optimization strategy based on the Viterbi algorithm.
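The principle described above (measure a criterion on each node, threshold it, prune) can be
sketched for the simplest case, an area criterion, which is increasing. The `Node` class and
the function names below are hypothetical and only illustrate the idea under that assumption;
the sketch does not cover non-increasing criteria or the Viterbi-based optimization.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    level: int    # gray level assigned to the pixels owned by this node
    pixels: list  # pixels that belong to this node but to none of its children
    children: list = field(default_factory=list)


def area(node):
    """Area criterion: number of pixels in the node and all its descendants."""
    return len(node.pixels) + sum(area(child) for child in node.children)


def collect_pixels(node):
    """Gather every pixel of the subtree rooted at node."""
    out = list(node.pixels)
    for child in node.children:
        out.extend(collect_pixels(child))
    return out


def prune_by_area(node, min_area):
    """Remove, in place, every subtree whose area is below min_area.

    Pixels of a removed subtree are handed to the surviving parent, so they
    take the parent's gray level when the image is rendered.
    """
    kept = []
    for child in node.children:
        if area(child) < min_area:
            node.pixels.extend(collect_pixels(child))
        else:
            prune_by_area(child, min_area)
            kept.append(child)
    node.children = kept
    return node


def render(node, image):
    """Write the gray level of each node onto its pixels."""
    for r, c in node.pixels:
        image[r][c] = node.level
    for child in node.children:
        render(child, image)
    return image


# Tree of the one-row image [1, 3, 1, 2, 2, 1], built by hand for the example.
root = Node(1, [(0, 0), (0, 2), (0, 5)], [
    Node(3, [(0, 1)]),
    Node(2, [(0, 3), (0, 4)]),
])
filtered = render(prune_by_area(root, min_area=2), [[0] * 6])
```

With `min_area=2`, the isolated bright pixel is merged into the background while the
two-pixel plateau is preserved, which is the behavior expected from an area filter.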
Content-based image retrieval is the third application addressed in this work. Hierarchical
region-based representations are particularly well suited for this purpose since they
represent the image, and therefore its regions, at different scales of resolution. We focus
on an image retrieval system that supports low-level queries based on visual descriptors and
spatial relationships. For that purpose, region descriptors are attached to the nodes of the
tree. Two types of queries are discussed: the single region query, in which the query is made
up of one region, and the multiple region query, in which the query is made up of a set of
regions. In the former, visual descriptors are used to perform the retrieval, whereas both
visual descriptors and spatial relationships are used in the latter. Moreover, a relevance
feedback approach is presented to avoid the need to set manually the weights associated with
each descriptor.
An important aspect taken into account throughout this work is the efficient implementation
of the algorithms developed for both the creation and the processing of the tree. For tree
creation, efficiency is obtained mainly through the use of hierarchical queues, whereas in
the processing step the analysis algorithms rely on recursive strategies.
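A hierarchical queue is commonly understood as a set of first-in-first-out queues, one per
priority level (here, one per gray level). The sketch below illustrates that general data
structure; the class and method names are hypothetical, and the exact interface used in the
thesis is not specified in this summary.

```python
from collections import deque


class HierarchicalQueue:
    """One FIFO queue per priority level; higher levels are served first."""

    def __init__(self, num_levels):
        self.queues = [deque() for _ in range(num_levels)]

    def push(self, level, item):
        """Append item to the FIFO queue of the given level."""
        self.queues[level].append(item)

    def is_empty(self, level):
        """Tell whether the queue of the given level holds no item."""
        return not self.queues[level]

    def pop_highest(self):
        """Remove and return (level, item) from the highest non-empty level."""
        for level in range(len(self.queues) - 1, -1, -1):
            if self.queues[level]:
                return level, self.queues[level].popleft()
        raise IndexError("pop from an empty hierarchical queue")


hq = HierarchicalQueue(num_levels=256)
hq.push(2, "a")
hq.push(5, "b")
hq.push(5, "c")
hq.push(0, "d")
order = [hq.pop_highest() for _ in range(4)]
```

Items come out by decreasing level and, within a level, in insertion order. Pushing and
popping within a known level are constant-time operations, which is one reason this structure
is attractive for flooding-style algorithms on gray-level images.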
Summary
Processing strategies are based on pruning techniques. Pruning techniques remove some
branches of the tree based on an analysis algorithm applied to the nodes of the tree.
Pruning techniques applied to the Max-Tree yield anti-extensive operators, whereas for the
Binary Partition Tree self-dual operators are obtained if the tree has been created in a
self-dual manner. The pruning techniques developed in this work are aimed at the following
applications: filtering, segmentation and content-based data retrieval.
The work presented in this document falls within the framework of the project Consultation
Thématique Informelle (CTI96-ME22, October 1996 to December 1999): Automatic characterization
of the semantic content of video sequences applied to indexing and information retrieval.
It was funded by France Telecom CNET (now France Telecom R&D), with the participation of the
Center of Mathematical Morphology (Fontainebleau), the Polytechnic University of Catalonia
(Barcelona) and INRIA (Rocquencourt).
This document is the result of the research carried out over the last five years. It has
been a long journey during which I have been able to grow both professionally and
personally. I would therefore like to thank here all those who have contributed, directly
or indirectly, to this work. In particular, I would like to thank the following people for
their involvement in this thesis.
Philippe Salembier, for trusting me with this work and for his dedication and interest
throughout this time. He is an excellent person from whom I have learned a great deal.
The whole Image Group of the department, for the good atmosphere within it and for always
keeping their office doors open for any problem I had.
Henri Sanson and the participants of the CTI project, for their dedication and good advice
in bringing this work to a successful conclusion.
What would a thesis be without fellow Ph.D. students? I have worked alongside them these
last years. They have brightened the day-to-day, and I will keep many good memories of them.
Finally, I would also like to thank my friends and my family for all the support received
during these years.
Notation
The most frequently used notation is listed below. Other notation appearing in this document
is valid only in the section where it is used.
$\bigvee$    supremum
$\bigwedge$    infimum
$X, Y$    binary set
$CC_p(X)$    connected component of $X$ containing $p$
$f, g$    function or image
$N_G$    maximum possible gray-level value (usually $N_G = 255$)
$f(p)$    function value at pixel position $p$
$X_h(f)$    upper level set with parameter $h$
$X^h(f)$    lower level set with parameter $h$
$\mathrm{UpCC}_h^k$    $k$-th connected component of $X_h(f)$
$\mathrm{LowCC}^h_k$    $k$-th connected component of $X^h(f)$
$Q$    query region
$T$    target region
$\mathcal{Q} = \{Q_i\}$    multiple query region
$\mathcal{T} = \{T_i\}$    multiple target region
$N_R^{\mathcal{Q}}$    number of regions of $\mathcal{Q}$
$D_{\mathrm{desc}}(R)$    "desc" descriptor associated to region $R$
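For reference, the level set symbols above can be written out explicitly. The definitions
below are the conventional ones and are stated here as an assumption, since the list itself
only names the symbols; the nesting property follows directly from them.

```latex
% Assumed conventional definitions of the level sets named in the notation list.
\begin{align*}
  X_h(f) &= \{\, p \mid f(p) \ge h \,\}  && \text{(upper level set)} \\
  X^h(f) &= \{\, p \mid f(p) \le h \,\}  && \text{(lower level set)} \\
  X_{h+1}(f) &\subseteq X_h(f), \qquad X^{h}(f) \subseteq X^{h+1}(f)
      && \text{(nesting, for } 0 \le h < N_G \text{)}
\end{align*}
```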
Contents
1 Introduction
  1.1 Motivation
    1.1.1 Region Adjacency Graph based processing
    1.1.2 Tree based processing
    1.1.3 Tree vs. Region Adjacency Graph
  1.2 General objectives
  1.3 Thesis organization
2 General Framework
  2.1 Basic definitions
  2.2 Hierarchical region based processing using trees
    2.2.1 Base terminology
    2.2.2 Notation and properties
    2.2.3 Pruning
  2.3 State of the Art
    2.3.1 Quadtree
    2.3.2 Partition Trees
    2.3.3 Critical Lake Tree
    2.3.4 Area Tree
    2.3.5 Inclusion Tree
    2.3.6 Discussion
  2.4 Objectives and contribution of the thesis
  2.5 General framework
    2.5.1 Tree construction (Part I)
    2.5.2 Tree processing (Part II)
I Tree construction
3 Max-Tree
  3.1 Definition
  3.2 Min-Tree
  3.3 Discussion
  3.4 Efficient implementation
    3.4.1 Algorithm
    3.4.2 Tree complexity
    3.4.3 Performance
7 Conclusions
  7.1 Contribution
  7.2 Future research
Bibliography