Keywords

1 Introduction

In medical imaging, tracking dynamic structures has a major interest in some clinical applications to aid in the diagnosis. Examples include the study of the function of the heart pump. This study requires the detection of the left ventricle during the cardiac cycle in order to estimate the quantity of blood pumped in the corresponding time interval. Three-dimensional segmentation by slices (known as \( 2{\text{d }} + \frac{1}{2} \) segmentation) can also join the problem of dynamic structures segmentation, insofar as the shape of an anatomical structure can vary according to the third spatial dimension (z). In both cases, the fundamental theoretical problem is to track a structure whose shape changes via a set of images; whether it is in temporal sequences of images or in successions of three-dimensional slices. Compared to other fields of object tracking, this problem remains difficult because of the poor quality of medical images (low contrast, low resolution, the presence of noise, etc.).

To address this issue, we plan to develop a method based on Active Shape Model – ASM [5]. Indeed, the ASM is a particular class of deformable models which is based on a forte a priori knowledge of the shape. This reduces the solutions space and leads always to acceptable shapes. The key idea in this paper is to integrate, in this model, new a priori knowledge about the shape variation over time. The objective is to define a new method well adapted for tracking dynamic structures, incorporating a priori knowledge on the spatio-temporal shape variation. The proposed method requires two main phases: a training phase which aims to model, from a set of sample sequences, the spatio-temporal variation of the shape of interest and a tracking phase, which is based on the obtained model to find the studied structure in all images of a new sequence. This paper is organized as follows: in Sect. 2, we present a short state of the art on the segmentation of deformable structures in medical imaging. Section 3 is devoted to detail the steps of the proposed method. In Sect. 4, the proposed approach is applied to the functional and morphological study of the heart pump.

2 Related Work

The deformable models [16] are certainly the most popular approach in the field of medical images segmentation, due to their flexibility and ability to integrate a priori knowledge about the anatomical structures. The basic idea is to start with an initial coarse segmentation that will evolve gradually, according to several constraints, towards the target contours. These models have the advantage of segmenting an image by integrating a global vision of the shape of the structure to be extracted. They are widely studied and applied to the static segmentation of rigid structures, whether in the 2D case or the 3D case [7]. However, in some medical applications, it is sometimes necessary to follow up the spatio-temporal variation of non-rigid structures, whose shape varies over time. In this aim, several extensions of deformable models were proposed. For example, in [8], the authors propose to track anatomical structures in sequences of images by active contour [1] whose initialization in the image i is deduced automatically from the previous result in the image \( {\text{i}} - 1 \). In several other works [913], the sequence of images is treated in a global way and the studied shape variation is described by a single model that evolves over time. The majority of these works is focused mainly on the spatio-temporal tracking of cellular structures [7, 9] and the left ventricle of the heart [10, 11, 14, 15]. However, despite the success obtained in some cases, the quality of the results depends on the initialization step and the choice of propagation parameters. In addition, the used a priori knowledge has generally a global criterion.

3 Proposed Method

The proposed method requires three main steps: a step of spatio-temporal shape modelling, a step of grey levels modelling and a tracking step (spatio-temporal segmentation), which is based on the results of both first ones to locate the target structure in a new sequence.

3.1 Spatio-Temporal Shape Modelling

The objective of this step is to build a statistical spatio-temporal shapes model which describes precisely the variation over time of the deformable structure to be segmented. It requires, first of all, the preparation of a spatio-temporal training set, which includes all the possible configurations of this structure.

Preparation of a Spatio-Temporal Training Set.

First, we have to collect a set of sequences of different images. Every sequence must contain the same structure in the same period of time and with the same number of images. Then, we have to extract the spatio-temporal shapes by putting on the contour of every sequence, a sufficient number of landmark points on the wished contour (Fig. 1).

Fig. 1.
figure 1

Extraction of spatio-temporal shape

Given F the number of images by sequence and L the number of landmark points put on each image, the spatio-temporal shape that models a sequence i can then be represented by a vector \( {\text{S}}_{\text{i}} \):

$$ {\text{S}}_{\text{i}} = [{\text{u}}_{{{\text{i}}1}} , {\text{u}}_{{{\text{i}}2}} , {\text{u}}_{{{\text{i}}3}} , \ldots \ldots {\text{u}}_{\text{iF}} ] $$
(1)

With \( {\text{u}}_{\text{ij}} = [{\text{x}}_{{{\text{ij}}1}} , {\text{y}}_{{{\text{ij}}1}} , {\text{x}}_{{{\text{ij}}2}} , {\text{y}}_{{{\text{ij}}2}} , \ldots {\text{x}}_{\text{ijL}} ,{\text{y}}_{\text{ijL}} ] \) is a vector that models the jth shape in the ith sequence. Thus, the spatio-temporal training set will be represented by a set of spatio-temporal shapes:

$$ \left\{ {{\text{S}}_{\text{i}} \,{\text{with i}} = 1 \ldots {\text{N }}({\text{N }}\,{\text{number}}\,{\text{of }}\,{\text{sequences}})} \right\} $$

Aligning Spatio-Temporal Shapes.

After extracting spatio-temporal shapes from samples of sequences, an alignment step of shapes is required in order to put the corresponding vectors \( \left\{ {{\text{S}}_{\text{i}} } \right\} \) at a centered position. This allows to eliminate the problem of variation in position and in size and to study only the most important variation in shape between the various configurations of the studied structure. The alignment procedure of the spatio-temporal shapes has the same idea of shapes alignment in the ASM. First, it consists in taking, randomly, a spatio-temporal shape on which are aligned all the others. Then, in every iteration, a mean spatio-temporal shape is calculated, normalized and on which the others will be realigned. This process is stopped when the mean spatio-temporal shape reach some stability.

Generation of the Statistical Spatio-Temporal Shapes Model.

The aligned vectors \( \left\{ {{\text{S}}_{\text{i}} } \right\} \), resulting from the two previous steps, can be arranged in an observation matrix M (Fig. 2) whose size is \( (2{\text{lF}},{\text{N}}) \). This matrix describes both the spatial and temporal variation of the shape of the studied structure:

Fig. 2.
figure 2

Matrix describes both the spatial and temporal variation of the shape of the studied structure

The aim is to deduce from this matrix, the modes and the amplitudes of the spatio-temporal variation of the studied structure. This can be done by applying principal component analysis (PCA) on the raw data. Indeed, the main modes of spatio-temporal variation of the studied structure will be represented by the principal components deduced from the covariance matrix \( {\text{C}}_{\text{s}} \) associated with the observation matrix (Eq. 2).

$$ {\text{C}}_{\text{s}} = \frac{1}{\text{N}} \sum\nolimits_{{{\text{i}} = 1}}^{\text{N}} {{\text{dS}}_{\text{i}} {\text{dS}}_{\text{i}}^{\text{t}} } $$
(2)

where \( {\text{dS}}_{\text{i}} = {\text{S}}_{\text{i}} - {\bar{\text{S}}} \) is the deviation of the ith spatio-temporal shape \( {\text{S}}_{\text{i}} \) compared to a mean spatio-temporal shape, that is calculated:

$$ {\bar{\text{S}}} = \frac{1}{\text{N}} \sum\nolimits_{{{\text{i}} = 1}}^{\text{N}} {{\text{S}}_{\text{i}} } $$
(3)

These principal components are given by the eigenvectors of the matrix \( {\text{C}}_{\text{s}} \), such as:

$$ {\text{C}}_{\text{s}} {\text{P}}_{\text{k}} =\uplambda_{\text{k}} {\text{P}}_{\text{k}} $$
(4)

\( {\text{P}}_{\text{k}} \) is the Kth eigenvector of \( {\text{C}}_{\text{s}} \) and \( \uplambda_{\text{k}} \) is the corresponding eigenvalue. Each vector represents a variability percentage of the variables used to build the covariance matrix. The variability percentage represented by each vector is equal to its corresponding eigenvalue. In general, we can notice a very fast decreasing of the eigenvalues, which is used to classify the corresponding vectors in decreasing order. Therefore, we can choose the first t eigenvectors, which represent the important variability percentage, as principal components. Every spatio-temporal shape S can be simply represented by the mean spatio-temporal shape and a linear combination of principal components (main deformation modes):

$$ {\text{S}} = {\bar{\text{S}}} + {\text{Pb}} $$
(5)

where \( {\bar{\text{S}}} \) is the mean spatio-temporal shape, \( {\text{P}} = ({\text{p}}_{1} ,{\text{p}}_{2} ,{\text{p}}_{3} , \ldots \ldots ,{\text{p}}_{\text{t}} ) \) is the base of t principal components and \( {\text{b}} = ({\text{b}}_{1} ,{\text{b}}_{2} ,{\text{b}}_{3} \ldots \ldots ,{\text{b}}_{\text{t}} )^{\text{t}} \) is a weight vector representing the projection of the spatio-temporal shape S in the base P. Generally, the amplitude of the allowable deformation following a principal component \( {\text{P}}_{\text{k}} \) is limited as follows:

$$ - 3\sqrt {\uplambda_{\text{k}} } \le {\text{b}}_{\text{k}} \le 3\sqrt {\uplambda_{\text{k}} } $$
(6)

As a result, from the basic equation (Eq. 5) we can deduce infinity of shapes describing the spatio-temporal studied structure by choosing correctly the \( {\text{b}}_{\text{k}} \) values (Eq. 6). Equation 5 defines, then, the statistical spatio-temporal shapes model, which defines an allowable deformation space for spatio-temporal studied structure. This model will be used in the spatio-temporal localization stage to guide the evolution in such a way that it is only in the allowable space.

3.2 Grey Levels Modelling

In the tracking step, the proposed method is based on intensities information of the treated sequence. It is about finding an optimal correspondence between the properties of luminance of the treated sequence with information of luminance collected from sequences samples. For that purpose, in addition to the spatio-temporal shape modelling, it is necessary to model the grey levels information from the training sequences. The grey levels modelling consists in extracting, for each landmark point i on each image j of the sequence k and then through all the training sequences, the grey levels profile \( {\text{g}}_{\text{ijk}} \) from a segment of length n, centered in this point i and carried by its normal:

$$ {\text{g}}_{\text{ijk}} = [{\text{g}}_{{{\text{ijk}}0}} ,{\text{g}}_{{{\text{ijk}}1}} ,{\text{g}}_{{{\text{ijk}}2}} , \ldots ,{\text{g}}_{{{\text{ijkn}} - 1}} ] $$
(7)

\( {\text{g}}_{\text{ijkt}} \left( {{\text{t}} = 0 \ldots {\text{n}} - 1} \right) \) is the grey level of the tth pixel of the examined segment.

The derivative of this grey levels profile is defined by the expression 8:

$$ {\text{dvg}}_{\text{ijk}} = [{\text{g}}_{{{\text{ijk}}1}} - {\text{g}}_{{{\text{ijk}}0}} , \ldots ,{\text{g}}_{{{\text{ijkn}} - 1}} - {\text{g}}_{{{\text{ijkn}} - 2}} ] $$
(8)

\( {\text{dvg}}_{\text{ijk}} \) is a vector of size \( ({\text{n}} - 1) \), including the differences in grey levels between two successive points of the examined segment.

The normal derivative of this profile is defined by:

$$ {\text{y}}_{\text{ijk}} = \frac{{{\text{dvg}}_{\text{ijk}} }}{{\mathop \sum \nolimits_{{{\text{k}} = 0}}^{{{\text{n}} - 2}} \left| {{\text{dvg}}_{\text{ijkt}} } \right|}} $$
(9)

Through all the images in a sequence and then through all the training sequences, we can define for each landmark point i, a mean normal derivative of the grey levels given by the expression 10.

$$ {\bar{\text{y}}}_{\text{i}} = \frac{1}{\text{FN}}\sum\nolimits_{{{\text{j}} = 1}}^{\text{F}} {\sum\nolimits_{{{\text{k}} = 1}}^{\text{N}} {{\text{y}}_{\text{ijk}} } } $$
(10)

This mean normal derivative related to the point i, will be used in the stage of spatio-temporal localization to move the same point towards a better position.

3.3 Tracking Step

The objective now is to tracking the studied structure in a new sequence. A way to achieve this is to start with an initial spatio-temporal shape, which will gradually evolve towards the contours of the studied structure in all images simultaneously. This idea can provide a procedure of tracking, which consists in repeating iteratively the following four steps.

Initialization.

This step consists in putting an initial spatio-temporal shape \( {\text{S}}_{\text{i}} \) on the treated sequence. This shape can be built from a spatio-temporal shape \( {\text{S}}_{\text{a}} \) belonging to the training set:

$$ {\text{S}}_{\text{i}} = {\text{M}}\left( {{\text{k}}_{\text{i}} ,\uptheta_{\text{i}} } \right)\left[ { {\text{S}}_{\text{a}} } \right] + {\text{t}}_{\text{i}} $$
(11)

with:

$$ {\text{M}}\left( {{\text{k}}_{\text{i}} ,\uptheta_{\text{i}} } \right) = \left[ {\begin{array}{*{20}c} {{\text{k}}_{\text{i}} \cos\uptheta_{\text{i}} } & { - {\text{k}}_{\text{i}} \sin\uptheta_{\text{i}} } \\ {{\text{k}}_{\text{i}} \sin\uptheta_{\text{i}} } & { {\text{k}}_{\text{i}} \cos\uptheta_{\text{i}} } \\ \end{array} } \right]\;{\text{a matrix }}\left( { 2* 2} \right) $$

\( {\text{t}}_{\text{i}} = ({\text{t}}_{\text{xi}} ,{\text{t}}_{\text{yi}} ,{\text{t}}_{\text{xi}} ,{\text{t}}_{\text{yi}} ,{\text{t}}_{\text{xi}} ,{\text{t}}_{\text{yi}} \ldots \ldots {\text{t}}_{\text{xi}} ,{\text{t}}_{\text{yi}} ) \) a translation vector of size 2 * F * N.

\( {\text{k}}_{\text{i}} \), \( \uptheta_{\text{i}} \) and \( {\text{t}}_{\text{i}} \) are respectively the homothety, rotation and translation to be applied to every point of \( {\text{S}}_{\text{a}} \) in order to build the initialization \( {\text{S}}_{\text{i}} \).

Search for the Elementary Movement.

We will first address the problem of moving a single landmark point. Then, we will show how to calculate the elementary movement of the initial estimate \( {\text{dS}}_{\text{i}} \). Indeed, A is a particular landmark point of \( {\text{S}}_{\text{i}} \). To move the point A to the borders of the studied structure, the idea is to extract from each image j of the processed sequence, a search grey levels profile of length m pixels (with \( {\text{m}} \gg ({\text{n}} - 1) \)) which is centered in A and supported by the normal to the edge passing through this point.

Then, the point A will be represented by a matrix \( {\text{H}}_{\text{A}} \), defined as follows:

\( {\text{H}}_{\text{A}} \) is a matrix of \( {\text{m * F}} \) combinations where each column represents the search profile on the image j. Knowing that each landmark point A is defined by a mean normal derivative of the grey levels \( {\bar{\text{y}}}_{\text{A}} \), we can calculate, from the matrix \( {\text{H}}_{\text{A}} \), a new matrix \( {\text{H}}_{\text{A}}^{ '} ({\text{j}},{\text{l}}) \) which represents the difference between the grey level information surrounding the current point A and that related to the same point during the grey levels modelling. This matrix can be defined as follows:

$$ {\text{H}}_{\text{A}}^{ '} \left( {{\text{j}},{\text{l}}} \right) = ({\text{g}}_{\text{Aj}} \left( {\text{l}} \right) - {\bar{\text{y}}}_{\text{A}} )^{\text{t}} {\text{I}}_{{{\text{n}} - 1}}^{\text{t}} {\text{I}}_{{{\text{n}} - 1}} ({\text{g}}_{\text{Aj}} \left( {\text{l}} \right) - {\bar{\text{y}}}_{\text{A}} ) $$
(12)

with j from 1 to F, l from 1 to m and \( {\text{I}}_{{{\text{n}} - 1}} \) is the identity matrix \( ( {\text{n}} - 1) \). \( {\text{g}}_{\text{Aj}} \left( {\text{l}} \right) \) is the sub-profile of length \( ({\text{n}} - 1 \)) centered at the lth position of the search profile \( {\text{g}}_{\text{Aj}} \) that contains the normal derivative of the intensities. (It is necessary to remind that \( {\text{g}}_{\text{Aj}} ({\text{l}}) \) and \( {\bar{\text{y}}}_{\text{A}} \) have the same size \( ({\text{n}} - 1) \)). The best positions to which has to slide the point A. on the treated sequence are given as follows:

$$ {\text{p}}_{\text{Aj}} = [{ \hbox{min} }({\text{H}}_{\text{A}}^{ '} ({\text{j}},{\text{l}}))] . $$
(13)

with j from 1 to F, l from 1 to m

  • pAj: is the position to which has to slide the point A on the jth image.

Therefore, we can calculate the elementary movements of the particular point A in all the images of the sequence, such as:

$$ \left\{ { {\text{dp}}_{\text{Aj}} = {\text{distance}}\left( {{\text{A}}, {\text{p}}_{\text{Aj}} } \right)\,{\text{with }}\,{\text{j }}\,{\text{from }}\,1 \,{\text{to }}\,{\text{F }}} \right\}. $$
(14)
  • \( {\text{dp}}_{\text{Aj}} \): is the elementary movement of point A in the jth image.

By applying the same principle for the other landmark points of the initial spatio-temporal shape \( {\text{S}}_{\text{i}} \), we can deduce finally the elementary movement \( {\text{dS}}_{\text{i}} \):

$$ {\text{dS}}_{\text{i}} = \left[ {{\text{dp}}_{\text{Aj}} } \right] $$
(15)

with A from 1 to L, j from 1 to F

  • L: Number of landmark points.

  • F: Number of images by sequence.

Determining the Parameters of Position and Shape.

After determining the elementary movement \( {\text{dS}}_{\text{i}} \), we must now determine the parameters of position and shape to make this movement, while respecting the constraints of spatio-temporal deformation imposed by the modelling step.

Determining the Position Parameters: We suppose that the initial estimate \( {\text{S}}_{\text{i}} \) is centered in a position \( ( {\text{x}}_{\text{c}} ,{\text{y}}_{\text{c}} ) \) with an orientation θ and an homothety k. Determining the position parameters means determining the parameters of geometric operations \( 1 + {\text{dk}} \), \( {\text{d}}\uptheta \) and \( {\text{dt}} = ({\text{d x}}_{\text{c}} , {\text{dy}}_{\text{c}} ) \) to be applied to each point of \( {\text{S}}_{\text{i}} \) in order to reach the new position \( ( {\text{S}}_{\text{i}} + {\text{dS}}_{\text{i}} ) \). A simple way to determine these parameters is to align the two vectors \( {\text{S}}_{\text{i}} \) and \( ( {\text{S}}_{\text{i}} + {\text{dS}}_{\text{i}} ) \).

Determining the Shape Parameters: If we suppose that the initial estimate \( {\text{S}}_{\text{i}} \) is defined in the base of the principal components by a weight vector b, we seek to determine the variation \( {\text{db}} \) in order to trace \( ( {\text{S}}_{\text{i}} + {\text{dS}}_{\text{i}} ) \) in the same base. Given that the initial estimate is built from a spatio-temporal shape \( {\text{S}}_{\text{a}} \) belonging to the training set, determining the shape parameters db is to solve first in dx the following equation:

$$ {\text{M}}\left( {{\text{k}}_{\text{i}} \left( {1 + {\text{dk}}} \right),\uptheta_{\text{i}} + {\text{d}}\uptheta} \right)\left[ { {\text{S}}_{\text{a}} + {\text{dx}}} \right] + {\text{t}}_{\text{i}} + {\text{dt}} = {\text{S}}_{\text{i}} + {\text{dS}}_{\text{i}} $$
(16)

which means

$$ {\text{M}}\left( {{\text{k}}_{\text{i}} \left( {1 + {\text{dk}}} \right),\uptheta_{\text{i}} + {\text{d}}\uptheta} \right)\left[ { {\text{S}}_{\text{a}} + {\text{dx}}} \right] = {\text{S}}_{\text{i}} + {\text{dS}}_{\text{i}} - ({\text{t}}_{\text{i}} + {\text{dt}}) $$
(17)

But we have \( {\text{S}}_{\text{i}} = {\text{M}}\left( {{\text{k}}_{\text{i}} ,\uptheta_{\text{i}} } \right)\left[ { {\text{S}}_{\text{a}} } \right] + {\text{t}}_{\text{i}} \)

If we replace \( {\text{S}}_{\text{i}} \) by its value in the Eq. (17), we find:

$$ {\text{M}}\left( {{\text{k}}_{\text{i}} \left( {1 + {\text{dk}}} \right),\uptheta_{\text{i}} + {\text{d}}\uptheta} \right)\left[ { {\text{S}}_{\text{a}} + {\text{dx}}} \right] = {\text{M}}\left( {{\text{k}}_{\text{i}} ,\uptheta_{\text{i}} } \right)\left[ { {\text{S}}_{\text{a}} } \right] + {\text{t}}_{\text{i}} + {\text{dS}}_{\text{i}} - ({\text{t}}_{\text{i}} + {\text{dt}}) $$
(18)

But we know that:

$$ {\text{M}}^{ - 1} \left( {{\text{k}},\uptheta} \right)\left[ \ldots \right] = {\text{M}}\left( {{\text{k}}^{ - 1} , -\uptheta} \right)\left[ \ldots \right] $$
(19)

By applying this rule to the Eq. (18), we obtain

$$ {\text{S}}_{\text{a}} + {\text{dx}} = {\text{M}}\left( {({\text{k}}_{\text{i}} \left( {1 + {\text{dk}}} \right))^{ - 1} , - (\uptheta_{\text{i}} + {\text{d}}\uptheta )} \right) [{\text{M}}\left( {{\text{k}}_{\text{i}} ,\uptheta_{\text{i}} } \right)\left[ { {\text{S}}_{\text{a}} } \right] + {\text{dS}}_{\text{i}} - {\text{dt}} $$
(20)

what means that:

$$ {\text{dx}} = {\text{M(}}({\text{k}}_{\text{i}} (1 + {\text{dk}}))^{ - 1} , - (\uptheta_{\text{i}} + {\text{d}}\uptheta ) ) \left[ {{\text{M}}\left( {{\text{k}}_{\text{i}} ,\uptheta_{\text{i}} } \right)\left[ { {\text{S}}_{\text{a}} } \right] + {\text{dS}}_{\text{i}} - {\text{dt}}} \right] - {\text{S}}_{\text{a}} $$
(21)

dx is determined in \( 2 * {\text{L*F}} \) size. However, we have t modes of variation. Then, we have to calculate \( {\text{dx}}^{ '} \), the projection of dx in the base of principal components P. This can be done by adopting the approach of least squares. Indeed, \( {\text{dx}}^{ '} = {\text{wdx}} \) with \( {\text{w}} = {\text{P}}\left( {{\text{P}}^{\text{t}} {\text{P}}} \right)^{ - 1} {\text{P}}^{\text{t}} \) is a projection matrix. However, the principal components of P are pairwise orthogonal, meaning that \( {\text{P}}^{\text{t}} {\text{P}} = {\text{I}} \). This, then, gives \( {\text{dx}}^{ '} = {\text{PP}}^{\text{t}} {\text{dx}} \). We know \( {\text{dx}}^{ '} = {\text{Pdb}} \), if we multiply both sides of this equation by \( {\text{P}}^{\text{t}} \), we can deduce finally the shape parameters \( {\text{db}} = {\text{P}}^{\text{t}} {\text{dx}}^{ '} \).

\( {\text{db}} = ({\text{db}}_{1} ,{\text{db}}_{2} ,{\text{db}}_{3} , \ldots ,{\text{db}}_{\text{t}} ) \) is a weight vector allowing to build and to limit the new vector \( ({\text{S}}_{\text{i}} + {\text{dS}}_{\text{i}} ) \) in the base of principal components.

Movement of the Spatio-Temporal Shape and the Limitation of the Shape Parameters.

This last step consists in moving \( {\text{S}}_{\text{i}} \) to the new position \( ( {\text{S}}_{\text{i}} + {\text{dS}}_{\text{i}} = {\text{S}}_{\text{i}}^{1} ) \), by using the already calculated parameters. We obtain:

$$ {\text{S}}_{\text{i}}^{1} = M\left( {k_{i} \left( {1 + dk} \right),\theta_{i} + d\theta } \right)\left[ { S_{a} + Pdb} \right] + t_{i} + dt $$
(22)

We should note that the shape parameters \( {\text{db}} = ({\text{db}}_{1} ,{\text{db}}_{2} ,{\text{db}}_{3} , \ldots {\text{db}}_{\text{t}} ) \) must be limited in the allowable intervals of variation defined by the Eq. (6), to produce acceptable spatio-temporal shapes. Indeed, if for example a value \( {\text{db}}_{\text{k}} \) (\( 1 \le {\text{k}} \le {\text{t}} \)) exceeds the maximum value in a component k, it will be limited as follows:

$$ \left\{ {\begin{array}{*{20}c} {{\text{if}}\, {\text{db}}_{\text{k}} > {\text{v}}_{{{ \hbox{max} }_{\text{k}} }} } \\ {\quad \quad \quad {\text{then}} \,{\text{db}}_{\text{k}} = {\text{v}}_{{{ \hbox{max} }_{\text{k}} }} } \\ {{\text{if}} \,{\text{db}}_{\text{k}} < - {\text{v}}_{{{ \hbox{max} }_{\text{k}} }} } \\ { \quad \quad \quad {\text{then}} \,{\text{db}}_{\text{k}} = - {\text{v}}_{{{ \hbox{max} }_{\text{k}} }} } \\ \end{array} } \right. $$
(23)

with \( {\text{v}}_{{{ \hbox{max} }_{\text{k}} }} = 3\sqrt {\left| {\uplambda_{\text{k}} } \right|} \) is the maximum value of allowable variation following the component k. \( \uplambda_{\text{k}} \) is the eigenvalue related to the component k. Now, from \( {\text{S}}_{\text{i}}^{1} \), we will repeat the same steps to build \( {\text{S}}_{\text{i}}^{2} \) then \( {\text{S}}_{\text{i}}^{3} \) … and so on, until no significant change is detected or the maximum number of iterations is reached.

4 Experimental Results

In this section, we show the application of our contribution for the functional and morphological study of the heart pump (left ventricle). The functional aspect was studied through temporal sequences of scintigraphic images and morphology was studied through MRI volumes.

4.1 Tracking of the Left Ventricle in Scintigraphic Sequences of Images of the Heart

In this application, the used image database contains 25 scintigraphic sequences of images of the heart from 25 different patients. Each sequence shows a heart beat cycle, represented by 16 images of 128 * 128 pixels. We have selected 8 sequences for the training step. After showing the selected sequences to a specialist doctor, we concluded that the details of the left ventricle can be represented by 20 landmark points. The variability percentage of the initial data is fixed to 95 % and the length of the grey levels profile in the training step is 7 pixels. In the tracking step, the length of profile search is 19 pixels. The maximum number of iterations is fixed at 60 iterations. The used test sequences are selected from the 17 remaining sequences in the original database. Figure 3 shows an example of the result of tracking that is obtained on a test sequence. Figure 4 shows the activity-time curve of the left ventricle calculated using the obtained result. The ventricular ejection fraction found in this experiment is 52 %, which seems to be normal (healthy cases).

Fig. 3.
figure 3

Result of the spatio-temporal tracking of the LV. (a) Initialization of the mean spatio-temporal shape on a treated sequence. (b) Final result of the tracking

Fig. 4.
figure 4

activity-time curve

We note that the final spatio-temporal shape succeeded generally in locating the shape of the left ventricle. This can show the performance of the method even in the presence of contours that are difficult to identify. This result is qualitatively considered satisfactory by the medical specialists.

4.2 \( 2{\text{d}} + \frac{1}{2} \) Segmentation of the LV in MRI Volumes

In this section, we present the second application of the proposed method: the \( 2{\text{d}} + \frac{1}{2} \) segmentation, where the temporal component will be considered as the third spatial axis (z). We are interested in studying the left ventricle morphology via the extraction of its three-dimensional shape from MRI volumes.

For that purpose, we first built a training set, composed of 10 volumes. Each volume is divided into 10 slices (short axis) of size 192 * 160, covering the left ventricle from apex to the base. During the modelling phase, 20 landmarks are placed on each slice of each volume in order to extract the three-dimensional shapes of the left ventricle (Fig. 5). Each volume is thus modelled by a three-dimensional vector of size 2 * 20 * 10. After the aligning step and using 95 % as a variability percentage of the initial data, the CPA application on these data provided five principal variation modes. The length of the grey levels profile in the modelling step is equal to 30 % of the length of the search profile. The maximum number of iterations is fixed at 60 iterations. In the following, Fig. 6 shows a plane representation of the initialization of the mean three-dimensional shape on each slice on a treated volume. Figure 7 shows the result of corresponding segmentation. 3D visualization of some obtained results is given in Fig. 8.

Fig. 5.
figure 5

Three-dimensional shape of the LV determined during the training step

Fig. 6.
figure 6

Initialization of the mean three-dimensional shape on a treated volume

Fig. 7.
figure 7

Result of \( 2{\text{d}} + \frac{1}{2} \) segmentation

Fig. 8.
figure 8

3D visualization of some obtained results

We can notice, in Fig. 7, that the contours have succeeded to correctly delineate the shape of the left ventricle (endocardium) in all slices, even with the presence of the pillars muscle (black point) in some cases. This demonstrates that the proposed method can be effectively used for three-dimensional segmentation. Indeed, the forte a priori knowledge of shape used following the third spatial component (in each slice separately), helped to easily exceed the heterogeneity of intensities (pillars) and thus lead to a satisfactory result. In Fig. 8, we can see a significant variation of the three-dimensional size of the left ventricle. This can provide information about the moment of acquisition of the volume in question (acquisition phase: diastolic or systolic phase). This three-dimensional representation of the left ventricle shape can be useful to specialists for detecting morphological abnormalities of the heart. We should note that, in this application, the proposed method was used with some hypotheses. Indeed, we used, in both modelling and tracking step, volumes with the same number of slices (10 slices), that ideally match. However, that is not always true in clinical practice, because of the distance parameter between slices at the moment of acquisition. It is then necessary to define a pre-processing step in order to make correspondence, automatically, between slices of training volumes and those of the volume to be segment.

Quantitative Results.

In order to evaluate quantitatively the proposed method, we selected four sequences of the test database, which are manually pre-segmented by an expert and are used as a ground truth. Then, we decided to compare our contribution ASMT with the ground truth, the basic model ASM and with another method that is proposed by Fekir et al. [8]. This method allows the tracking of non-rigid objects in sequences of images using active contour SNAKE [1] whose initialization in the image i is automatically deduced from the result in the image \( {\text{i}} - 1 \).

Since the compared methods are contour-based methods, we chose the Hausdorff distance as a measure of segmentation quality [16]. Figure 9 shows the Hausdorff distance between each method (ASM, ASMT and SNAKE) and the reference segmentation of the four sequences.

Fig. 9.
figure 9

Hausdorff distance between each method (ASM, ASMT and SNAKE) and the reference segmentation of the four sequences

Looking at the four diagrams, we can see clearly that the red curve (ASMT) has some stability compared to the other curves (blue and green). Indeed, for the red curve and through the four diagrams, the values of the Hausdorff distance are between 1.18 and 7.38 (mm). By cons, for both blue and green curves, the values of the Hausdorff distance often represent great variations, which rose from 2.3 (mm) and reach 23.06 (mm). Through these measures, and although that in some cases the ASM and SNAKE provide acceptable results (especially in diastole images), we can deduce that our method provides for all images in each sequence an overall result that is more stable and closer to manual segmentation. This proves the effectiveness of the integration of a priori knowledge about the spatio-temporal shape variation of the left ventricle. Indeed, the stage of spatio-temporal shape modelling provides more precise information on the spatial shape variation of the left ventricle at every moment of the cardiac cycle. This is what influences, consistently and in each image of the sequence, the accuracy of the results of the localization stage. The poor results obtained by ASM and SNAKE may be explained by the imprecision of the initialization in some images of the sequences and the generality of the a priori information about the shape. Besides, these results are mainly obtained in systole images (contraction stage) where the size of the left ventricle becomes very small and difficult to detect. In conclusion, it is clear that the integration of a priori knowledge about the spatio-temporal shape variation of the left ventricle improved significantly the results of segmentation. This increases the reliability of diagnostic parameters such as the activity-time curve and the ventricular ejection fraction, whose calculation is based on these results.

5 Conclusion

In this paper, we proposed to incorporate a new a priori knowledge about the spatio-temporal shape variation in the active shape model in order to define a new simple and more stable method for detecting structures whose shape change over time. The proposed method is based on two types of a priori knowledge: the spatial and temporal variation of the studied structure. It has also the advantage of being applicable on sequences of images. The experimental validation of the proposed method, whether on temporal sequences or three-dimensional images shows the interest of the integration of a priori knowledge on shape variation according to the third component (temporal or spatial). Indeed, using precise information (geometry and deformation modes) about the shape to be detected each time (respectively in each slice) provided more stable results. We are convinced of the relevance of the used method, however, some improvements can be added and the validation should be pursued. Indeed, the most difficult step in our approach is the labelling step. It consists in manually extracting the spatio-temporal shapes from training sequences. That is why, it is usually performed by an expert. The complexity of this step is in function of the number of training sequences, the number of images by sequence and the number of landmark points needed to represent the target structure details. Once, these parameters become important, this task becomes tedious and time consuming. Then, we should think to make this task semi-automatic or fully automatic. A way to make it semi-automatic is to consider that the shape of the studied structure at instants \( {\text{t}} - 1 \), t and \( {\text{t }} + 1 \) has a low variation. The manual training can be thus done on a reduced number of images which correspond to well chosen moments of the sequence. Then, the result of this training will be used for the automatic segmentation of the remaining images. This segmentation is then considered as training. Thus, the complexity of the labelling task can be reduced at least 70 %. Moreover, it is possible to enrich and further validate this approach for other types of applications. For example, if we replace the temporal component by the third spatial axis (z), this method can be effectively used for volume segmentation that is based on an important a priori knowledge of shape. In this case, we must solve some additional issues such as correspondence between the slices of training volumes and the slices of the volume to be segmented as well as the automatic determination of the slices that contain the studied structure during the segmentation.