Overview of Independent Component Analysis Technique with an Application to Synthetic Aperture Radar (SAR) Imagery Processing

Simone Fiori
Faculty of Engineering, Perugia University
Loc. Pentima bassa, 21, I-05100 Terni (Italy)
E-mail: sfr@unipg.it

Keywords: Blind signal processing by neural networks; Independent component analysis (ICA); Remote sensing; Synthetic aperture radar (SAR).

Pages: 38, Figures: 9, References: 100.

Accepted for publication in: Neural Networks (Special issue on `Neural Networks for Analysis of Complex Scientific Data: Astronomy, Geology and Geophysics')

October 15, 2002

Abstract

We present an overview of independent component analysis, an emerging signal processing technique based on neural networks, with the aim of providing an up-to-date survey of the theoretical streams in this discipline and of its current applications in engineering. We also focus on a particular application, namely a remote sensing technique based on synthetic aperture radar imagery processing: We briefly review the features and main applications of synthetic aperture radar and show how blind signal processing by neural networks may be advantageously employed to enhance the quality of remote sensing data.

1 Introduction

Over the last few years there has been an enormous increase of interest in the applications of neural networks to astronomy, geology, geophysics and environment protection. These research efforts have addressed a variety of problems ranging from the management of scientific equipment to the representation and interpretation of measured data. These applications comprise the use of neural networks in astronomy, e.g. for the
analysis of stellar light curves and solar data (Tagliaferri, Ciaramella, Milano, Barone & Longo, 1999), the analysis of photometric images (e.g. for star/galaxy separation, deblending of unresolved images and morphological classification of galaxies) (Andreon, Capuano, Gargiulo, Longo & Tagliaferri, 2000; Odewahn, Stockwell, Pennington, Humphreys & Zumach, 1992), the analysis of spectroscopic data (e.g. for spectral classification of stars and galaxies), the retrieval of information from very large databases of astronomical data (Longo, Tagliaferri, Sessa, Ortiz, Capaccioli, Ciaramella, Donalek, Raiconi, Staiano & Volpicelli, 2001; Longo, Donalek, Raiconi, Staiano, Tagliaferri, Sessa, Pasian, Smareglia & Volpicelli, 2002), the automatic scheduling of observing facilities and the automatic evaluation of their performance.

The known applications also comprise the use of artificial neural networks in geology and geophysics, e.g. for stratigraphy, cycle detection and paleomagnetism (Tagliaferri, Pelosi, Ciaramella, Longo, Barone & Milano, 2001), the analysis of seismic data and geologic images (Acernese, Ciaramella, De Martino, De Rosa, Falanga & Tagliaferri, 2002; Acernese, Barone, De Rosa, Eleuteri, Milano & Tagliaferri, 2001), the analysis of synthetic aperture radar images for geological and geophysical prospecting (Zhong, Wicks, Power, Dzurisin, Thatcher & Masterlark, 2002; Dowd, Vachon, Dobson & Olsen, 2001; Legarsky, Gogineni & Akins, 2001; Schulz-Stellenfleth, Horstmann, Lehner & Rosenthal, 2001; Lin, Alpers, Khoo, Lim, Lim & Kasilingam, 2001; Horstmann, Koch, Lehner & Tonboe, 2000), and environmental data analysis (Andretta, Eleuteri, Fortezza, Manco, Mingozzi, Serra & Tagliaferri, 2000).

The value and variety of these applications suggest the need for a cumulative investigation of the neural techniques, algorithms and theories produced for the analysis of complex scientific data in astronomy, geology and geophysics.
In fact, in spite of the enormous amount of work done, no comprehensive summaries of the main techniques have appeared so far. Such reviews would be worthwhile for everyone active in the field. In particular, in this paper we focus on neural independent component analysis with application to remote sensing by synthetic aperture radar imagery processing.

Independent component analysis (ICA) is a well-established statistical signal/data processing technique that aims at decomposing a set of multivariate signals into a basis of statistically independent data-vectors/streams with minimal loss of information content. The two main recognized purposes of ICA are:

- Linear blind source separation: In this case the aim is to recover a number of statistically independent signals from their unknown linear mixtures, under simple consistency conditions. Namely, a linear mixture of independent source signals is supposed to be observed and, on the basis of this only available information, the original source signals are recovered from their mixtures. The classical hypotheses made on the unknown sources are: 1) each source signal is an independent identically distributed (IID) stationary random process; 2) the source signals are statistically independent at any time; 3) at most one of the source signals has a Gaussian distribution. This minimal set of requirements ensures the existence and uniqueness of the solution to the blind separation problem (up to ordering, sign, and power scaling), the minimal loss of information about the sources and, as a by-product, the identification of the mixing-model features;

- Data representation and visualization: High-dimensionality data/signals are difficult to handle and to visualize, but often contain significant redundancies, which make their actual information-structure dimensionality considerably lower than their representation dimensionality.
These concurrent facts suggest the possibility of designing signal/data processing algorithms capable of finding a suitable lower-dimensionality representation of the signals at hand by reducing the statistical dependencies among them. In this context, the independent component analysis technique has proven to provide a suitable solution through the concept of independent latent variables: ICA may discover a linear projection of the data onto a low-dimensional basis of statistically independent signals that carry no mutual information, thus providing a parsimonious, maximally informative representation of the original data. The basis data-streams are termed latent variables, and they do not necessarily possess a recognizable physical meaning.

The classical example used to informally explain the blind separation problem is the "cocktail party" scenario: Let us imagine that a number of people stand in a room and speak together; if the room is equipped with a number of microphones, each sensor receives a different superposition of the speech signals uttered by each person in the room, so that the set of received signals may be described by a linear mixture model that takes into account the emitter-sensor geometry and the propagation of pressure waves. The aim of a blind source separation algorithm is to recover, from the sensor observations only, the single independent signals uttered by the people.

An example that helps clarify the concept of latent variable extraction is the analysis of vibrating machines: Let us suppose that a rotating machine (e.g. a rotating pump under fault test) is equipped with a number of accelerometers that measure its vibration intensities versus time; we know that the recorded accelerometer signals originate from a large number of small vibrating parts, which exceeds the reasonable number of different measurements, which in turn may exceed the number of signals we can reasonably process for fault testing.
By seeking a compact representation of the measured data through the ICA technique, we obtain a basis of independent signals that are the complex vibrations of a small number of virtual oscillators, whose linear superposition generates the observed data.

A dual pair of interesting examples that further clarifies the independent latent variables concept is related to the analysis of natural images and natural sounds: In a classical paper by Bell & Sejnowski (1997) it was shown that the latent variables for natural scenes are edge filters that closely resemble Gabor filters, while in a recent dual paper Abdallah & Plumbley (2001) investigated the same question for natural sounds; it is also worth mentioning that Seifritz, Esposito, Hennel, Mustovic, Neuhoff, Bilecen, Tedeschi, Scheffler & Di Salle (2002) have recently investigated the temporal decomposition of the sound-evoked, blood oxygen level-dependent signal response into independent transient and sustained components, which predominate in different portions (core and belt) of the auditory cortex, confirming the suitability of ICA techniques for the analysis and explanation of some brain-related activities.

Much theoretical research work has been carried out by several researchers over recent years, both to produce ever more refined algorithms for performing independent component analysis and to publish consistent results about the basic theory of ICA (concerning the signal models and the solvability of the analysis problem related to these models) and about the algorithmic-level theory (concerning the theoretical study of the relevant properties of the different classes of algorithms, such as convergence, reliability, computational burden, equivariancy, and implementation questions).
The theoretical research work carried out so far has been summarized in two books (Lee, 1998; Hyvarinen, Karhunen & Oja, 2001) and in a series of journal papers (Cardoso, 1998; Cardoso, 1999; Hyvarinen & Oja, 2000; Fiori, 2000a; Fiori, 2001b). Concurrently, the scientific community has shown an ever increasing interest in the ICA technique and, generally speaking, in the blind signal processing research field, because it provides a powerful tool for signal/data processing, comparable in importance to the older principal component analysis. Known applications range from speech recognition to fault detection, from telecommunications to medical imaging, from financial market data analysis to biological data pre-processing, and from industrial plant identification to non-destructive evaluation (Clemente & Acha, 1997; Karhunen, Hyvarinen, Vigario, Hurri & Oja, 1997; Liu, 1996; Cichocki & Cao, 1998; Cichocki, Karhunen, Kasprzak & Vigario, 1999).

Synthetic aperture radar (SAR) refers to a technique used to synthesize a very long antenna by combining the signals received by the radar as it moves along its flight track; in other terms, a synthetic aperture is constructed by moving a real aperture or antenna through a series of positions along the flight track. As the radar moves, a pulse is transmitted at each position, and the return echoes pass through the receiver and are stored in the memory of the acquiring equipment. The technique generates radar images in which fine details can be resolved. SARs provide unique capabilities as an imaging tool. Because they provide their own illumination (the radar pulses), they can image at any time of day or night, regardless of natural illumination, and because the radar wavelengths are longer than those of visible or infrared light, SAR imaging is independent of cloudy and dusty conditions.
The aim of the present paper is to provide a non-mathematical overview of the independent component analysis problem, its models and the state-of-the-art algorithms, as well as of some interesting applications in the engineering area (section 2). After providing a brief overview of the remote sensing problem (section 3), an application of the ICA technique to synthetic aperture radar image enhancement is discussed with experiments on real-world data (section 4).

2 Overview of Independent Component Analysis

The number of available contributions in the independent component analysis field is motivated by the wide variety of different observed-signals models considered in the applied areas. An informal but sufficiently representative list of models is given by:

- Instantaneous or convolutional mixture: When the relationship describing the formation of the observed signals/data from the source/latent signals does not take into account the temporal structure of the latter, i.e. the linear mixing operator is constant over time, the model is termed instantaneous. Otherwise, a convolutional model can be considered, which describes the observed data/signals as the result of multiple linear filtering applied to the source/latent streams; in this case the linear mixing operator may be represented as a time-sequence of mixing matrices, a matrix of filters, or a matrix of functions in the frequency domain;

- Linear or non-linear mixing: In the simplest (though highly representative) case the mixing model is linear. Otherwise, some kinds of non-linear structures have been considered, such as the post-linear mixing model, in order to take into account e.g. the non-linear distortion introduced by the measurement systems;

- Real-valued or complex-valued models: In some applied fields, such as telecommunications, it is useful to treat the involved signals as complex-valued data-streams, thus dedicated algorithms have been developed for handling complex-valued models;

- Square, over- and under-determined mixtures: The number of available measures may be equal to the number of source signals in blind separation, in which case the mixture is termed square; from a theoretical point of view this hypothesis allows carrying out some useful mathematics, while from a practical point of view it is not always realistic. The cases where the number of sources is larger or smaller than the number of observations are properly represented by the over- and under-determined models. Similar considerations may be carried out when latent-variable models are dealt with;

- Noiseless or noisy mixtures: The signals model may or may not take into account the possible presence of additive disturbance affecting the measured signals;

- Stationary or non-stationary model: The hypothesis of having IID source or latent signals is not always realistic; real-world random signals may exhibit time-fluctuations of their statistical features, and this phenomenon may affect the performance of the independent component analysis if not properly taken into account. In opposition to stationary models, non-stationary ones try to capture the non-stationarity of the sources and to make the related ICA algorithm take advantage of this knowledge.

2.1 Overview of independent component analysis methods

The roots of basic ICA can be traced back to the work of Darmois in the 1950's and Rao in the 1960's, concerning the characterization of random variables in linear structure (Darmois, 1953; Kagan, Linnik & Rao, 1973). The pioneering work on independent component analysis was carried out by Jutten & Herault (1988; 1991), who introduced an adaptive algorithm in a feedback neuromimetic multiple filter. Their approach was further developed by Cichocki & Unbehauen (1996).
Later on, Comon (1994) formalized and developed the basic theory of independent component analysis, concerning the general conditions of existence, uniqueness and estimation indeterminacies; Comon also cast the ICA-computation problem as an optimization problem and proposed a class of cost functions termed discriminant contrasts. The contrast-based approach to ICA has been further studied by Comon & Moreau (1997), who introduced simplified contrasts for certain classes of signals (i.e. for signals with positive or negative kurtosis). On the basis of simplified contrasts (i.e. of kurtosis optimization), Cardoso & Laheld (1996) proposed an adaptive algorithm relying on the relative gradient, which has been proven to enjoy the very desirable property of equivariancy, that is, the source extraction quality in blind source separation is independent of the mixing matrix and, in particular, of its condition number. Also, Delfosse & Loubaton (1995) proposed an algorithm based on a deflation procedure, an idea that has recently been developed by Thawonmas, Cichocki & Amari (1998), resulting in a cascade neural network.

The deflation or sequential approach to ICA consists in extracting one component at a time from the original data, deflated from the previously extracted components, while non-sequential or parallel approaches try to extract all the required components concurrently. Parallel independent component analysis algorithms may suffer from convergence difficulties, owing to the complexity of the search space, and from computational burden, especially for large-dimensionality data; on the other hand, sequential algorithms may suffer from error accumulation from component to component. An experimental comparison presented in (Fiori, 2001b) suggests a hybrid sequential-parallel approach, i.e. the sequential extraction of groups of components.

Concurrently with statistical signal processing techniques, unsupervised learning rules based on information theory were proposed.
The aim was to maximize the mutual information between the inputs and the outputs of a neural network, so that each neuron matches features that are as statistically independent as possible from those of the other neurons. Nadal, Brunel & Parga (1998) showed that, in the low-noise case, the maximum of the mutual information between the input and the output of a network implies that the output joint probability density function (pdf) factorizes as a product of marginal probability density functions. Roth & Baram (1996) and Bell & Sejnowski (1996) independently devised stochastic gradient learning rules for ensuring mutual information maximization and applied them to forecasting and time-series analysis, and to blind separation of sources and blind deconvolution, respectively. Their approach was further developed by Yang & Amari (1997). Girolami & Fyfe (1997) employed neural exploratory projection pursuit algorithms for achieving separation. Generalized Hebbian learning algorithms for ICA have been developed by Karhunen, Oja, Wang, Vigario & Joutsensalo (1997), Hyvarinen & Oja (1998), Oja (1997) and Fiori (2000b; 2001b).

The techniques based on maximum likelihood estimation (Pham, Garrat & Jutten, 1992; Pearlmutter & Parra, 1996; Belouchrani & Cardoso, 1995), the recent proposal of Bayesian ICA (Knuth, 1998; Roberts, 1998) and maximum-a-posteriori (MAP) parameter estimation for blind source recovery (Parra, Mueller, Spence, Ziehe & Sajda, 2000) are of interest because they allow the ICA-computation problem to be cast in a way that properly takes into account the a-priori information on the sources or on the signal models. As a meaningful example, in blind localization of sources emitting pressure or electromagnetic waves, the known features of the propagation model and physical constraints may be inserted into the separation algorithm.
Also, Sagi, Nemat-Nasser, Kerr, Hayek, Downing & Hecht-Nielsen (2001) recently developed a new technique based on the `cortronic' neural network, a biologically-plausible neural model, while Welling & Weber (2001) proposed an EM-type learning procedure for ICA computation by neural networks.

The basic principle on which the MMI-ICA technique is based is that the mixing phenomena transform a set of statistically independent random signals into a set of statistically dependent signals. Thus, in order to attain separation, the neural network should learn how to make its outputs become as independent as possible; under proper consistency conditions, this ensures that the outputs coincide with the original independent sources, up to an arbitrary permutation of their order and scaling of the components. In principle, the independence of the network's outputs may be measured in terms of the joint probability density function of the output signals and of their marginal densities: If these signals are statistically independent, the joint probability density factorizes into the product of the marginal densities. A way to achieve separation is thus to define a measure of the mismatch between these two quantities, and a learning algorithm that adapts the network connection-matrix in order to minimize such disagreement. A useful and widely employed measure of statistical dependency is the mutual information between the network's output signals. Once an appropriate criterion has been defined as a function of the network's connection-matrix, the network learning phase may be formally conceived as an optimization procedure that searches for the connection pattern that minimizes or maximizes the criterion.
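The learning-as-optimization view described above can be illustrated with a small numerical sketch. It is not the algorithm of any specific paper cited here: it assumes the natural-gradient form of the update (Amari, 1998), a batch average in place of a stochastic one, and a hyperbolic-tangent nonlinearity, which is a common choice when the sources are super-Gaussian. The function names, the mixing matrix, the learning rate and the number of iterations are all hypothetical choices made for illustration.

```python
import numpy as np

def natural_gradient_ica(x, n_iter=500, lr=0.1):
    """Estimate a connection (unmixing) matrix W so that y = W @ x has
    components that are as independent as possible.

    x : array of shape (n_signals, n_samples), assumed zero-mean.
    """
    n, n_samples = x.shape
    identity = np.eye(n)
    W = np.eye(n)                      # start from the identity matrix
    for _ in range(n_iter):
        y = W @ x                      # current network outputs
        phi = np.tanh(y)               # assumed nonlinearity (super-Gaussian sources)
        # Batch natural-gradient step: the update vanishes when the sample
        # average of phi(y_i) * y_j equals 1 for i == j and 0 otherwise.
        W = W + lr * (identity - (phi @ y.T) / n_samples) @ W
    return W

def demo_natural_gradient(seed=0, n_samples=5000):
    """Mix two synthetic Laplacian sources and return the global matrix W @ A."""
    rng = np.random.default_rng(seed)
    s = rng.laplace(size=(2, n_samples))      # two independent sources
    s -= s.mean(axis=1, keepdims=True)        # remove the sample mean
    A = np.array([[1.0, 0.6], [0.5, 1.0]])    # hypothetical mixing matrix
    W = natural_gradient_ica(A @ s)
    return W @ A                              # global mixing-unmixing matrix
```

If separation succeeds, the matrix returned by `demo_natural_gradient()` has one dominant entry per row and per column, i.e. it is close to a scaled permutation matrix, which reflects the ordering and scaling indeterminacies mentioned in the text.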
As improvements to the original gradient-based MMI-ICA learning rules by Bell & Sejnowski (1996), several new optimization techniques, oriented to the efficient extraction of independent components, have recently been developed, such as the fixed-point technique (Hyvarinen & Oja, 1997), the natural-gradient technique by Amari (Amari, 1998; Yang & Amari, 1997), and the Riemannian gradient on the Stiefel manifold and the orthogonal group (Fiori, 2001a; Fiori, 2002b; Fiori, 2002d).

A problem arising in the implementation of MMI-ICA algorithms by neural networks is source adaptivity: The MMI principle is based on non-linear transformations, implemented by the neurons' activation functions, which depend on the components' probability density functions; these are of course unknown and need to be estimated from the network's outputs. Some adaptivity is achieved by the use of neurons with adaptive activation functions, realized with the help of the mixture-of-kernels technique (Xu, Cheung, Yang & Amari, 1998; Xu, Cheung & Amari, 1998), standard multilayer perceptrons (Taleb & Jutten, 1997), quasi-Dirac-kernel functions (Gustafsson, 1998) and quasi-polynomial flexible functions as well as discrete histograms (Fiori, 1999; Fiori, 2000a; Fiori, 2002c; Fiori & Bucciarelli, 2001).

Non-conventional neural optimization techniques have recently been applied to blind separation. As three examples, the present author employed mechanical-type learning algorithms for blind separation by ICA (recently extended to complex-valued sources) (Fiori, 2000b; Fiori, 2002a), Prieto & Puntonet (1998) and Puntonet & Prieto (1998) developed a purely geometrical approach to be applied in the presence of sources with unimodal, symmetrical density distributions, while Yoshioka & Omatu (1998) applied a genetic algorithm to minimize ICA cost functions. Other algorithms have been proposed from different perspectives by several authors.
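Of the optimization techniques listed above, the fixed-point iteration is compact enough to sketch in a few lines. The following is a minimal illustration under stated assumptions, not a reproduction of any cited implementation: it extracts a single component from pre-whitened data using the well-known kurtosis-based fixed-point step, in which the weight vector w is replaced by the sample mean of z (w'z)^3 minus 3w and then renormalized. The helper names, the synthetic uniform sources and the mixing matrix are hypothetical.

```python
import numpy as np

def whiten(x):
    """Center the data and transform it to have identity covariance."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    V = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T   # whitening matrix
    return V @ x, V

def fixed_point_unit(z, n_iter=100, seed=0):
    """Extract one component from whitened data z (n_signals, n_samples)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(z.shape[0])
    w /= np.linalg.norm(w)                 # random unit-norm starting vector
    for _ in range(n_iter):
        u = w @ z                          # current projection of the data
        w = (z * u ** 3).mean(axis=1) - 3.0 * w   # kurtosis-based fixed-point step
        w /= np.linalg.norm(w)             # project back onto the unit sphere
    return w                               # defined up to sign

def demo_fixed_point(seed=0, n_samples=5000):
    """Return |correlation| between the extracted signal and each source."""
    rng = np.random.default_rng(seed)
    s = rng.uniform(-1.0, 1.0, size=(2, n_samples))   # sub-Gaussian sources
    A = np.array([[1.0, 0.6], [0.5, 1.0]])            # hypothetical mixing matrix
    z, _ = whiten(A @ s)
    y = fixed_point_unit(z) @ z
    return np.abs(np.corrcoef(np.vstack([y, s]))[0, 1:])
```

The extracted signal should correlate strongly with exactly one of the two sources. Extracting further components would require a deflation step, i.e. removing the contribution of the components already found before repeating the iteration.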
It is in fact known that independent component analysis relies on some assumptions which limit its fields of application: Researchers have recently tried to overcome this problem by extending the classical algorithms to convolutional, under-determined, non-stationary, and non-linear mixtures.

The first goal that may be considered concerning non-stationarity in ICA consists in making ICA algorithms robust with respect to possible source non-stationarity. This has been achieved by extended versions of the classical Herault-Jutten algorithm for linear instantaneous mixtures and of the corresponding Nguyen-Jutten algorithm for convolutional mixtures. The proposed solution consists in normalizing the adaptation terms of these algorithms by short-term estimates of the powers of the estimated sources. This may also be seen as a time-varying adaptation gain, and it makes the algorithms able to automatically track the time-varying levels (power values) of the sources. This is of special interest for signals such as speech, where fixed-gain algorithms cannot combine fast convergence and good component extraction accuracy. Other related features result from this normalized approach, especially the ability to track time-varying mixtures. More details about this approach may be found in (Deville, Damour & Charkani, 1999), which introduces it in the linear instantaneous case and discusses the above case of unknown but fixed source levels, and in (Charkani & Deville, 1999a; 1999b), which present a much more complete approach for the convolutional case.

It has also been proposed to take advantage of the assumed non-stationarity of the considered sources. Two cases may be distinguished in this framework: We may still restrict ourselves to the case in which the number of observations is at least equal to the number of sources, or consider the under-determined case, i.e. the situation arising when the number of sources is larger than the number of observations.
In order to address the latter, more complex, case, the general concept of differential source separation was introduced (Deville & Savoldelli, 2001). A somewhat related way to tackle ICA problems consists in using the time-frequency representations of the signals. One then takes advantage of assumed specific properties of the time-frequency signatures of the sources, which appear for non-stationary sources. For linear instantaneous mixtures, a preliminary description of these methods appeared in (Abrard, Deville & White, 2001a; 2001b).

2.2 Relationships with projection pursuit

It is interesting to discuss the relationship between blind source separation by independent component analysis and the closely related statistical theory of projection pursuit (PP) (Friedman & Tukey, 1974; Huber, 1985). While it is true that the mathematics of ICA and PP are very closely related in the basic case, the following differences in the application, history, and also the theory of these methods deserve to be underlined:

- Independent component analysis was originally devised to solve a concrete separation problem, whereas projection pursuit was originally developed for visualization;

- In PP, the starting point is dimension reduction of a large-dimensional space. In ICA, the starting point is the estimation of components or latent variables (basic signals), and this is meaningful even in two dimensions;

- If the time-structure of the signals and the mixing system or signal model are taken into account, ICA is very different from PP.

These points are clearly illustrated in one of the original motivations for ICA (Jutten & Herault, 1991): Two nervous fibers transmit mixtures of signals related to joint stretch and stretch speed, and these two signals need to be separated by the central nervous system; finding interesting projections is hardly the proper approach in this case. Also, ICA basically searches simultaneously for many components with the criterion that they are mutually least dependent.
On the other hand, PP extracts components sequentially with the criterion that they be most interesting (non-Gaussian). There is no penalty for extracting (nearly) the same component again; instead, what is usually done is to subtract the extracted component from the data (through a regression) to avoid it being extracted again. However, if a component is not exactly extracted, it will contaminate all subsequent steps in PP; this problem is in fact encountered in sequential ICA as well.

In blind source separation by ICA, the model of the data can be taken into account: After separation is completed, as a by-product we could also obtain an estimate of the mixing model. However, the latter operation is not trouble-free; apart from numerical/precision/size problems, there are the well-known indeterminacies: The separation algorithm causes an arbitrary re-ordering of the source signals and, due to the multiplicative nature of the model, scale factors and phase rotations (sign switches, in the real-valued case) appear. This is reflected in some indeterminacy in the elements of the estimated model. These problems are fundamental, in the sense that there is no way to suppress them. In fact, in many algorithms it is supposed that the sources have unit powers, so that the `true' powers result in a scaling of the mixing matrix columns. However, it is quite natural to suppose that if we know in advance that the mixing matrix has some special structure (i.e. induced by the physical properties of the problem at hand), it could be possible to remove some indeterminacy. In blind source localization by ICA, for instance, it is possible to solve for the mixing matrix; however, at some point the mixing matrix coefficients become redundant, as there will be fewer parameters describing the source positions than mixing matrix elements. Using a Bayesian methodology, one can derive more powerful algorithms that solve for these parameters rather than for the elements of the mixing matrix.
As mentioned, in this case the algorithm provokes a crossover from the source separation problem to the problem of source localization. The two are intimately related: Depending on the prior information possessed and the information sought, we obtain either a source separation algorithm, a source localization algorithm, or a simultaneous source separation and localization algorithm.

2.3 A pathway to geometric integration

In orthonormal independent component analysis (by pre-whitening), as well as in many unsupervised learning theories, the common way to design a learning algorithm arises from the following considerations. The lack of fundamental information in the engineering problems which require the use of unsupervised neural systems is overcome by introducing concepts from high-order statistics, information theory and optimization. These three mathematical theories turn out to be closely related and are widely employed in the literature in order to cope with blind signal processing problems. By making use of the concepts of information theory, in fact, it is possible to define suitable objective functions which describe in informational terms the processing problem under analysis. Mathematical statistics then gives some formal instruments for elaborating these objective functions, and optimization theory constitutes the analytical basis for synthesizing global functions containing any physical constraints arising from the problem, and for determining learning algorithms endowed with the necessary requisites of convergence speed, steady-state precision, and sensitivity to external disturbances.

The final result of the preceding design work is often a set of learning differential equations, written in matrix form, whose solution gives the learning trajectory, and where the mentioned physical constraints are what mathematicians refer to as invariants associated with the differential equations.
In order to respect the invariants, that is, to preserve the qualitative characteristics of the solutions, the learning initial-value problems should be integrated numerically in a proper and efficient way. In particular:

- Properness: The differential equations should be integrated in a way that preserves the invariants, both in order to ensure the quality of the signal processing solution provided by the neural system and to preserve some quantitative features of the learning theory such as intrinsic stability (for an expanded discussion see Fiori (2002d));

- Efficiency: An integration method that takes into account the structure of the matrix-type expressions involved in the learning equations might possess contained computational complexity (Fiori & Celledoni, 2002).

The proper mathematical setting for discussing and formalizing the mentioned topics is Geometric Integration (GI). The classical efforts in numerical analysis have been directed at modeling physical phenomena by algorithms that produce sufficiently accurate and affordable numerical approximations of their behavior. Geometric integration is also concerned with producing numerical approximations that preserve the qualitative attributes of the solutions: Some examples of GI algorithms for differential equations include Lie group integrators, volume and energy preserving integrators, integrators preserving first integrals, Lyapunov functions and Casimirs, as well as Lagrangian and variational integrators (Hairer, Lubich & Wanner, 2002). Preliminary and encouraging results in the field of independent component analysis in the presence of linear mixtures of real-world signals have been obtained recently, as reported in (Fiori & Celledoni, 2002; Fiori & Rossi, 2002).

3 Remote Sensing and Synthetic Aperture Radar (SAR)

Remote sensing is the science of acquiring information about material objects, areas, or phenomena, without coming into physical contact with the objects, areas, or phenomena under investigation.
In the absence of direct contact, some means of transferring information through space is necessary: In remote sensing, information transfer is accomplished by the use of electromagnetic radiation. On the basis of the physical type of energy source used, we distinguish between passive remote sensing, which makes use of sensors that detect the reflected or emitted electromagnetic radiation from natural sources, and active remote sensing, which exploits sensors that detect reflected responses from objects that are irradiated by artificially generated energy sources, such as radar. A schematic of common remote sensing platforms with on-board radar sensors is depicted in Figure 1.

Figure 1: Schematic of common platforms with radar sensors on board. (Reprinted, with permission, from www.gisdevelopment.net).

Remote sensing measurements are recorded in digital form and then digitally processed in order to produce images for interpretation purposes. Variations in the scene's characteristics are represented as variations in brightness on the recorded images: usually a part of a scene that reflects more energy appears bright, while a part of the same scene that reflects less energy appears darker. Digital images consist of discrete pixels, whose intensity value represents the average radiance of a relatively small area within a scene, where the size of the elemental area affects the reproduction of details within the scene; digital image processing techniques are then necessary to retrieve the desired information from the raw images. Digital image processing is a collection of techniques for the manipulation of digital images by computers, which encompass the operations of image data manipulation and management, geometric and radiometric corrections, enhancement of image quality, and information extraction. Image processing methods may be grouped into four functional categories: Image management, image restoration, image enhancement, and information extraction.
These categories may be briefly described as follows:

• Image management: Image acquisition and storage is the first step in image analysis. In particular, storing pictorial data requires special attention because of the large size of this kind of data, which makes it useful to perform some kind of compression and size reduction in order to save storage space. Natural images are usually highly redundant (correlated), i.e. parts of them can be reconstructed from the knowledge of a few features; principal component analysis (or Karhunen-Loève transform) is a widely known technique that allows representing real-world data by a small number of uncorrelated features (for a recent review see e.g. Costa and Fiori (2001)).

• Image restoration: The process of image acquisition may degrade the observed scenes because of measurement errors, for example the blur effect caused by an out-of-focus optical system, and of external causes, such as the opacity of the medium through which the images are acquired. Image restoration procedures aim to improve the quality of recorded scenes by removing the noise and distortion effects introduced by the imaging system (for a recent review see e.g. Kundur and Hatzinakos (1996)).

• Image enhancement: Even if perfectly reproduced, a digital image may not yet be suitable for further processing, because its intrinsic quality does not reveal the interesting features of the phenomena it corresponds to, nor does it emphasize the information content it carries. To this purpose, image enhancement techniques have been developed; they usually introduce quite subjective warping of the original data in order to emphasize certain attributes, such as brightness, to calibrate other attributes, such as saturation, and to equalize other features, such as hue (for a recent review see e.g. Fiori, Grimani and Burrascano (2002)).
• Information extraction: The most "intelligent" operation to be performed on the basis of the available restored/enhanced data is the extraction of relevant information, which facilitates scene interpretation. The subject of feature extraction, ranking and selection is covered by the pattern analysis research field.

In the present paper we focus our attention on radar-based remote sensing and, in particular, on synthetic aperture radar.

3.1 Synthetic aperture radar

Synthetic Aperture Radar (SAR) is an active microwave device that produces high-resolution imagery of the Earth's surface (for a recent review see e.g. Hogda, Guneriussen & Lauknes (2002), Lou (2002), Brown & Bennett (2001), and Jao (2001)). Well-known past and present spaceborne Earth observation systems are ERS-1/ERS-2, JERS-1, the Shuttle imaging radar SIR-C/X-SAR, and RADARSAT. An example of their use is the ESA/Eurimage "Earthwatch" program, which produces imagery of natural and artificial disasters when weather conditions prevent other forms of surveillance from being profitable. Two important properties distinguish SAR from classical optical imagery:

• The SAR is an active device: It generates its own illumination of the scene to be viewed. The illumination is coherent: All the light in any flash is exactly in phase, as in a laser, so it does not disperse over the distance between the satellite and the Earth's surface. A SAR device can measure both the intensity and the phase of the reflected light, which makes it highly sensitive to textures. Experiments with the technique of interferometry have shown that SAR can accurately model relief, and it also appears able to reveal small changes over time.
Some consequences are that it works by day and by night alike; it can be used to gain additional information with respect to optical imagery, especially when different polarizations are available on the same platform; it needs more power than passive sensors to be operated (and can therefore only operate intermittently); and it suffers from speckle, an artifact of interference patterns in coherent light.

• The SAR is a radar: It uses microwave-frequency radiation, which penetrates cloud and haze, so it views the Earth's land and sea surface in all weather, which is the major advantage of SAR for general-purpose remote sensing.

The first implementation of radar interferometry came in Earth-based observations of Venus (Rogers & Ingalls, 1969). The first reported experiments to determine terrain elevation of the Earth were by Graham (1974). Ten years later, interferometric radar experiments on the airborne system Convair-990 and on the spaceborne systems Seasat and SIR-B took place. Since 1990 the interest in SAR interferometry has grown owing to the impressive amount of data suitable for interferometry from ERS-1 and from the many airborne systems available, such as AIRSAR. SAR images are suitable for studying vegetation, as well as ocean waves, winds, currents, seismic activity and moisture content. In practice, by properly processing the complex SAR images, it is possible to obtain high-resolution topographic maps (5 m or less height resolution), measure very small (1 cm or less) Earth surface motion over large swaths, measure water surface currents (with an accuracy around 5 cm/s) and classify land surfaces. In particular, the following main applications are worth citing:

• Ground topography: Studies on volcanology and Earth surface motion related to differential SAR interferometry are well known. ERS-1 supports the evaluation of digital elevation models with a grid spacing of about 50 m and a height accuracy of about 5 m.
AIRSAR and Do-SAR allow the evaluation of digital elevation models with a grid spacing smaller than 10 m and a height accuracy around 1 m. SAR-derived digital elevation models (DEMs) have a big impact in the field of topography: They are replacing the stereo DEMs derived from optical systems (Hogda, Guneriussen & Lauknes, 2002).

• Ocean surface current measurements: Ocean surface currents having speeds of less than 4 cm/s have been observed by Goldstein & Zebker (1987).

• Earth surface motion detection: Massonet (1993) shows the possibility of measuring the residual displacement caused by earthquakes. Differential interferometry is clearly validated for the long-term survey of slow faults (typically 10 mm/year measured with ERS-1). This is a remarkable result because, owing to the motion errors of the aircraft, it is very difficult to implement differential interferometry in an operational way.

• Land surface classification: By carrying out repeat-pass interferometry, coherence maps and change detection of SAR images can be used to infer properties of land surfaces. Results using ERS-1 data show the capability of classifying forest, open fields, urban areas and open water (Lin, Alpers, Khoo, Lim, Lim & Kasilingam, 2001).

An exemplary commercial SAR system is depicted in Figure 2: It is a Predator Lynx synthetic aperture radar system.

3.2 Some details on imaging radar and SAR

An imaging radar uses an antenna and a digital computer to store the acquired images. A radar image is generated only by the light that is reflected back towards the antenna. Radar measures the strength and round-trip time of the microwave signals that are emitted by an antenna and reflected from a distant surface or object.
Its antenna alternately transmits and receives pulses at particular microwave wavelengths (in the range of 1 cm to 1 m, which corresponds to a frequency range of about 300 MHz to 30 GHz) and polarizations (waves polarized in a single vertical or horizontal plane). About 1500 high-power pulses per second are transmitted towards the target, with each pulse having a duration of typically 10-50 microseconds. Each pulse normally spans a small band of frequencies, centered on the frequency selected for the radar; typical bandwidths are in the range 10 to 200 MHz. At the Earth's surface, the energy content of the incoming radar pulse is scattered in all directions, and only a fraction of it is reflected back towards the antenna. This energy returns to the radar as a weaker radar echo and is received by the antenna in a specific polarization (not necessarily the same as that of the transmitted pulse). Since the radar pulse travels at the speed of light, it is relatively straightforward to use the measured round-trip time of a particular pulse to calculate the distance, or range, to the reflecting object.

Figure 2: A Predator Lynx synthetic aperture radar system. (Reprinted, with permission, from www.lynxsar.com.)

In the case of imaging radar, the radar moves along a flight path and the area illuminated by the radar, or footprint, is moved along the surface, building the image (see Figure 3). The chosen pulse bandwidth determines the resolution in the range direction (higher bandwidth means finer resolution in this dimension), while the length of the radar antenna determines the resolution in the azimuth direction of the image (the longer the antenna, the finer the resolution in this dimension). In an imaging radar, the term "aperture" means the opening used to collect the reflected energy. Because the radar is moving with respect to the ground, the returned echoes are Doppler-shifted (positively when the radar approaches a target, negatively when it moves away).
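The relations between round-trip time and range, bandwidth and range resolution, and wavelength and frequency mentioned above can be checked with a few lines of arithmetic. The sketch below uses the standard textbook approximations (range = c t / 2, range resolution = c / (2 B), real-aperture azimuth resolution roughly equal to wavelength times range divided by antenna length); the function names and the sample numbers are illustrative assumptions, not values taken from a specific sensor.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def slant_range(round_trip_time_s):
    """Distance to the reflecting object from the pulse round-trip time."""
    return C * round_trip_time_s / 2.0

def range_resolution(bandwidth_hz):
    """Standard range-resolution approximation: a wider bandwidth gives a finer resolution."""
    return C / (2.0 * bandwidth_hz)

def frequency_from_wavelength(wavelength_m):
    """Carrier frequency corresponding to a given wavelength."""
    return C / wavelength_m

def real_aperture_azimuth_resolution(wavelength_m, slant_range_m, antenna_length_m):
    """Approximate azimuth resolution of a real (non-synthetic) aperture:
    a longer antenna gives a finer resolution."""
    return wavelength_m * slant_range_m / antenna_length_m

if __name__ == "__main__":
    for bandwidth in (10e6, 200e6):          # bandwidth range quoted in the text
        print(f"B = {bandwidth / 1e6:.0f} MHz -> range resolution "
              f"{range_resolution(bandwidth):.2f} m")
    for wavelength in (0.01, 1.0):           # wavelength range quoted in the text
        print(f"wavelength = {wavelength} m -> "
              f"{frequency_from_wavelength(wavelength) / 1e9:.2f} GHz")
```

Under these approximations, the quoted bandwidths of 10 to 200 MHz correspond to range resolutions of roughly 15 m down to about 0.75 m, and a real antenna of practical length at orbital range yields a coarse azimuth resolution, which is the limitation that aperture synthesis addresses.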
Comparing the Doppler-shifted frequencies to a reference frequency allows returned signals to be "focused" on a single point, effectively increasing the length of the antenna. SAR data focusing consists in correctly matching the variation in Doppler frequency for each point in the image; this operation requires a precise knowledge of the relative motion between the platform and the imaged objects.

Figure 3: Exemplary representation of SAR footprint.

Some SARs can transmit pulses in either horizontal (H) or vertical (V) polarization and receive in either H or V modes, with the resultant combinations of HH (Horizontal transmit, Horizontal receive), VV, HV, or VH. Additionally, some SARs can measure the phase of the incoming pulse and therefore measure the phase difference (in degrees) in the return of the HH and VV signals. This difference is frequently regarded as an indicator of the structural characteristics of the areas or objects under observation. These SARs can also measure the correlation coefficient for the HH and VV returns, which can be considered as a measure of how alike the portions of the areas or objects are.

Radar images are composed of many dots, or picture elements (pixels). Each pixel in the radar image represents the radar backscatter for an area on the ground: Bright areas represent high backscatter (bright features mean that a large fraction of the radar energy was reflected back to the radar), while darker areas in the image represent low backscatter (dark features imply that very little energy was reflected back to the antenna). Backscatter for a target area at a particular wavelength varies with several conditions, such as the size of the scatterers in the target area, the moisture content of the target area, the polarization of the pulses, the values of the emitted wavelengths, and the observation angles. A rule that helps in interpreting radar images is that the brighter the backscatter on the image, the rougher the surface being imaged.
Flat surfaces that reflect little microwave energy always appear dark in radar images. Vegetation is usually moderately rough on the scale of most radar wavelengths and appears gray in a radar image. Some areas not illuminated by the radar, like the back slope of mountains, are in shadow and appear dark. Roads and freeways are flat surfaces, so they appear dark. Backscatter is also sensitive to the target's electrical properties, such as water content: Wetter objects appear bright and drier targets appear dark (with the exception of smooth bodies of water, which behave as flat surfaces and reflect incoming pulses away, and thus appear dark). Backscatter also varies with the polarization used and with the observation angle: Low incidence angles (perpendicular to the surface) result in high backscatter, which decreases with increasing incidence angle.

4 Application of ICA to SAR imagery processing and Experimental Results

In this section it is shown how independent component analysis by neural networks may be advantageously employed to enhance remote sensing data on real-world SAR imagery.

4.1 Independent component analysis applications to remote sensing

Very recently there has been interest in the use of independent component analysis methods for remote sensing. The use of principal component analysis for SAR imagery and remote sensing in general has been examined and is well understood (Fiori & Piazza, 2000; Costa & Fiori, 2001; Cheng & Miller, 2002), while much less work has been carried out on the use of ICA in image analysis. The existing contributions in this field mainly concern the following three topics:

• Speckle reduction: One role played by independent component analysis in synthetic aperture radar data processing is image speckle reduction: The SAR data can be considered as a mixture of target signals and some complex noises. ICA methods may be used to separate the noise from the signals as much as possible.
• Feature extraction: Perhaps the most important role of independent component analysis in SAR imagery processing is feature extraction for classification. Selected independent components may be used to form feature vectors for classification. ICA has been found to perform better than PCA, as a comparable classification percentage is achieved with fewer independent components than principal components (Chen & Zhan, 2000).

• Data fusion: Another important role of independent component analysis is in SAR data fusion, which makes it possible to exploit the data redundancy provided by radar measurements. When several sensors are available to observe the same area or objects, the best extracted component images may be merged using appropriate criteria. It has been shown that the image obtained can be more informative than the original images in any spectral band (Chen & Zhan, 2000).

Other interesting notes on the theory and applications of neural ICA to remote sensing may be found in the recent book by Cichocki and Amari (2002). In the present paper, we consider the problem of noise removal by independent component extraction. The algorithms were implemented in the Matlab language and ran on a 500 MHz, 64 MB machine.

4.2 SAR data and results of noise removal by ICA

The considered image data are from an agricultural area near the village of Feltwell, United Kingdom, and consist of 9 channels of SAR images: The data comprise three frequency bands (denoted by c, p, and l, respectively); in each band there are three different polarizations (HH, HV, and VV); the available SAR images are depicted in Figure 4.

Figure 4: Nine channels of SAR images of an agricultural area in the United Kingdom (panels: c-HH, c-HV, c-VV, l-HH, l-HV, l-VV, p-HH, p-HV, p-VV).

It is interesting to observe the differences in the recorded images for the different frequency bands: It clearly appears
that SAR is sensitive to different features of the observed areas depending on the frequency of the emitted electromagnetic waves. The practical application of ICA algorithms to these 2-dimensional data requires the following pre-processing steps:

1. Each image is a 250 × 250 pixel bitmap that should first be vectorized; the set of available images thus gives rise to a 9 × 62500 data matrix;

2. The data matrix may be linearly normalized in order to transform the range of pixel intensity from the natural domain into a conventional data domain, for instance $[-1, +1]$; in the present case, this operation is not necessary and is therefore omitted.

3. The normalized data should now be subjected to pre-whitening, which consists in removing first-order and second-order statistics from the data (i.e. centering the data and transforming them so that their covariance matrix becomes the identity). Some ICA algorithms explicitly require this operation to be performed, while other known algorithms do not need pre-whitening; however, it has been experimentally observed that whitening facilitates the extraction of independent components, because it facilitates convergence (Giannakopoulos, Karhunen & Oja, 1999); also, the analysis of the covariance matrix of the data, and in particular of its spectrum, may reveal some interesting properties of the dimension of the data's information structure.

4. ICA applied to the whitened data allows extracting the desired number of independent components. In the present context the model of the data is linear, real-valued, and instantaneous.

5. Re-normalization of the extracted components and restoration of the original 2-dimensional size of the data then allows the results to be visualized. In the present context we perform blind separation of a SAR image from noises, thus it is expected that the obtained results have a recognizable meaning.

On the basis of the above-recalled algorithm, we proceeded first to data preparation and pre-processing.
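The vectorization and pre-whitening steps listed above can be sketched as follows. This is a minimal illustration in Python with NumPy rather than the Matlab code used for the experiments, and it runs on synthetic stand-in data (a hypothetical random mixture of four latent images), since the Feltwell images are not reproduced here; the function names `vectorize` and `whiten` are assumptions introduced for this example.

```python
import numpy as np

def vectorize(images):
    """Stack equally sized 2-D images into a (channels x pixels) data matrix."""
    return np.stack([np.asarray(img, dtype=float).ravel() for img in images])

def whiten(X, n_components):
    """Centre the rows of X, project onto the leading eigenvectors of the
    covariance matrix and rescale, so that the output has identity covariance.
    Returns the whitened data and the full eigenvalue spectrum (descending)."""
    Xc = X - X.mean(axis=1, keepdims=True)           # remove first-order statistics
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    V = eigvecs[:, order] / np.sqrt(eigvals[order])  # scale each kept eigenvector
    return V.T @ Xc, eigvals[::-1]

# Synthetic stand-in for the 9 SAR channels (assumption: 4 latent images plus weak noise).
rng = np.random.default_rng(0)
latent = rng.laplace(size=(4, 250 * 250))
mixing = rng.standard_normal((9, 4))
images = [(mixing[i] @ latent + 0.01 * rng.standard_normal(250 * 250)).reshape(250, 250)
          for i in range(9)]

X = vectorize(images)                     # 9 x 62500 data matrix
Z, spectrum = whiten(X, n_components=4)   # 4 x 62500 whitened, reduced data
cov_Z = Z @ Z.T / Z.shape[1]

print("data matrix shape:", X.shape)
print("eigenvalue spectrum:", np.round(spectrum, 3))
print("max deviation of whitened covariance from identity:",
      np.max(np.abs(cov_Z - np.eye(4))))
```

Inspecting `spectrum` plays the role of the eigenvalue analysis mentioned in step 3: a clear gap between large and small eigenvalues suggests how many principal directions are worth keeping before ICA is applied.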
The whitening operation, in particular, is quite meaningful: The 9 × 9 data-covariance matrix has well-ordered eigenvalues, as clearly shown in Figure 5; the fact that 4 eigenvalues out of 9 are significantly larger than the remaining ones leads to the conclusion that, by projecting the data onto the basis formed by the four principal eigenvectors, a significant data-size reduction may be achieved with a negligible loss of information. The resulting transformed data matrix then has size 4 × 62500. The smallest retained (non-zero) eigenvalue is 375.06 and the largest retained (non-zero) eigenvalue is 3536.62. Pre-whitening makes the covariance matrix of the whitened data an identity: In this case the actual covariance matrix of the reduced-size data differs from the identity by about $2.2 \times 10^{-13}$.

Figure 5: Eigenvalues of the 9 × 9 SAR-data covariance matrix (vertical axis: eigenvalue; horizontal axis: eigenvalue order).

The whitened data were then processed by three different independent component analysis algorithms:

• The dICA+ software (Frulloni & Fiori, 2002) is based on a cascade neural network by Thawonmas et al. (1999); this method extracts the components sequentially and iterates over the whole data stream several times in order for the active neuron in the cascade to properly extract each source signal.

• The FastICA software is based on a generalized-Hebbian learning theory and fixed-point optimization (Hyvärinen & Oja, 1997); it has been used in parallel mode.

• The MEC learning rule is based on a rigid-body learning theory applied to kurtosis extremization (Fiori, 2002b); it extracts the independent components in a parallel way.

The final result of the processing is shown in Figure 6 for dICA+, Figure 7 for FastICA, and Figure 8 for the MEC-ICA algorithm: Four independent components were extracted, and one of them clearly appears as the true SAR image.
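To give a flavor of what a parallel fixed-point extraction on whitened data looks like, the sketch below implements a kurtosis-based fixed-point iteration with symmetric orthogonalization, in the spirit of the fixed-point approach of Hyvärinen & Oja (1997). It is not the FastICA package, nor the dICA+ or MEC algorithms used in the experiments; the function names, the toy sources and the mixing matrix are assumptions made for illustration only.

```python
import numpy as np

def sym_orth(W):
    """Symmetric orthogonalisation W <- (W W^T)^(-1/2) W, computed via the SVD."""
    U, _, Vt = np.linalg.svd(W)
    return U @ Vt

def kurtosis_fixed_point_ica(Z, n_iter=200, tol=1e-10, seed=0):
    """Parallel kurtosis-based fixed-point iteration on whitened data Z (k x N).
    Returns an orthogonal matrix W whose rows are the estimated demixing vectors."""
    k, N = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_orth(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        Y = W @ Z
        # Row-wise update w <- E{z (w^T z)^3} - 3 w, followed by re-orthogonalisation.
        W_new = sym_orth((Y ** 3) @ Z.T / N - 3.0 * W)
        # Converged when every row is unchanged up to its sign.
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return W

# Toy demonstration with three unit-variance, non-Gaussian sources (assumed data).
rng = np.random.default_rng(1)
N = 20000
S = np.vstack([
    rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), N),      # sub-Gaussian source
    rng.laplace(scale=1.0 / np.sqrt(2.0), size=N),    # super-Gaussian source
    rng.choice([-1.0, 1.0], size=N),                  # binary source
])
A = np.array([[1.0, 0.5, 0.3],
              [0.4, 1.0, 0.6],
              [0.2, 0.7, 1.0]])                       # hypothetical mixing matrix
X = A @ S

# Pre-whitening of the mixtures.
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.T / N)
V = E / np.sqrt(d)
Z = V.T @ Xc

W = kurtosis_fixed_point_ica(Z)
G = W @ V.T @ A    # global mixing-demixing matrix; ideally a signed permutation
print(np.round(G, 2))
```

When separation succeeds, each row of `G` has a single dominant entry, meaning that each output follows one source up to sign and ordering; these are the usual indeterminacies of ICA, and they are the reason why the meaningful component appears at a different index for each algorithm.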
In the three experiments, the extracted meaningful component exhibits a better quality than the original SAR images. In particular, for the dICA+ algorithm, the component IC1 looks like the best (least noisy) one, while the others appear as complex noises. The algorithm autonomously ran for 83, 59, 32 and 10 iterations over the data set for the first, second, third and fourth neuron in the cascade, respectively, with a total computation time of about 2 minutes and 6 seconds. For the FastICA algorithm, the component IC2 looks like the best one, the others being complex noises. The algorithm autonomously ran for 21 iterations over the whole data set in the parallel mode, with a total computation time of about 1 minute and 36 seconds. For the MEC-ICA algorithm, the component IC4 looks like the least noisy one. The shown results pertain to the use of just the first 20,000 learning data out of the 62,500 available, and the algorithm ran for about 16 seconds. The number of learning steps was chosen by observing the two curves characterizing the MEC theory, namely the network's kinetic and potential energy (Fiori, 2002b); they are shown in Figure 9.

Figure 6: Final result of ICA elaboration by the dICA+ software (panels: IC1, IC2, IC3, IC4).

Figure 7: Final result of ICA elaboration by the FastICA software (panels: IC1, IC2, IC3, IC4).

Figure 8: Final result of ICA elaboration by the MEC-ICA algorithm (panels: IC1, IC2, IC3, IC4).

Figure 9: MEC network's kinetic and potential energies during learning over the first 20,000 data-points out of the 62,500 available (horizontal axis: learning step).

5 Conclusion

The aim of this work was to present an overview of the independent component analysis technique based on neural networks, providing a state-of-the-art survey of the theoretical streams in this research area and a brief review
of some applications of engineering interest. We also focused on a particular application dealing with synthetic aperture radar imagery processing, briefly reviewed the features and main applications of synthetic aperture radar, and showed how independent component analysis by neural networks may be used to remove noise from these remote sensing data.

References

S.A. Abdallah and M.D. Plumbley (2001). If the independent components of natural images are edges, what are the independent components of natural sounds?, Proc. of the International Workshop on Independent Component Analysis and Blind Separation (ICA'01), pp. 534 - 539
F. Abrard, Y. Deville, P. White (2001a). A new source separation approach based on time-frequency analysis for instantaneous mixtures, Proceedings of the 5th International Workshop on Electronics, Control, Modeling, Measurement and Signals (ECM2S'2001), pp. 259 - 267, Toulouse, France
F. Abrard, Y. Deville, P. White (2001b). From blind source separation to blind source cancellation in the underdetermined case: a new approach based on time-frequency analysis, Proc. of the International Workshop on Independent Component Analysis and Blind Signal Separation (ICA'2001), pp. 734 - 739
F. Acernese, A. Ciaramella, S. De Martino, R. De Rosa, M. Falanga, and R. Tagliaferri (2002). Neural Networks for Blind Source Separation of Stromboli explosion quakes, IEEE Transactions on Neural Networks. Accepted for publication
F. Acernese, F. Barone, R. De Rosa, A. Eleuteri, L. Milano, and R. Tagliaferri (2001). A neural network based identification system for VIRGO Seismic noise, Proceeding of IJCNN'2001, IEEE Press, Vol. 1, pp. 252 - 257
S.-i. Amari (1998). Natural Gradient Works Efficiently in Learning, Neural Computation, Vol. 10, pp. 251 - 276
M. Andretta, A. Eleuteri, F. Fortezza, D. Manco, L. Mingozzi, R. Serra, and R. Tagliaferri (2000).
Neural networks for sulfur dioxide ground level concentrations forecasting, Neural Computing and Applications, Vol. 9, pp. 93 - 100
S. Andreon, N. Capuano, G. Gargiulo, G. Longo and R. Tagliaferri (2000). Wide Field Imaging. I. Applications of Neural Networks to Object Detection and Star/Galaxy Classification, Monthly Notices of the Royal Astronomical Society, Vol. 319, pp. 700 - 716
A.J. Bell and T.J. Sejnowski (1997). The `independent components' of natural scenes are edge filters, Vision Research, Vol. 37, No. 23, pp. 3327 - 3338
A.J. Bell and T.J. Sejnowski (1996). An Information Maximisation Approach to Blind Separation and Blind Deconvolution, Neural Computation, Vol. 7, No. 6, pp. 1129 - 1159
A. Belouchrani and J.-F. Cardoso (1995). Maximum Likelihood Source Separation by the Expectation-Maximization Technique: Deterministic and Stochastic Implementation, Proc. of International Symposium on Non-Linear Theory and Applications (NOLTA), pp. 49 - 53
S.C.M. Brown and J.C. Bennett (2001). Reduced bandwidth algorithm for near field imaging of tree targets using synthetic aperture radar, IEE Proceedings - Radar, Sonar and Navigation, Vol. 148, No. 5, pp. 259 - 266
J.-F. Cardoso (1998). Blind signal separation: statistical principles, Proceedings of the IEEE (Special issue on "Blind identification and Estimation", R.-W. Liu and L. Tong, Ed.s), Vol. 90, No. 8, pp. 2009 - 2026
J.-F. Cardoso (1999). High-order contrasts for independent component analysis, Neural Computation, Vol. 11, No. 1, pp. 157 - 192
J.-F. Cardoso and B. Laheld (1996). Equivariant Adaptive Source Separation, IEEE Trans. Signal Processing, Vol. 44, No. 12, pp. 3017 - 3030
N. Charkani, Y. Deville (1999a). Self-adaptive separation of convolutively mixed signals with a recursive structure. Part I: stability analysis and optimization of asymptotic behaviour, Signal Processing, Vol. 73, No. 3, pp. 225 - 254
N. Charkani, Y. Deville (1999b).
Self-adaptive separation of convolutively mixed signals with a recursive structure. Part II: Theoretical extensions and application to synthetic and real signals, Signal Processing, Vol. 75, No. 2, pp. 117 - 140
C.H. Chen and X. Zhan (2000). On the roles of independent component analysis in remote sensing, Proc. of Progress in Electromagnetics Research Symposium (PIERS'2000), Cambridge (MA, USA)
J. Cheng and E. Miller (2002). Model-based principal component techniques for detection of buried landmines in multiframe synthetic aperture radar images, 2002 IEEE International Geoscience and Remote Sensing Symposium (IGARSS '02), Vol. 1, pp. 334 - 336
A. Cichocki and J. Cao (1998). A self-adaptive neural network for on-line blind separation of convolved sources, Topics on Non-destructive Evaluation Series, Eds. B.B. Djordjevic and H.D. Reis, Vol. 3, The American Society for Non-destructive Evaluation Testing, Inc., pp. 207 - 212
A. Cichocki and R. Unbehauen (1996). Robust neural networks with on-line learning for blind identification and blind separation of sources, IEEE Trans. Circuit and Systems-I, Vol. CAS-43, pp. 894 - 906
A. Cichocki, J. Karhunen, W. Kasprzak, R. Vigario (1999). Neural Networks for Blind Separation with Unknown Number of Sources, Neurocomputing, Vol. 24, pp. 55 - 93
A. Cichocki and S.-i. Amari (2002). Adaptive Blind Signal and Image Processing, J. Wiley & Sons
R.M. Clemente and J.I. Acha (1997). Blind separation of sources using a new polynomial equation, Electronics Letters, Vol. 33, No. 3, pp. 176 - 177
P. Comon (1994). Independent Component Analysis, A new concept?, Signal Processing, Vol. 36, pp. 287 - 314
P. Comon and E. Moreau (1997). Improved Contrast Dedicated to Blind Separation in Communications, Proc. International Conference on Acoustics, Speech and Signal Processing, pp. 3453 - 3456
S. Costa and S. Fiori (2001).
Image Compression Using Principal Component Neural Networks, Image and Vision Computing Journal (special issue on "Artificial Neural Network for Image Analysis and Computer Vision"), Vol. 19, No. 9-10, pp. 649 - 668
G. Darmois (1953). Analyse générale des liaisons stochastiques, Rev. Inst. Internat. Stat., Vol. 21, pp. 2 - 8
N. Delfosse and P. Loubaton (1995). Adaptive Blind Separation of Independent Sources: A Deflation Approach, Signal Processing, Vol. 45, pp. 59 - 83
Y. Deville, J. Damour, N. Charkani (1999). Improved multi-tag radiofrequency identification systems based on new source separation neural networks, Proceedings of the International Workshop on Independent Component Analysis and Blind Signal Separation (ICA'99), pp. 449 - 454, Aussois, France
Y. Deville, S. Savoldelli (2001). A second-order differential approach for underdetermined convolutive source separation, Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2001), Salt Lake City, USA, May 7-11
M. Dowd, P.W. Vachon, F.W. Dobson, and R.B. Olsen (2001). Ocean wave extraction from RADARSAT synthetic aperture radar inter-look image cross-spectra, IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 1, pp. 21 - 37
S. Fiori (1999). Entropy Optimization by the PFANN Network: Application to Independent Component Analysis, Network: Computation in Neural Systems, Vol. 10, No. 2, pp. 171 - 186
S. Fiori and F. Piazza (2000). A General Class of ψ-APEX PCA Neural Algorithms, IEEE Transactions on Circuits and Systems - Part I, Vol. 47, No. 9, pp. 1394 - 1398
S. Fiori (2000a). Blind Signal Processing by the Adaptive Activation Function Neurons, Neural Networks, Vol. 13, No. 6, pp. 597 - 611
S. Fiori (2000b). Blind Separation of Circularly Distributed Source Signals by the Neural Extended APEX Algorithm, Neurocomputing, Vol. 34, No. 1-4, pp. 239 - 252
S. Fiori and P. Bucciarelli (2001).
Probability Density Estimation Using Adaptive Activation Function Neurons, Neural Processing Letters, Vol. 13, No. 1, pp. 31 - 42
S. Fiori (2001a). A Theory for Learning by Weight Flow on Stiefel-Grassman Manifold, Neural Computation, Vol. 13, No. 7, pp. 1625 - 1647
S. Fiori (2001b). On Blind Separation of Complex-Valued Sources by Extended Hebbian Learning, IEEE Signal Processing Letters, Vol. 8, No. 8, pp. 217 - 220
S. Fiori (2002a). Complex-Weighted One-Unit `Rigid-Bodies' Learning Rule for Independent Component Analysis, Neural Processing Letters, Vol. 15, No. 3, pp. 275 - 282
S. Fiori (2002b). A Theory for Learning Based on Rigid Bodies Dynamics, IEEE Trans. on Neural Networks, Vol. 13, No. 3, pp. 521 - 531
S. Fiori (2002c). Hybrid Independent Component Analysis by Adaptive LUT Neurons, Neural Networks, Vol. 15, No. 1, pp. 85 - 94
S. Fiori (2002d). Unsupervised Neural Learning on Lie Group, International Journal of Neural Systems. Accepted for publication
S. Fiori, F. Grimani, P. Burrascano (2002). Novel Neural Network Feature Selection Procedure by Generalization Maximization with Application to Automatic Robot Guidance, International Journal of Smart Engineering System Design, Vol. 4, No. 2, pp. 91 - 106
S. Fiori and E. Celledoni (2002). Neural Learning by Geometric Integration of Reduced `Rigid-Body' Equations. Unpublished Technical Report of the Center for Advanced Study, Oslo, Norway
S. Fiori and R. Rossi (2002). Rigid-Body Learning Theory in the 3 × 3 Network Case: Improvements by the Rodriguez Formula for the Exponential Map. Unpublished Technical Report of the University of Perugia, Italy
J. Friedman and J.W. Tukey (1974). A projection pursuit algorithm for exploratory data analysis, IEEE Trans. on Computers, Vol. C-23, No. 9, pp. 881 - 889
E. Frulloni and S. Fiori (2002). Application of Neural Independent Component Analysis to ECT/NDE Data Processing. Submitted to a Journal
X. Giannakopoulos, J. Karhunen, and E. Oja (1999).
An Experimental Comparison of Neural Algorithms for Independent Component Analysis and Blind Separation, International Journal of Neural Systems, Vol. 9, No. 2, pp. 99 - 114 M. Girolami and C. Fyfe (1997). Extraction of independent signal sources using a de ationary exploratory projection pursuit network with lateral inhibition, IEE Proceedings - Vision, Image and Signal Processing, Vol. 14, No. 5, pp. 299 - 306 R.M. Goldstein and H.A. Zebker (1987). Interferometric radar measurement of ocean surface currents, Nature, Vol. 328, pp. 707 - 709 L.C. Graham (1974). Synthetic interferometer radar for topographic mapping, Proceedings of the IEEE, Vol. 62, No. 6, pp.763 - 768 M. Gusta son (1998). Gaussian Mixture and Kernel Based Approach to Blind Source Separation Using Neural Networks, Proc. of International Conference on Arti cial Neural Networks, Vol. 2, pp. 869 - 874 E. Hairer, C. Lubich, and G. Wanner (2002). Geometric Numerical Integration, Springer Series in Computationa Mathematics, Springer K.A. Hogda, T. Guneriussen, and I. Lauknes (2002). Synthetic aperture radar for DEM generation in snow-covered mountain terrain, 2002 IEEE International Geoscience and Remote Sensing Symposium (IGARSS '02), Vol. 4, pp. 2193 - 2195 34 J. Horstmann, W. Koch, S. Lehner, and R. Tonboe (2000). Wind retrieval over the ocean using synthetic aperture radar with C-band HH polarization, IEEE Transactions on Geoscience and Remote Sensing, Vol. 38, No. 5, pp. 2122 - 2131 P.J. Huber (1985). Projection pursuit, Annals of Statistics, Vol. 13, No. 2, pp. 435 - 475 A. Hyvarinen, J. Karhunen, and E. Oja (2001). Independent Component Analysis, John Wiley & Sons A. Hyvarinen and E. Oja (1997). A fast xed-point algorithm for Independent Component Analysis, Neural Computation, Vol. 9, No. 7, pp. 1483 1492 A. Hyvarinen and E. Oja (1998). Independent Component Analysis by General Non-Linear Hebbian-Like Rules, Signal Processing, Vol. 64, No. 3, pp. 301 - 313 A. Hyvarinen and E. Oja (2000). 
Independent component analysis: algorithms and applications, Neural Networks, 13 , 411 - 430 J.K. Jao (2001). Theory of synthetic aperture radar imaging of a moving target, IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 9, pp. 1984 - 1992 C. Jutten and J. Herault (1988). Independent Component Analysis Versus Principal Component Analysis, Proc. of European Symposium on Signal Processing (EUSIPCO), Vol. 2, pp. 643 - 646 C. Jutten and J. Herault (1991). Blind Separation of Sources, Part I: An Adaptive Algorithm Based on Neuromimetic Architecture, Signal Processing, Vol. 24, pp. 1 - 10 A.M. Kagan, Y.V. Linnik, and C.R. Rao (1973). Characterization problems in mathematical statistics. J. Wiley. J. Karhunen, A. Hyvarinen, R. Vigario, J. Hurri, and E. Oja (1997). Applications of Neural Blind Separation to Signal and Image processing. Proc. International Conference on Acoustics, Speech and Signal Processing (IEEEICASSP), Vol. 1, pp. 131 - 134 J. Karhunen, E. Oja, L. Wang, R. Vigario, and J. Joutsensalo (1997). A class of neural networks for independent component analysis, IEEE Trans. on Neural Networks, Vol. 8, No. 3, pp. 486 - 504 K.H. Knuth (1998). Bayesian source separation and localization, In A. 35 Mohammad-Djafari (Ed.), SPIE'98 Proceedings: Bayesian Inference for Inverse Problems, San Diego, pp. 147 - 158 D. Kundur and D. Hatzinakos (1996). Blind image deconvolution. IEEE Signal Processing Magazine, pp. 43 - 64. T.-W. Lee (1998). Independent Component Analysis - Theory and Practice, Kluwer Academic Publisher J.J. Legarsky, S.P. Gogineni, and T.L. Akins (2001). Focused synthetic aperture radar processing of ice-sounder data collected over the Greenland ice sheet, IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 10, pp. 2109 - 2117 I.-I. Lin, W. Alpers, V. Khoo, H. Lim, T.K. Lim, and D. Kasilingam (2001). 
An ERS-1 synthetic aperture radar image of a tropical squall line compared with weather radar data, IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 5, pp. 937 - 945 R.-w. Liu (1996). Blind Signal Processing: An Introduction, Proc. of International Symposium on Circuits and Systems (IEEE-ISCAS), Vol. 2, pp. 81 - 84 G. Longo, C. Donalek, G. Raiconi, A. Staiano, R. Tagliaferri, S. Sessa, F. Pasian, R. Smareglia, and A. Volpicelli (2002). Data Mining of Large Astronomical Databases with Neural Tools, Proceeding of SPIE Meeting Astronomical Telescopes and Instrumentation. In press G. Longo, R. Tagliaferri, S. Sessa, P. Ortiz, M. Capaccioli, A. Ciaramella, C. Donalek, G. Raiconi, A. Staiano, and A. Volpicelli (2001). Advanced Data Mining Tools for Exploring Large Astronomical Databases, Proceeding of SPIE Meeting N. 4477 on Astronomical Data Mining, SPIE, pp. 61 - 75 Y. Lou (2002). Review of the NASA/JPL airborne synthetic aperture radar system, 2002 IEEE International Geoscience and Remote Sensing Symposium (IGARSS '02), Vol. 3, pp. 1702 - 1704 D. Massonnet (1993). Validation of ERS1 Interferometry at CNES, Proceedings of the second ERS-1 symposium, Vol. 2, pp. 703 - 709, October 11-14, Hamburg, Germany J.P. Nadal, N. Brunel, and N. Parga (1998). Nonlinear Feedforward Networks with Stochastic Inputs: Infomax Implies Redundancy Reduction, Network: Computation in Neural Systems, Vol. 9, No. 2 S. Odewahn, E. Stockwell, R. Pennington, L.Q. Humphreys, and W. Zu36 mach. Automated star/galaxy discrimination with neural networks, Astronomical Journal, Vol. 103(1), pp. 318 - 331 E. Oja (1997). The nonlinear PCA learning rule in independent component analysis, Neurocomputing, Vol. 17, pp. 25 - 45 L. Parra, K.-R. Mueller, C. Spence, A. Ziehe, and P. Sajda (2000). Unmixing Hyperspectral Data, Advances in Neural Information Processing Systems (NIPS*12), pp. 942 - 948 B.A. Pearlmutter and L.C. Parra (1996). 
Maximum Likelihood Blind Source Separation: A Context-Sensitive Generalization of ICA, Proc. of Neural Information Processing System (NIPS*9), M.M. Mozer, M.I. Jordan, and T. Petsche Eds., pp. 613 - 619 D. Pham, P. Garrat, and C. Jutten (1992). Separation of a Mixture of Independent Sources Through a Maximum Likelihood Approach, Proc. of European Signal Processing Conference, pp. 771 - 774 A. Prieto and B. Puntonet (1998). A neural learning algorithm for blind separation of sources based on geometric properties, Signal Processing, Vol. 64, No. 3, pp. 315 - 331 C.G. Puntonet and B. Prieto (1998). Neural net approach for blind separation of sources based on geometric properties, Neurocomputing, Vol. 18, No. 1-3, pp. 141 - 164 S.J. Roberts (1998). Independent Component Analysis: Source Assessment & Separation, a Bayesian Approach, IEE Proceedings - Vision, Image and Signal Processing, Vol. 145, No. 3, pp. 149 - 154 A.E.E. Rogers, and R.P. Ingalls (1969). Venus: Mapping the surface re ectivity by radar interferometry, Science, Vol. 165, pp. 797 - 799 Z. Roth and Y. Baram (1996). Multidimensional Density Shaping by Sigmoids, IEEE Trans. on Neural Networks, Vol. 7, No. 5, pp. 1291 - 1298 B. Sagi, S.C. Nemat-Nasser, R. Kerr, R. Hayek, C. Downing, and R. HechtNielsen (2001). A Biologically Motivated Solution to the Cocktail Party Problem, Neural Computation, Vol. 13, No. 7, 1575 { 1602, July 2001 J. Schulz-Stellen eth, J. Horstmann, S. Lehner, and W. Rosenthal (2001). Sea surface imaging with an across-track interferometric synthetic aperture radar: The SINEWAVE experiment, IEEE Transactions on Geoscience and Remote Sensing, Vol. 39, No. 9, pp. 2017 - 2028 E. Seifritz, F. Esposito, F. Hennel, H. Mustovic, J.G. Neuho , D. Bilecen, 37 G. Tedeschi, K. Scheer, and F. Di Salle (2002). Spatiotemporal Pattern of Neural Processing in the Human Auditory Cortex, Science, Vol. 297, pp. 1706 - 1708 R. Tagliaferri, A. Ciaramella, L. Milano, F. Barone, G. Longo (1999). 
Spectral Analysis of Stellar Light Curves by Means of Neural Networks, Astronomy and Astrophysics Supplement Series, vol. 137, pp. 391 - 405 R. Tagliaferri, N. Pelosi, A. Ciaramella, G. Longo, F. Barone, and M. Milano (2001). Soft Computing Methodologies for Spectral Analysis in Cyclostratigraphy, Computers and Geosciences, Vol. 27, pp. 535 - 548 A. Taleb and C. Jutten (1997). Entropy Optimization - Application to Source Separation, Proc. of International Conference on Arti cial Neural Networks, pp. 529 - 534 R. Thawonmas, A. Cichocki, and S.-i. Amari (1998). A Cascade Neural Network for Blind Signal Extraction without Spurious Equilibria, IEICE Trans. Fundamentals, Vol. E81-A, No. 9, pp. 1833 - 1846 M. Welling and M. Weber (2001). A Constrained E.M. Algorithm for Independent Component Analysis, Neural Computation, Vol. 13, No. 3, pp. 677 689 L. Xu, C.C. Cheung, H.H. Yang, and S.-i. Amari (1998). Independent Component Analysis by the Information-Theoretic approach with Mixture of Densities, Proc. International Joint Conference on Neural Networks, (IEEEIJCNN), pp. 1821 - 1826 L. Xu, C.C. Cheung, and S.-i. Amari (1998). Learned Parametric Mixture Based ICA Algorithm, Neurocomputing (Special issue on Independence and Arti cial Neural Networks), Vol. 22, No. 1 - 3, pp. 69 - 80 H.H. Yang and S.-i. Amari (1997). Adaptive Online Learning Algorithms for Blind Separation: Maximum Entropy and Minimum Mutual Information, Neural Computation, Vol. 9, pp. 1457 - 1482 M. Yoshioka and S. Omatu, Signal Separation Method Using Genetic Algorithm, Proc. International Joint Conference on Neural Networks (IJCNN'98), pp. 909 - 912 L. Zhong, C. Wicks, J. Power, D. Dzurisin, W. Thatcher, and T. Masterlark (2002). Interferometric synthetic aperture radar studies of alaska volcanoes, 2002 IEEE International Geoscience and Remote Sensing Symposium (IGARSS '02), Vol. 1, pp. 191 - 194 38