Principal component analysis for special types of data

IT Jolliffe - 2002 - Springer
The viewpoint taken in much of this text is that PCA is mainly a descriptive tool with no need for rigorous distributional or model assumptions. This implies that it can be used on a wide range of data, which can diverge considerably from the ‘ideal’ of multivariate normality. There are, however, certain types of data where some modification or special care is desirable when performing PCA. Some instances of this have been encountered already, for example in Chapter 9 where the data are grouped either by observations or by variables, and in Chapter 12 where observations are non-independent. The present chapter describes a number of other special types of data for which standard PCA should be modified in some way, or where related techniques may be relevant. Section 13.1 looks at a number of ideas involving PCA for discrete data. In particular, correspondence analysis, which was introduced as a graphical technique in Section 5.4, is discussed further, and procedures for dealing with data given as ranks are also described. When data consist of measurements on animals or plants it is sometimes of interest to identify ‘components’ of variation that quantify size and various aspects of shape. Section 13.2 examines modifications of PCA that attempt to find such components.
In Section 13.3, compositional data in which the p elements of x are constrained to sum to the same constant (usually 1 or 100) for all observations are discussed, and in Section 13.4 the rôle of PCA in analysing data from designed experiments is described.
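As a small illustration of why the sum constraint on compositional data calls for special care, the following sketch (not taken from the text) runs an ordinary covariance-based PCA on simulated compositions. The data, the sample size `n`, the number of parts `p`, and all variable names are assumptions made purely for illustration. Because every row sums to the same constant, the covariance matrix is singular, so the last principal component has essentially zero variance and its coefficient vector is proportional to a vector of ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical compositional data: n observations on p parts,
# each row rescaled so that its elements sum to 1 (assumed example data).
n, p = 50, 4
raw = rng.gamma(shape=2.0, scale=1.0, size=(n, p))
X = raw / raw.sum(axis=1, keepdims=True)

# Ordinary covariance-based PCA via an eigendecomposition.
S = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)  # eigenvalues in ascending order

# The smallest eigenvalue and its eigenvector reflect the sum constraint.
smallest_variance = eigvals[0]
smallest_vector = eigvecs[:, 0]

print("row sums equal 1:", np.allclose(X.sum(axis=1), 1.0))
print("smallest PC variance:", smallest_variance)
print("its coefficients:", smallest_vector)
```

The near-zero final eigenvalue is a direct consequence of the constraint rather than a feature of the particular simulated data, which is one reason standard PCA may need modification for compositions.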