SIXTH EDITION
Applied Multivariate
Statistical Analysis
RICHARD A. JOHNSON
University of Wisconsin—Madison
DEAN W. WICHERN
Texas A&M University
Upper Saddle River, New Jersey 07458
The preceding descriptions offer glimpses into the use of multivariate methods
in widely diverse fields.
1.3 The Organization of Data
Throughout this text, we are going to be concerned with analyzing measurements
made on several variables or characteristics. These measurements (commonly called
data) must frequently be arranged and displayed in various ways. For example,
graphs and tabular arrangements are important aids in data analysis. Summary numbers, which quantitatively portray certain features of the data, are also necessary to
any description.
We now introduce the preliminary concepts underlying these first steps of data
organization.
Arrays
Multivariate data arise whenever an investigator, seeking to understand a social or
physical phenomenon, selects a number $p \ge 1$ of variables or characters to record.
The values of these variables are all recorded for each distinct item, individual, or
experimental unit.

We will use the notation $x_{jk}$ to indicate the particular value of the $k$th variable
that is observed on the $j$th item, or trial. That is,

$$
x_{jk} = \text{measurement of the } k\text{th variable on the } j\text{th item}
$$
Consequently, measurements on p variables can be displayed as follows:
$$
\begin{array}{lcccccc}
 & \text{Variable 1} & \text{Variable 2} & \cdots & \text{Variable } k & \cdots & \text{Variable } p \\
\text{Item 1:} & x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
\text{Item 2:} & x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
\text{Item } j\text{:} & x_{j1} & x_{j2} & \cdots & x_{jk} & \cdots & x_{jp} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
\text{Item } n\text{:} & x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{array}
$$
Or we can display these data as a rectangular array, called $\mathbf{X}$, of $n$ rows and $p$
columns:

$$
\mathbf{X} =
\begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{bmatrix}
$$

The array $\mathbf{X}$, then, contains the data consisting of all of the observations on all of
the variables.
Thus,
$$
\mathbf{A}^{-1} = \sum_{i=1}^{k} \frac{1}{\lambda_i}\, \mathbf{e}_i \mathbf{e}_i' = \mathbf{P}\boldsymbol{\Lambda}^{-1}\mathbf{P}' \tag{2-21}
$$

since $(\mathbf{P}\boldsymbol{\Lambda}^{-1}\mathbf{P}')\mathbf{P}\boldsymbol{\Lambda}\mathbf{P}' = \mathbf{P}\boldsymbol{\Lambda}\mathbf{P}'(\mathbf{P}\boldsymbol{\Lambda}^{-1}\mathbf{P}') = \mathbf{P}\mathbf{P}' = \mathbf{I}$.

Next, let $\boldsymbol{\Lambda}^{1/2}$ denote the diagonal matrix with $\sqrt{\lambda_i}$ as the $i$th diagonal element.

The matrix $\sum_{i=1}^{k} \sqrt{\lambda_i}\, \mathbf{e}_i \mathbf{e}_i' = \mathbf{P}\boldsymbol{\Lambda}^{1/2}\mathbf{P}'$ is called the square root of $\mathbf{A}$ and is denoted by
$\mathbf{A}^{1/2}$.
The square-root matrix of a positive definite matrix $\mathbf{A}$,

$$
\mathbf{A}^{1/2} = \sum_{i=1}^{k} \sqrt{\lambda_i}\, \mathbf{e}_i \mathbf{e}_i' = \mathbf{P}\boldsymbol{\Lambda}^{1/2}\mathbf{P}' \tag{2-22}
$$

has the following properties:

1. $(\mathbf{A}^{1/2})' = \mathbf{A}^{1/2}$ (that is, $\mathbf{A}^{1/2}$ is symmetric).
2. $\mathbf{A}^{1/2}\mathbf{A}^{1/2} = \mathbf{A}$.
3. $(\mathbf{A}^{1/2})^{-1} = \sum_{i=1}^{k} \dfrac{1}{\sqrt{\lambda_i}}\, \mathbf{e}_i \mathbf{e}_i' = \mathbf{P}\boldsymbol{\Lambda}^{-1/2}\mathbf{P}'$, where $\boldsymbol{\Lambda}^{-1/2}$ is a diagonal matrix with $1/\sqrt{\lambda_i}$ as the $i$th diagonal element.
4. $\mathbf{A}^{1/2}\mathbf{A}^{-1/2} = \mathbf{A}^{-1/2}\mathbf{A}^{1/2} = \mathbf{I}$, and $\mathbf{A}^{-1/2}\mathbf{A}^{-1/2} = \mathbf{A}^{-1}$, where $\mathbf{A}^{-1/2} = (\mathbf{A}^{1/2})^{-1}$.
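As a computational aside, the square-root matrix and its properties can be checked numerically. The following sketch is our own, assuming NumPy; the 2 × 2 matrix A is an arbitrary positive definite example, not one from the text. It builds A^{1/2} from the spectral decomposition as in (2-22) and checks properties 1, 2, and 4.

```python
import numpy as np

def sqrt_pd(A):
    """Square root of a symmetric positive definite matrix A, built from its
    spectral decomposition A = P diag(lambda) P' as in (2-22)."""
    lam, P = np.linalg.eigh(A)                    # eigenvalues and orthogonal P
    return P @ np.diag(np.sqrt(lam)) @ P.T

A = np.array([[2.0, 1.0],                         # illustrative positive definite A
              [1.0, 2.0]])
A_half = sqrt_pd(A)
A_neg_half = np.linalg.inv(A_half)                # A^{-1/2} = (A^{1/2})^{-1}

print(np.allclose(A_half, A_half.T))              # property 1: A^{1/2} is symmetric
print(np.allclose(A_half @ A_half, A))            # property 2: A^{1/2} A^{1/2} = A
print(np.allclose(A_half @ A_neg_half, np.eye(2)))             # property 4
print(np.allclose(A_neg_half @ A_neg_half, np.linalg.inv(A)))  # A^{-1/2} A^{-1/2} = A^{-1}
```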
2.5 Random Vectors and Matrices
A random vector is a vector whose elements are random variables. Similarly, a
random matrix is a matrix whose elements are random variables. The expected value
of a random matrix (or vector) is the matrix (vector) consisting of the expected
values of each of its elements. Specifically, let $\mathbf{X} = \{X_{ij}\}$ be an $n \times p$ random
matrix. Then the expected value of $\mathbf{X}$, denoted by $E(\mathbf{X})$, is the $n \times p$ matrix of
numbers (if they exist)

$$
E(\mathbf{X}) =
\begin{bmatrix}
E(X_{11}) & E(X_{12}) & \cdots & E(X_{1p}) \\
E(X_{21}) & E(X_{22}) & \cdots & E(X_{2p}) \\
\vdots & \vdots & & \vdots \\
E(X_{n1}) & E(X_{n2}) & \cdots & E(X_{np})
\end{bmatrix} \tag{2-23}
$$
where, for each element of the matrix,$^2$

$$
E(X_{ij}) =
\begin{cases}
\displaystyle\int_{-\infty}^{\infty} x_{ij}\, f_{ij}(x_{ij})\, dx_{ij} & \text{if } X_{ij} \text{ is a continuous random variable with probability density function } f_{ij}(x_{ij}) \\[2ex]
\displaystyle\sum_{\text{all } x_{ij}} x_{ij}\, p_{ij}(x_{ij}) & \text{if } X_{ij} \text{ is a discrete random variable with probability function } p_{ij}(x_{ij})
\end{cases}
$$
Example 2.12 (Computing expected values for discrete random variables) Suppose
$p = 2$ and $n = 1$, and consider the random vector $\mathbf{X}' = [X_1, X_2]$. Let the discrete
random variable $X_1$ have the following probability function:

$$
\begin{array}{c|ccc}
x_1 & -1 & 0 & 1 \\
\hline
p_1(x_1) & .3 & .3 & .4
\end{array}
$$

Then $E(X_1) = \displaystyle\sum_{\text{all } x_1} x_1 p_1(x_1) = (-1)(.3) + (0)(.3) + (1)(.4) = .1$.

Similarly, let the discrete random variable $X_2$ have the probability function

$$
\begin{array}{c|cc}
x_2 & 0 & 1 \\
\hline
p_2(x_2) & .8 & .2
\end{array}
$$

Then $E(X_2) = \displaystyle\sum_{\text{all } x_2} x_2 p_2(x_2) = (0)(.8) + (1)(.2) = .2$.

Thus,

$$
E(\mathbf{X}) = \begin{bmatrix} E(X_1) \\ E(X_2) \end{bmatrix} = \begin{bmatrix} .1 \\ .2 \end{bmatrix}
$$
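The same expectations can be reproduced in a few lines of NumPy; the sketch below (the array names are ours) simply evaluates the discrete sums of Example 2.12.

```python
import numpy as np

# Probability functions of X1 and X2 from Example 2.12.
x1, p1 = np.array([-1, 0, 1]), np.array([.3, .3, .4])
x2, p2 = np.array([0, 1]),     np.array([.8, .2])

E_X1 = np.sum(x1 * p1)    # (-1)(.3) + (0)(.3) + (1)(.4) = 0.1
E_X2 = np.sum(x2 * p2)    # (0)(.8) + (1)(.2) = 0.2
print(E_X1, E_X2)         # E(X) = [.1, .2]'
```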
Two results involving the expectation of sums and products of matrices follow
directly from the definition of the expected value of a random matrix and the univariate
properties of expectation, $E(X_1 + Y_1) = E(X_1) + E(Y_1)$ and $E(cX_1) = cE(X_1)$.

Let $\mathbf{X}$ and $\mathbf{Y}$ be random matrices of the same dimension, and let $\mathbf{A}$ and $\mathbf{B}$ be
conformable matrices of constants. Then (see Exercise 2.40)

$$
E(\mathbf{X} + \mathbf{Y}) = E(\mathbf{X}) + E(\mathbf{Y}) \tag{2-24}
$$
$$
E(\mathbf{A}\mathbf{X}\mathbf{B}) = \mathbf{A}E(\mathbf{X})\mathbf{B}
$$
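A quick Monte Carlo check of (2-24) is sketched below. Everything in it is our own construction: the 2 × 3 mean matrices, the conformable constant matrices A and B, and the normal noise are arbitrary choices used only to illustrate that the sample averages of X + Y and AXB settle near E(X) + E(Y) and A E(X) B.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2 x 3 random matrices: X and Y are their mean matrices plus
# independent standard normal noise.  A (2 x 2) and B (3 x 3) are constants.
mu_X = np.array([[1.0, 2.0, 3.0], [0.0, -1.0, 4.0]])
mu_Y = np.array([[2.0, 0.0, 1.0], [1.0,  1.0, 1.0]])
A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = np.diag([1.0, 2.0, 0.5])

reps = 200_000
X = mu_X + rng.standard_normal((reps, 2, 3))
Y = mu_Y + rng.standard_normal((reps, 2, 3))

# Sample averages approximate the expectations in (2-24).
print(np.allclose((X + Y).mean(axis=0), mu_X + mu_Y, atol=0.05))
print(np.allclose((A @ X @ B).mean(axis=0), A @ mu_X @ B, atol=0.05))
```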
$^2$If you are unfamiliar with calculus, you should concentrate on the interpretation of the expected
value and, eventually, variance. Our development is based primarily on the properties of expectation
rather than its particular evaluation for continuous or discrete random variables.
2.6 Mean Vectors and Covariance Matrices
Suppose $\mathbf{X}' = [X_1, X_2, \ldots, X_p]$ is a $p \times 1$ random vector. Then each element of $\mathbf{X}$ is a
random variable with its own marginal probability distribution. (See Example 2.12.) The
marginal means $\mu_i$ and variances $\sigma_i^2$ are defined as $\mu_i = E(X_i)$ and $\sigma_i^2 = E(X_i - \mu_i)^2$,
$i = 1, 2, \ldots, p$, respectively. Specifically,

$$
\mu_i =
\begin{cases}
\displaystyle\int_{-\infty}^{\infty} x_i f_i(x_i)\, dx_i & \text{if } X_i \text{ is a continuous random variable with probability density function } f_i(x_i) \\[2ex]
\displaystyle\sum_{\text{all } x_i} x_i p_i(x_i) & \text{if } X_i \text{ is a discrete random variable with probability function } p_i(x_i)
\end{cases}
$$

$$
\sigma_i^2 =
\begin{cases}
\displaystyle\int_{-\infty}^{\infty} (x_i - \mu_i)^2 f_i(x_i)\, dx_i & \text{if } X_i \text{ is a continuous random variable with probability density function } f_i(x_i) \\[2ex]
\displaystyle\sum_{\text{all } x_i} (x_i - \mu_i)^2 p_i(x_i) & \text{if } X_i \text{ is a discrete random variable with probability function } p_i(x_i)
\end{cases}
\tag{2-25}
$$

It will be convenient in later sections to denote the marginal variances by $\sigma_{ii}$ rather
than the more traditional $\sigma_i^2$, and consequently, we shall adopt this notation.
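For a concrete check, the discrete case of (2-25) applied to the random variable X1 of Example 2.12 can be evaluated directly; the short sketch below (assuming NumPy) gives the marginal mean .1 and marginal variance .69, the value that reappears later as the (1, 1) entry of the covariance matrix in this section.

```python
import numpy as np

# Marginal mean and variance of the discrete X1 of Example 2.12, via (2-25).
x1, p1 = np.array([-1, 0, 1]), np.array([.3, .3, .4])

mu_1    = np.sum(x1 * p1)                   # 0.1
sigma11 = np.sum((x1 - mu_1) ** 2 * p1)     # 0.69
print(mu_1, sigma11)
```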
The behavior of any pair of random variables, such as $X_i$ and $X_k$, is described by
their joint probability function, and a measure of the linear association between
them is provided by the covariance

$$
\begin{aligned}
\sigma_{ik} &= E(X_i - \mu_i)(X_k - \mu_k) \\[1ex]
&=
\begin{cases}
\displaystyle\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x_i - \mu_i)(x_k - \mu_k)\, f_{ik}(x_i, x_k)\, dx_i\, dx_k & \text{if } X_i, X_k \text{ are continuous random variables with the joint density function } f_{ik}(x_i, x_k) \\[2ex]
\displaystyle\sum_{\text{all } x_i}\sum_{\text{all } x_k} (x_i - \mu_i)(x_k - \mu_k)\, p_{ik}(x_i, x_k) & \text{if } X_i, X_k \text{ are discrete random variables with joint probability function } p_{ik}(x_i, x_k)
\end{cases}
\end{aligned}
\tag{2-26}
$$

and $\mu_i$ and $\mu_k$, $i, k = 1, 2, \ldots, p$, are the marginal means. When $i = k$, the covariance
becomes the marginal variance.
More generally, the collective behavior of the $p$ random variables $X_1, X_2, \ldots, X_p$
or, equivalently, the random vector $\mathbf{X}' = [X_1, X_2, \ldots, X_p]$, is described by a joint
probability density function $f(x_1, x_2, \ldots, x_p) = f(\mathbf{x})$. As we have already noted in
this book, $f(\mathbf{x})$ will often be the multivariate normal density function. (See Chapter 4.)

If the joint probability $P[X_i \le x_i \text{ and } X_k \le x_k]$ can be written as the product of
the corresponding marginal probabilities, so that

$$
P[X_i \le x_i \text{ and } X_k \le x_k] = P[X_i \le x_i]\, P[X_k \le x_k] \tag{2-27}
$$

for all pairs of values $x_i, x_k$, then $X_i$ and $X_k$ are said to be statistically independent.
When $X_i$ and $X_k$ are continuous random variables with joint density $f_{ik}(x_i, x_k)$ and
marginal densities $f_i(x_i)$ and $f_k(x_k)$, the independence condition becomes

$$
f_{ik}(x_i, x_k) = f_i(x_i) f_k(x_k)
$$

for all pairs $(x_i, x_k)$.

The $p$ continuous random variables $X_1, X_2, \ldots, X_p$ are mutually statistically
independent if their joint density can be factored as

$$
f_{12 \cdots p}(x_1, x_2, \ldots, x_p) = f_1(x_1) f_2(x_2) \cdots f_p(x_p) \tag{2-28}
$$

for all $p$-tuples $(x_1, x_2, \ldots, x_p)$.
Statistical independence has an important implication for covariance. The
factorization in (2-28) implies that $\operatorname{Cov}(X_i, X_k) = 0$. Thus,

$$
\operatorname{Cov}(X_i, X_k) = 0 \quad \text{if } X_i \text{ and } X_k \text{ are independent} \tag{2-29}
$$

The converse of (2-29) is not true in general; there are situations where
$\operatorname{Cov}(X_i, X_k) = 0$, but $X_i$ and $X_k$ are not independent. (See [5].)
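A standard illustration of this point, not taken from the present text, is sketched below: X is uniform on {−1, 0, 1} and Y = X², so the two variables are clearly dependent, yet the covariance computed from the discrete case of (2-26) is zero.

```python
import numpy as np

# Illustration (our own, not from the text): X uniform on {-1, 0, 1} and
# Y = X^2 are dependent, yet Cov(X, Y) = 0.
x  = np.array([-1, 0, 1])
px = np.array([1/3, 1/3, 1/3])
y  = x ** 2                                  # Y is completely determined by X

mu_x = np.sum(x * px)                        # 0
mu_y = np.sum(y * px)                        # 2/3
cov  = np.sum((x - mu_x) * (y - mu_y) * px)  # discrete case of (2-26): 0.0
print(cov)

# Yet P[X = 1 and Y = 1] = 1/3, while P[X = 1] P[Y = 1] = (1/3)(2/3) = 2/9,
# so the factorization required for independence fails.
```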
The means and covariances of the $p \times 1$ random vector $\mathbf{X}$ can be set out as
matrices. The expected value of each element is contained in the vector of means
$\boldsymbol{\mu} = E(\mathbf{X})$, and the $p$ variances $\sigma_{ii}$ and the $p(p-1)/2$ distinct covariances
$\sigma_{ik}$ ($i < k$) are contained in the variance-covariance matrix $\boldsymbol{\Sigma}$ of (2-32) below.

$$
\begin{aligned}
\sigma_{12} &= E(X_1 - \mu_1)(X_2 - \mu_2) = \sum_{\text{all pairs } (x_1, x_2)} (x_1 - .1)(x_2 - .2)\, p_{12}(x_1, x_2) \\
&= (-1 - .1)(0 - .2)(.24) + (-1 - .1)(1 - .2)(.06) \\
&\quad + \cdots + (1 - .1)(1 - .2)(.00) = -.08
\end{aligned}
$$

$$
\sigma_{21} = E(X_2 - \mu_2)(X_1 - \mu_1) = E(X_1 - \mu_1)(X_2 - \mu_2) = \sigma_{12} = -.08
$$
Consequently, with $\mathbf{X}' = [X_1, X_2]$,

$$
\boldsymbol{\mu} = E(\mathbf{X}) = \begin{bmatrix} E(X_1) \\ E(X_2) \end{bmatrix} = \begin{bmatrix} .1 \\ .2 \end{bmatrix}
$$

and

$$
\begin{aligned}
\boldsymbol{\Sigma} &= E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' \\
&= E\begin{bmatrix} (X_1 - \mu_1)^2 & (X_1 - \mu_1)(X_2 - \mu_2) \\ (X_2 - \mu_2)(X_1 - \mu_1) & (X_2 - \mu_2)^2 \end{bmatrix} \\
&= \begin{bmatrix} E(X_1 - \mu_1)^2 & E(X_1 - \mu_1)(X_2 - \mu_2) \\ E(X_2 - \mu_2)(X_1 - \mu_1) & E(X_2 - \mu_2)^2 \end{bmatrix} \\
&= \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix}
 = \begin{bmatrix} .69 & -.08 \\ -.08 & .16 \end{bmatrix}
\end{aligned}
$$
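The hand computation above can be verified numerically. In the sketch below (assuming NumPy), the joint probability table p_12(x1, x2) uses the entries visible in the calculation above (.24, .06, .00) together with the two remaining entries forced by the marginal probabilities of Example 2.12; the resulting mean vector and covariance matrix match μ = [.1, .2]' and Σ above.

```python
import numpy as np

# Joint probability table p_12(x1, x2): rows index x1 in {-1, 0, 1}, columns
# index x2 in {0, 1}.  The entries .16 and .14 are the ones implied by the
# marginals p_1 = (.3, .3, .4) and p_2 = (.8, .2) of Example 2.12.
x1 = np.array([-1, 0, 1])
x2 = np.array([0, 1])
p12 = np.array([[.24, .06],
                [.16, .14],
                [.40, .00]])

p1, p2 = p12.sum(axis=1), p12.sum(axis=0)           # marginal probabilities
mu1, mu2 = np.sum(x1 * p1), np.sum(x2 * p2)         # .1 and .2

sig11 = np.sum((x1 - mu1) ** 2 * p1)                # .69
sig22 = np.sum((x2 - mu2) ** 2 * p2)                # .16
sig12 = np.sum(np.outer(x1 - mu1, x2 - mu2) * p12)  # -.08

print(np.array([[sig11, sig12],
                [sig12, sig22]]))
```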
We note that the computation of means, variances, and covariances for discrete
random variables involves summation (as in Examples 2.12 and 2.13), while analogous computations for continuous random variables involve integration.
Because $\sigma_{ik} = E(X_i - \mu_i)(X_k - \mu_k) = \sigma_{ki}$, it is convenient to write the
matrix appearing in (2-31) as

$$
\boldsymbol{\Sigma} = E(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' =
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{12} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots & \vdots & & \vdots \\
\sigma_{1p} & \sigma_{2p} & \cdots & \sigma_{pp}
\end{bmatrix} \tag{2-32}
$$
We shall refer to $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ as the population mean (vector) and population
variance-covariance (matrix), respectively.

The multivariate normal distribution is completely specified once the mean
vector $\boldsymbol{\mu}$ and variance-covariance matrix $\boldsymbol{\Sigma}$ are given (see Chapter 4), so it is not
surprising that these quantities play an important role in many multivariate
procedures.
It is frequently informative to separate the information contained in variances
$\sigma_{ii}$ from that contained in measures of association and, in particular, the
measure of association known as the population correlation coefficient $\rho_{ik}$. The
correlation coefficient $\rho_{ik}$ is defined in terms of the covariance $\sigma_{ik}$ and variances
$\sigma_{ii}$ and $\sigma_{kk}$ as

$$
\rho_{ik} = \frac{\sigma_{ik}}{\sqrt{\sigma_{ii}}\sqrt{\sigma_{kk}}} \tag{2-33}
$$

The correlation coefficient measures the amount of linear association between the
random variables $X_i$ and $X_k$. (See, for example, [5].)
Let the population correlation matrix be the $p \times p$ symmetric matrix

$$
\boldsymbol{\rho} =
\begin{bmatrix}
\dfrac{\sigma_{11}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{11}}} & \dfrac{\sigma_{12}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}} & \cdots & \dfrac{\sigma_{1p}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{pp}}} \\[2ex]
\dfrac{\sigma_{12}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{22}}} & \dfrac{\sigma_{22}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{22}}} & \cdots & \dfrac{\sigma_{2p}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{pp}}} \\[1ex]
\vdots & \vdots & & \vdots \\[1ex]
\dfrac{\sigma_{1p}}{\sqrt{\sigma_{11}}\sqrt{\sigma_{pp}}} & \dfrac{\sigma_{2p}}{\sqrt{\sigma_{22}}\sqrt{\sigma_{pp}}} & \cdots & \dfrac{\sigma_{pp}}{\sqrt{\sigma_{pp}}\sqrt{\sigma_{pp}}}
\end{bmatrix}
=
\begin{bmatrix}
1 & \rho_{12} & \cdots & \rho_{1p} \\
\rho_{12} & 1 & \cdots & \rho_{2p} \\
\vdots & \vdots & & \vdots \\
\rho_{1p} & \rho_{2p} & \cdots & 1
\end{bmatrix} \tag{2-34}
$$

and let the $p \times p$ standard deviation matrix be

$$
\mathbf{V}^{1/2} =
\begin{bmatrix}
\sqrt{\sigma_{11}} & 0 & \cdots & 0 \\
0 & \sqrt{\sigma_{22}} & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \sqrt{\sigma_{pp}}
\end{bmatrix} \tag{2-35}
$$

Then it is easily verified (see Exercise 2.23) that

$$
\mathbf{V}^{1/2}\,\boldsymbol{\rho}\,\mathbf{V}^{1/2} = \boldsymbol{\Sigma} \tag{2-36}
$$

and

$$
\boldsymbol{\rho} = (\mathbf{V}^{1/2})^{-1}\,\boldsymbol{\Sigma}\,(\mathbf{V}^{1/2})^{-1} \tag{2-37}
$$

That is, $\boldsymbol{\Sigma}$ can be obtained from $\mathbf{V}^{1/2}$ and $\boldsymbol{\rho}$, whereas $\boldsymbol{\rho}$ can be obtained from $\boldsymbol{\Sigma}$.
Moreover, the expression of these relationships in terms of matrix operations allows
the calculations to be conveniently implemented on a computer.
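A minimal sketch of that computation, assuming NumPy (the function name corr_from_cov is ours): it forms (V^{1/2})^{-1} from the diagonal of Σ and applies (2-37). Run on the covariance matrix of Example 2.14 below, it reproduces the correlation matrix obtained there.

```python
import numpy as np

def corr_from_cov(Sigma):
    """Population correlation matrix rho = (V^{1/2})^{-1} Sigma (V^{1/2})^{-1},
    as in (2-37)."""
    v_inv = np.diag(1.0 / np.sqrt(np.diag(Sigma)))   # (V^{1/2})^{-1}
    return v_inv @ Sigma @ v_inv

# Covariance matrix of Example 2.14; the result has 1/6, 1/5, and -1/5
# as its off-diagonal correlations.
Sigma = np.array([[4.0,  1.0,  2.0],
                  [1.0,  9.0, -3.0],
                  [2.0, -3.0, 25.0]])
print(corr_from_cov(Sigma))
```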
Example 2.14 (Computing the correlation matrix from the covariance matrix)
Suppose

$$
\boldsymbol{\Sigma} =
\begin{bmatrix}
4 & 1 & 2 \\
1 & 9 & -3 \\
2 & -3 & 25
\end{bmatrix}
=
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \sigma_{13} \\
\sigma_{12} & \sigma_{22} & \sigma_{23} \\
\sigma_{13} & \sigma_{23} & \sigma_{33}
\end{bmatrix}
$$

Obtain $\mathbf{V}^{1/2}$ and $\boldsymbol{\rho}$.
Here

$$
\mathbf{V}^{1/2} =
\begin{bmatrix}
\sqrt{\sigma_{11}} & 0 & 0 \\
0 & \sqrt{\sigma_{22}} & 0 \\
0 & 0 & \sqrt{\sigma_{33}}
\end{bmatrix}
=
\begin{bmatrix}
2 & 0 & 0 \\
0 & 3 & 0 \\
0 & 0 & 5
\end{bmatrix}
$$

and

$$
(\mathbf{V}^{1/2})^{-1} =
\begin{bmatrix}
\frac{1}{2} & 0 & 0 \\
0 & \frac{1}{3} & 0 \\
0 & 0 & \frac{1}{5}
\end{bmatrix}
$$

Consequently, from (2-37), the correlation matrix $\boldsymbol{\rho}$ is given by

$$
(\mathbf{V}^{1/2})^{-1}\,\boldsymbol{\Sigma}\,(\mathbf{V}^{1/2})^{-1} =
\begin{bmatrix}
\frac{1}{2} & 0 & 0 \\
0 & \frac{1}{3} & 0 \\
0 & 0 & \frac{1}{5}
\end{bmatrix}
\begin{bmatrix}
4 & 1 & 2 \\
1 & 9 & -3 \\
2 & -3 & 25
\end{bmatrix}
\begin{bmatrix}
\frac{1}{2} & 0 & 0 \\
0 & \frac{1}{3} & 0 \\
0 & 0 & \frac{1}{5}
\end{bmatrix}
=
\begin{bmatrix}
1 & \frac{1}{6} & \frac{1}{5} \\[0.5ex]
\frac{1}{6} & 1 & -\frac{1}{5} \\[0.5ex]
\frac{1}{5} & -\frac{1}{5} & 1
\end{bmatrix}
$$
Partitioning the Covariance Matrix
Often, the characteristics measured on individual trials will fall naturally into two
or more groups. As examples, consider measurements of variables representing
consumption and income or variables representing personality traits and physical
characteristics. One approach to handling these situations is to let the characteristics
defining the distinct groups be subsets of the total collection of characteristics. If the
total collection is represented by a ($p \times 1$)-dimensional random vector $\mathbf{X}$, the subsets
can be regarded as components of $\mathbf{X}$ and can be sorted by partitioning $\mathbf{X}$.

In general, we can partition the $p$ characteristics contained in the $p \times 1$ random
vector $\mathbf{X}$ into, for instance, two groups of size $q$ and $p - q$, respectively. For example,
we can write