On decisions and information concerning an unknown parameter

James Korsh

INFORMATION AND CONTROL 16, 123-127 (1970)

On Decisions and Information Concerning an Unknown Parameter

JAMES F. KORSH

University of Pennsylvania, The Moore School of Electrical Engineering, Philadelphia, Pennsylvania 19104

Let $X_1, X_2, \ldots$ be a sequence of random variables whose finite dimensional distributions depend on a random variable $\theta$. We study the error probability and equivocation of specific decision functions which are used to decide on $\theta$ based on a sequence of $n$ observations of the $\{X_n\}$ process. In particular, we show that if the process is ergodic for each value of $\theta$, the error probability and equivocation go to zero as $n$ goes to infinity. If the $\{X_n\}$ process is a Markov chain with distinct state behavior for each value of $\theta$, then they approach zero exponentially.

INTRODUCTION

Let $X_1, X_2, \ldots$ be a sequence of random variables. Suppose their finite dimensional distribution functions depend on a parameter $\theta$ which is also a random variable. In this paper it is assumed that $\theta$ may take one of $M$ values, say $v_1, v_2, \ldots, v_M$, while the $X_n$ have finite discrete distributions. Let $P_j(\xi_n)$ denote the conditional distribution function of $\xi_n = (X_1, X_2, \ldots, X_n)$ given that $\theta = v_j$, and assume further that $P_j(\xi_1)$ is not identical with $P_h(\xi_1)$ for $h \ne j$. Define

$$H(\theta \mid \xi_n) = -\sum_{j=1}^{M} P(v_j \mid \xi_n) \log P(v_j \mid \xi_n),$$

the average uncertainty about $\theta$ given the sample $\xi_n$. The equivocation about $\theta$ is then $E[H(\theta \mid \xi_n)]$, where $E$ denotes the expectation of the random variable $H(\theta \mid \xi_n)$.

We assume specific decision schemes are used to decide on the value of $\theta$ after observing $\xi_n$, and investigate the behavior of the error probability and equivocation with $n$. In particular, we show that if the $\{X_n\}$ process is ergodic for each value of $\theta$, then the error probability and equivocation go to zero as $n$ approaches infinity. If the $\{X_n\}$ process is a Markov chain with distinct state behavior for each value of $\theta$, then they approach zero exponentially with $n$.
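To make the definition of the equivocation concrete, the following Python sketch evaluates $E[H(\theta \mid \xi_n)]$ exactly in the simplest setting, where the $X_n$ are independent and identically distributed Bernoulli variables given $\theta$. The priors, the success probabilities, and the function name `equivocation` are illustrative assumptions and do not appear in the paper.

```python
from math import comb, log

def equivocation(n, priors, probs):
    """Exact E[H(theta | xi_n)] in nats for i.i.d. Bernoulli observations.

    priors[j] is P(theta = v_j); probs[j] is P(X = 1 | theta = v_j).
    Both are hypothetical inputs chosen for illustration.
    """
    total = 0.0
    for k in range(n + 1):
        # Joint probability of theta = v_j and one specific sequence with k ones.
        joint = [pi * p**k * (1 - p)**(n - k) for pi, p in zip(priors, probs)]
        marginal = sum(joint)
        if marginal == 0.0:
            continue
        # H(theta | xi_n) for that sequence, using the posterior joint / marginal.
        entropy = -sum((w / marginal) * log(w / marginal) for w in joint if w > 0.0)
        # comb(n, k) sequences share the same posterior.
        total += comb(n, k) * marginal * entropy
    return total

priors = [0.5, 0.5]   # assumed prior on theta
probs = [0.3, 0.6]    # assumed P(X = 1 | theta = v_j)
values = [equivocation(n, priors, probs) for n in range(0, 41, 10)]
```

With these assumed numbers, `values` starts at $\log 2$ (the prior uncertainty, in nats) and shrinks as $n$ grows.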
The exponential result generalizes that of Rényi for the case where the $X_n$ are independent and identically distributed given the value of $\theta$. These results imply that the Bayes risk under a bounded loss function must approach 0 as $n \to \infty$ in the ergodic case, with this convergence being exponential in the Markov chain case.

General Case

By the lemma on page 16 of Feinstein, $E[H(\theta \mid \xi_{n+1})] \le E[H(\theta \mid \xi_n)]$. Also, since $\theta$ may assume only a fixed finite number of values, $H(\theta \mid \xi_n)$ is bounded for all $n$ and thus $E[H(\theta \mid \xi_n)]$ will be too. Consequently, the sequence of equivocations $\{E[H(\theta \mid \xi_n)]\}$ must converge as $n$ approaches infinity. However, it is not difficult to construct examples for which the limit is not zero when the only assumption is that the $\{X_n\}$ process is stationary under the condition that $\theta = v_j$ for $1 \le j \le M$.

Ergodic Case

Let $Y_k = [P_j(X_k)/P_h(X_k)]^{\alpha}$ for $0 < \alpha < 1$. Assume that $\theta = v_h$, $h \ne j$, and that the $\{X_n\}$ process is ergodic under the condition that $\theta = v_h$. By the ergodic theorem, $\frac{1}{n}\sum_{k=1}^{n} Y_k$ converges to $E_h[Y_1]$ with $P_h$ probability one. Here $E_h$ denotes conditional expectation given $\theta = v_h$. Now

$$E_h[Y_1] = \sum_{y} P_j^{\alpha}(y)\, P_h^{1-\alpha}(y) = \lambda_\alpha,$$

which is a convex function of $\alpha$ for $0 \le \alpha \le 1$. At $\alpha = 0$ and $\alpha = 1$ it has the value 1 and, thus, assumes a minimum value with respect to $\alpha$ that is less than one. Actually, $\lambda_\alpha < 1$ for all $0 < \alpha < 1$ by Hölder's inequality. Thus, when $\theta = v_h$, $\frac{1}{n}\sum_{k=1}^{n} Y_k$ approaches $\lambda_\alpha < 1$ almost surely with respect to $P_h$ for $j \ne h$.

Consider the following decision scheme: decide $\theta = v_h$ if

(a) $\frac{1}{n}\sum_{k=1}^{n} [P_j(X_k)/P_h(X_k)]^{\alpha} < 1$ for all $j \ne h$, and

(b) the $h$ of condition (a) is unique.

Let $A$ and $B$ denote the events that conditions (a) and (b) hold. This scheme yields a correct decision unless $A$ and $B$ do not both occur when $\theta = v_h$. Thus,

$$P(\text{error} \mid \theta = v_h) \le P(A^c \text{ or } B^c \mid \theta = v_h) \le P(A^c \mid \theta = v_h) + P(B^c \mid \theta = v_h).$$

We have seen above that $P(A^c \mid \theta = v_h)$ approaches 0 as $n \to \infty$. Now, since $x^{-1}$ is convex for $x > 0$,

$$E_h\left\{\left[\frac{P_h(X_1)}{P_l(X_1)}\right]^{\alpha}\right\} = E_h\left\{\left(\left[\frac{P_l(X_1)}{P_h(X_1)}\right]^{\alpha}\right)^{-1}\right\} \ge \left\{E_h\left[\frac{P_l(X_1)}{P_h(X_1)}\right]^{\alpha}\right\}^{-1} = \frac{1}{\lambda_\alpha} > 1 \quad \text{for } l \ne h,$$

where $\lambda_\alpha$ is here evaluated for the pair $(l, h)$.
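Thus, again by the ergodic theorem, $\frac{1}{n}\sum_{k=1}^{n} [P_h(X_k)/P_l(X_k)]^{\alpha}$ approaches a limit which is greater than 1 almost surely with respect to $P_h$. Consequently, $P(B^c \mid \theta = v_h)$ approaches 0 as $n \to \infty$. Thus, the probability of an error under this scheme, which is

$$\sum_{h=1}^{M} P(\theta = v_h)\, P(\text{error} \mid \theta = v_h),$$

goes to 0 as $n \to \infty$. In fact, from some point on all decisions will be correct with probability one. By the theorem on p. 35 of Feinstein it follows that

$$E[H(\theta \mid \xi_n)] \le -P_e \log P_e - (1 - P_e) \log(1 - P_e) + P_e \log(M - 1),$$

where $P_e$ is the probability of error of the above scheme. But $P_e$ goes to zero as $n \to \infty$, so that the equivocation must also. We thus have the following theorem.

THEOREM 1. The above decision scheme yields a probability of error and an equivocation which approach 0 as $n$ approaches $\infty$ when the $\{X_n\}$ process is conditionally ergodic.

Markov Chain Case

Consider the following decision scheme: decide $\theta = v_h$ if $P_h(\xi_n) > P_j(\xi_n)$ for all $j \ne h$. Then the probability of an error, $P_e$, is given by

$$\sum_{h=1}^{M} P(\theta = v_h)\, P(\text{error} \mid \theta = v_h).$$

Or,

$$P_e = \sum_{h=1}^{M} P(\theta = v_h)\, P\left[\frac{P_j(\xi_n)}{P_h(\xi_n)} \ge 1 \text{ for some } j \ne h \,\Big|\, \theta = v_h\right] \le \sum_{h=1}^{M} \sum_{j \ne h} P(\theta = v_h)\, P\left[\left(\frac{P_j(\xi_n)}{P_h(\xi_n)}\right)^{\alpha} \ge 1 \,\Big|\, \theta = v_h\right]$$

for $0 < \alpha < 1$. By Markov's inequality,

$$P_e \le \sum_{h=1}^{M} P(\theta = v_h) \sum_{j \ne h} E_h\left[\left(\frac{P_j(\xi_n)}{P_h(\xi_n)}\right)^{\alpha}\right], \quad 0 < \alpha < 1.$$

Now,

$$E_h\left[\left(\frac{P_j(\xi_n)}{P_h(\xi_n)}\right)^{\alpha}\right] = \sum_{\xi_n} P_j^{\alpha}(\xi_n)\, P_h^{1-\alpha}(\xi_n) = \sum_{x_1} P_j^{\alpha}(x_1) P_h^{1-\alpha}(x_1) \sum_{x_2} P_j^{\alpha}(x_2 \mid x_1) P_h^{1-\alpha}(x_2 \mid x_1) \cdots \sum_{x_n} P_j^{\alpha}(x_n \mid x_{n-1}) P_h^{1-\alpha}(x_n \mid x_{n-1}) = a\, [C_{jh}(\alpha)]^{n-1} B_{jh}(\alpha),$$

with $a$ the row vector of all 1's, under the assumption that the $\{X_n\}$ process is a Markov chain under the condition that $\theta = v_h$.¹ Here $B_{jh}(\alpha)$ is the column vector with entries $P_j^{\alpha}(X_1) P_h^{1-\alpha}(X_1)$ and $C_{jh}(\alpha)$ is the matrix whose $(i, k)$th entry is $P_j^{\alpha}(x_i \mid x_k)\, P_h^{1-\alpha}(x_i \mid x_k)$, where $x_i$ and $x_k$ range over the possible values of $X_n$.

Suppose that, for each $x$ and each $j \ne h$, $P_j(y \mid x)$ is not identical with $P_h(y \mid x)$ as a function of $y$. Then

$$\max_{x,\, j \ne h} \sum_{y} P_j^{\alpha}(y \mid x)\, P_h^{1-\alpha}(y \mid x) < 1 \quad \text{for } 0 < \alpha < 1.$$

Consequently, $q = \max_{j \ne h} \min_{0 < \alpha < 1} \lambda_{jh}(\alpha)$ is less than 1, where $\lambda_{jh}(\alpha)$ is the largest eigenvalue of $C_{jh}(\alpha)$. Thus, $P_e$ goes to 0 exponentially with $q$.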
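As a rough numerical illustration of the rate $q$, the Python sketch below builds $C_{jh}(\alpha)$ for a pair of hypothetical two-state chains, scans a grid of $\alpha$ values for the smallest largest eigenvalue, and evaluates the product $a\,[C_{jh}(\alpha)]^{n-1} B_{jh}(\alpha)$ directly. The transition matrices, initial distributions, grid, and function names are all assumptions made for illustration; the closed-form eigenvalue applies only to the 2x2 case used here.

```python
from math import sqrt

def tilted_matrix(Pj, Ph, alpha):
    """C_jh(alpha): entry (i, k) is Pj[i][k]**alpha * Ph[i][k]**(1 - alpha)."""
    return [[Pj[i][k]**alpha * Ph[i][k]**(1 - alpha) for k in range(2)]
            for i in range(2)]

def largest_eigenvalue(C):
    """Largest eigenvalue of a 2x2 matrix with nonnegative entries (closed form)."""
    (a, b), (c, d) = C
    return 0.5 * (a + d + sqrt((a - d)**2 + 4.0 * b * c))

def moment(Pj, Ph, pj0, ph0, alpha, n):
    """a C^(n-1) B: the alpha-moment of P_j(xi_n) / P_h(xi_n) under theta = v_h."""
    C = tilted_matrix(Pj, Ph, alpha)
    v = [pj0[i]**alpha * ph0[i]**(1 - alpha) for i in range(2)]  # B_jh(alpha)
    for _ in range(n - 1):
        v = [C[i][0] * v[0] + C[i][1] * v[1] for i in range(2)]
    return sum(v)  # multiplying by the all-ones row vector a

# Hypothetical chains: column k holds the distribution of the next state given x_k.
P1 = [[0.9, 0.2],
      [0.1, 0.8]]
P2 = [[0.6, 0.5],
      [0.4, 0.5]]
p1_init = [0.5, 0.5]  # assumed distribution of X_1 under theta = v_1
p2_init = [0.7, 0.3]  # assumed distribution of X_1 under theta = v_2

alphas = [i / 100 for i in range(1, 100)]
lam = {a: largest_eigenvalue(tilted_matrix(P1, P2, a)) for a in alphas}
alpha_star = min(lam, key=lam.get)  # grid point with the smallest eigenvalue
q = lam[alpha_star]                 # grid approximation of q for this single pair
ratio = (moment(P1, P2, p1_init, p2_init, alpha_star, 50)
         / moment(P1, P2, p1_init, p2_init, alpha_star, 49))
```

In this sketch `ratio`, the factor by which the moment shrinks per additional observation, settles near `q`, which is the sense in which the bound on $P_e$ decays geometrically.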
More precisely, there exists a constant $A$ such that $P_e \le A q^n$ for all $n$. Since

$$E[H(\theta \mid \xi_n)] = \sum_{h=1}^{M} P(\theta = v_h)\, E_h[H(\theta \mid \xi_n)],$$

it follows from the lemma of Rényi that

$$E[H(\theta \mid \xi_n)] \le \sum_{h=1}^{M} P(\theta = v_h)\, C \sum_{j \ne h} \sqrt{\frac{P(\theta = v_j)}{P(\theta = v_h)}} \sum_{\xi_n} P_j^{1/2}(\xi_n)\, P_h^{1/2}(\xi_n)$$

$$= C \sum_{h=1}^{M} \sum_{j \ne h} \sqrt{P(\theta = v_j)\, P(\theta = v_h)} \sum_{\xi_n} P_j^{1/2}(\xi_n)\, P_h^{1/2}(\xi_n)$$

$$\le C \sum_{h=1}^{M} \sum_{j \ne h} \sqrt{P(\theta = v_j)\, P(\theta = v_h)}\; \lambda^n, \quad \text{where } \lambda = q^{1/2},$$

$$\le C (M - 1) \lambda^n, \quad \text{where } \lambda < 1.$$

Thus, we obtain the theorem below.

THEOREM 2. The above decision function yields a probability of error and an equivocation which approach 0 exponentially with $n$ when the $\{X_n\}$ process is conditionally a Markov chain with distinct state behavior.

¹ The author would like to thank a referee for this version of the proof.

In conclusion, it follows that Theorems 1 and 2 must apply to Bayes decision functions also in the case of bounded loss functions. Consequently, the corresponding Bayes risks will behave as the error probability of the decision function specified here.

RECEIVED: January 20, 1969; revised: October 7, 1969

REFERENCES

1. A. FEINSTEIN, "Foundations of Information Theory," McGraw-Hill Book Co., Inc., New York, 1958.
2. A. RÉNYI, On the amount of information concerning an unknown parameter in a sequence of observations, Publ. Math. Inst. Hung. Acad. Sci. 9, 617-625.
3. A. RÉNYI, "On the Amount of Missing Information and the Neyman-Pearson Lemma," in Research Papers in Statistics, F. N. David (Ed.), John Wiley and Sons, London, 1966.
