IEEE TRANSACTIONS ON RELIABILITY, VOL. R-24, NO. 2, JUNE 1975
Reliability Models of NMR Systems
Francis P. Mathur, Member IEEE
Paulo T. de Sousa, Member IEEE
Abstract-Majority voted redundant systems are widely used. A reliability model is developed and analyzed for N-tuple Modular Redundancy (NMR), i.e., (n + 1)-out-of-(2n + 1) systems, where the units are subject to stuck-at-0, stuck-at-1 or stuck-at-X failures and where failures can occur in a mutually compensatory manner. A reconfiguration of the NMR redundancy, the NMR/Simplex strategy, is proposed and evaluated, and its model is shown to be included in the general model for the compensated NMR.
Reader Aids:
Purpose: Widen state of the art
Special math needed for explanations: Probability, combinatorial
analysis
Special math needed for results: Same
Results useful to: Theoretically inclined reliability engineers, designers
of fault-tolerant computers.
1. INTRODUCTION

1.1 Statement of the Problem

The use of protective redundancy to enhance reliability has found wide acceptance as a fundamental procedure [2, 3, 9]. Majority voted systems are among the best known redundant structures. Triple Modular Redundancy (TMR) was the earliest of these systems [10]. The simplex unit is triplicated and each of the three independent units feeds into a 2-out-of-3 voter. The system fails if more than one unit fails. Variations of this strategy have been developed, such as the TMR/Simplex [1, 6]. N-tuple Modular Redundancy (NMR) is a generalization of TMR [8]. The simplex unit is replicated N times (N = 2n + 1 with n an integer). At least n + 1 out of the N units have to be operational for the structure to survive.

This classical interpretation of NMR systems underestimates their reliability. A majority of units can fail and the system will still survive. These cases are called compensating failures. Their consideration provides a more realistic evaluation of the NMR reliability.

Section 2 introduces the compensating failures approach and generalizes the TMR/Simplex concept. Section 3 develops the mathematical equations to model these systems, and further analysis is undertaken in section 4. The remainder of section 1 gives notation and nomenclature.

1.2 Nomenclature and Notation

λ               Constant failure rate of a nonredundant active unit; (λ > 0). It includes stuck-at-0, stuck-at-1 and stuck-at-X failures.
R0              exp(-λT); the reliability of a nonredundant unit for mission time T.
N               Total number of active redundant units at the beginning of the time interval of interest; (N = 2n + 1, N ≥ 1).
n               Degree of active redundancy, N = 2n + 1 in majority logic; (n ≥ 0).
T               Mission time; (T ≥ 0).
t or z          Dummy variables for time; (0 ≤ t, z ≤ T).
Rv              Reliability of the voter.
Stuck-at-1      Module failure, where the output is stuck at a constant logical 1.
Stuck-at-0      Module failure, where the output is stuck at a constant logical 0.
Stuck-at-X      Module failure, where the output is indeterminate, viz., not stuck at a constant logical value.
p1              Pr{stuck-at-1 | module fails}
p0              Pr{stuck-at-0 | module fails}
px              Pr{stuck-at-X | module fails}
Simplex         A nonredundant unit or system, (N = 1).
TMR             Triple Modular Redundant system, (N = 3).
NMR             N-tuple Modular Redundant system.
(TMR)sim        TMR/Simplex system.
(NMR)sim        NMR/Simplex system.
(TMR)comp       Compensated TMR system.
(NMR)comp       Compensated NMR system.
R("System Characterization")["Time"]
                The format of a compact notation for simplifying the writing of reliability equations. Here "R", the reliability, is followed in parentheses by the "System Characterization" such as (NMR), (TMR), or (Simplex) and is then succeeded in square brackets by the parameter "Time". The parameter "Time" is usually the mission time T and can be omitted; e.g., R(NMR)[T] is the reliability of an NMR system for a mission duration T.

2. BACKGROUND AND ARCHITECTURAL CONCEPTS

2.1 Compensating Failures

System reliability is the sum of the probabilities of the mutually exclusive success paths through the system [9]. The reliability of a TMR system with an infallible voting device is (under classical assumptions [2, 3, 8]):

$$R(\mathrm{TMR}) = 3R_0^2 - 2R_0^3 \tag{1}$$

This classical model is an underestimation [4], because it does not consider all the nonfailure situations. It only considers two success paths:
1) all units survive mission time T;
2) one unit fails and two units do not fail during mission time T.

A third successful event can happen when a unit fails to a stuck-at-0 (stuck-at-1) situation between the times 0 and T. The system does not fail, because the majority of the units did not fail. Suppose that there is now a second failure during the mission time. If this second unit fails to a stuck-at-1 (stuck-at-0) situation, it compensates the first one. The output will be given correctly by the third nonfailed unit. The probability of occurrence of this third success path should be added to (1). It is:

$$3R_0(1 - R_0)^2 \cdot \Pr\{\text{2 units compensate each other} \mid \text{2 units fail}\}$$

These results will be generalized to the NMR case.

The N-tuple Modular Redundant design consists of N replicated units feeding a (n + 1)-out-of-N voter and has a reliability:

$$R(\mathrm{NMR}) = \sum_{i=n+1}^{N} \binom{N}{i} R_0^{i} (1 - R_0)^{N-i} \tag{2}$$

Equation (2) is the sum of the probabilities of all the cases where at least n + 1 (a majority among the N units) of the replicas will survive. Equation (2) is just a lower bound, since in many cases the failures can compensate one another. A more general NMR reliability model will be developed.

In general, three types of failures can be defined: stuck-at-1, stuck-at-0 and stuck-at-X. Since these three are the only types of failures considered:

$$p_1 + p_0 + p_x = 1 \tag{3}$$

It is assumed that an indeterminate failure such as stuck-at-X cannot compensate a determinate failure such as stuck-at-0 or stuck-at-1.

Definition: Two failed units are said to be compensated if one of them is stuck-at-0 and the other stuck-at-1. This situation is called one-compensation-out-of-two modules and abbreviated as "1-comp-2".

In a NMR structure where two modules fail in a compensating manner the system becomes effectively a (N - 2)MR structure.

Definition: There are m-compensations-out-of-M failed units (m-comp-M) if among the M failed units there are m pairs of compensated modules.

This phenomenon is governed by a multinomial distribution with pdf:

$$\frac{M!}{i_1!\, i_0!\, i_x!}\; p_1^{i_1} p_0^{i_0} p_x^{i_x} \tag{4}$$

where i1, i0 and ix are the number of units respectively stuck-at-1, stuck-at-0, and stuck-at-X; i1 + i0 + ix = M.

The probability of having m-comp-M is the sum of the probabilities of all the cases where there are at least m units stuck-at-1 and m units stuck-at-0, i.e.:

$$\Pr\{m\text{-comp-}M\} = \sum_{i_1=m}^{M-m} \; \sum_{i_0=m}^{M-i_1} \frac{M!}{i_1!\, i_0!\, (M - i_1 - i_0)!}\; p_1^{i_1} p_0^{i_0} p_x^{M-i_1-i_0} \tag{5}$$

These results will be applied in section 3.2 to develop the reliability expression of a compensated NMR system.

2.2 NMR/Simplex

The variant of the TMR scheme called TMR/Simplex has been analyzed in [6]:

$$R(\mathrm{TMR})_{sim}[T] = 1.5R_0 - 0.5R_0^3 \tag{6}$$

The generalization of this scheme from the TMR/Simplex to the NMR/Simplex structure will now be shown. The notation NMR/Simplex is an abbreviation for a sequence such as NMR/(N - 2)MR/.../TMR/Simplex.

Initially there is an odd number N of units operating in a majority voted system. Whenever a unit fails, that unit as well as one of the remaining good units are discarded. From that moment, the system is in a (N - 2)MR/Simplex mode. This process will repeat itself and eventually will lead to a simplex system. An expression for the reliability of such a system will be derived in section 3.1. In section 4.3 it will be shown that such an expression is a particular case of the compensated NMR reliability.

3. MODELING

3.1 NMR/Simplex

In a NMR/Simplex redundant system there are two cases leading to mission success:

Case 1: All units survive the mission time T. The probability of this event is R0^N [T].

Case 2: One unit fails at time z ∈ (0, T); that unit and another one are discarded. The probability of this event is:

$$N \int_0^T \lambda e^{-\lambda z}\, e^{-(N-1)\lambda z}\, R((N-2)\mathrm{MR})_{sim}[T - z]\, dz \tag{7}$$

The reliability equation for the system is then:

$$R(\mathrm{NMR})_{sim}[T] = R_0^N + N\lambda \int_0^T e^{-N\lambda z}\, R((N-2)\mathrm{MR})_{sim}[T - z]\, dz \tag{8}$$

$$= R_0^N + N\lambda R_0^N \int_0^T e^{N\lambda t}\, R((N-2)\mathrm{MR})_{sim}[t]\, dt \tag{9}$$

It is shown in the Appendix that this recursive integral equation (9) has the solution:

$$R(\mathrm{NMR})_{sim}[T] = A_n \sum_{j=0}^{n} B_j^n R_0^{2j+1}; \qquad A_n = (2n+1)\binom{2n}{n} \Big/ 2^{2n}, \qquad B_j^n = \binom{n}{j} (-1)^j \Big/ (2j+1) \tag{10}$$
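As an illustration of the closed form (10), the following minimal Python sketch evaluates the NMR/Simplex reliability for a given n and R0. The function name `nmr_simplex` and the sample value R0 = 0.9 are assumptions made for this example and are not part of the original paper.

```python
from math import comb


def nmr_simplex(n: int, r0: float) -> float:
    """Closed form (10): NMR/Simplex reliability for N = 2n + 1 units."""
    # A_n = (2n + 1) * C(2n, n) / 2^(2n)
    a_n = (2 * n + 1) * comb(2 * n, n) / 2 ** (2 * n)
    # B_j^n = C(n, j) * (-1)^j / (2j + 1), multiplying R0^(2j + 1)
    series = sum(
        comb(n, j) * (-1) ** j / (2 * j + 1) * r0 ** (2 * j + 1)
        for j in range(n + 1)
    )
    return a_n * series


if __name__ == "__main__":
    r0 = 0.9  # illustrative simplex reliability, not taken from the paper
    for n in range(4):
        print(f"N = {2 * n + 1}: R(NMR)sim = {nmr_simplex(n, r0):.6f}")
```

For n = 1 the function reduces to 1.5 R0 - 0.5 R0^3, which is the TMR/Simplex expression (6), and for n = 0 it returns R0 itself.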
3.2 Compensated NMR

According to section 2.1, the reliability of a NMR system with compensating failures is:

$$R(\mathrm{NMR})_{comp} = R(\mathrm{NMR}) + \sum_{i=1}^{n} \binom{N}{i} R_0^{i} (1 - R_0)^{N-i}\, \Pr\{(n+1-i)\text{-comp-}(N-i)\} \tag{11}$$

This formula is the sum of the reliability of the classical NMR and the probabilities of the compensating events. Compensating events are the ones where only a minority of the units survive (i units, i = 1 to n), but among the N - i units that failed there are n + 1 - i compensations. The failed units have an effective number of votes of [(N - i) - 2(n + 1 - i)] = i - 1, and the vote of the i nonfailed units will determine the output. There are no restrictions about the time, between 0 and T, that the several failures occur. The system will never be in a failure situation, regardless of the order in which the unit failures occur.

Substituting (2) and (5) into (11) yields:

$$R(\mathrm{NMR})_{comp} = \sum_{k=1}^{N} \binom{N}{k} R_0^{k} (1 - R_0)^{N-k}\, V_k(N, p_1, p_0) \tag{12}$$

where

$$V_k(N, p_1, p_0) = \sum_{i_1=n+1-k}^{n} \; \sum_{i_0=n+1-k}^{N-k-i_1} \binom{N-k}{i_1} \binom{N-k-i_1}{i_0}\, p_1^{i_1} p_0^{i_0} p_x^{N-k-i_1-i_0}$$

is just another form for the multinomial distribution depicting the factor Pr{(n + 1 - k)-comp-(N - k)}. For k ≥ n + 1, V_k is 1. Expanding in powers of R0:

$$R(\mathrm{NMR})_{comp} = \sum_{i=1}^{N} b_i R_0^{i}; \qquad b_i = \sum_{k=1}^{i} (-1)^{i-k} \binom{N}{k} \binom{N-k}{i-k}\, V_k(N, p_1, p_0) \tag{13}$$

Figure 1. COMPENSATED TMR, with several values for p1, p0 and px (system reliability versus R0, with the CLASSICAL and SIMPLEX curves for comparison).

4. RESULTS AND APPLICATIONS

4.1 Crossover Point

The crossover point is the minimum value of the reliability of a nonredundant component for which there is improvement in the reliability using a redundant system. It is geometrically interpreted as the point where the curves for the redundant and the nonredundant systems cross. For a compensated TMR system, the crossover situation is defined by:

$$6R_0 p_1 p_0 + 3R_0^2 (1 - 4p_1 p_0) + 2R_0^3 (3p_1 p_0 - 1) = R_0 \tag{14}$$

where the left-hand side is obtained from (13) with N = 3. The lower bound of applicability of a TMR system is the nontrivial root of (14):

$$R_0 = (6p_1 p_0 - 1)/(6p_1 p_0 - 2) \tag{15}$$

The classical model does not consider compensating failures, which is equivalent to making p1 = 0 or p0 = 0 in the compensated model. Therefore, the crossover point occurs for the classical TMR system at R0 = 1/2, which agrees with previous results [7, 3].

There is no crossover point in (15) for any values of p1, p0 and px such that p1p0 ≥ 1/6. For other values there is a crossover point, but lower than the classical case (Figure 1). For the cases (p1 or p0 = 0), (p1 = p0 = 1/2, px = 0), (p1 = p0 = px = 1/3), the bound (15) holds for all NMR systems (Figures 2 and 3).

4.2 Voter Reliability

In a majority decision structure there is a voting mechanism. To assume this voting mechanism or monitor is perfect is to oversimplify a problem. In order for the structure to operate properly, the voter has to give accurate results, whether or not there are faults in the units. Regarding the voter as a series element in the reliability block diagram [6, 3], the reliability of a NMR structure is:

$$R(\mathrm{NMR})^{*} = R(\mathrm{NMR}) \cdot R_v \tag{16}$$

It is useful to know the minimum value of the voter reliability for which there is gain in the system reliability over a simplex design:

$$R_{v(min)} = \frac{R(\mathrm{Simplex})}{R(\mathrm{NMR})} = \left[ \sum_{i=1}^{N} b_i R_0^{i-1} \right]^{-1} \tag{17}$$

Figures 4 and 5 plot Rv(min) versus R0. Figure 4 shows the influence of N in NMR systems, with p1 = p0 = px = 1/3.
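The compensated model can be sketched numerically as follows. This is a minimal Python illustration, assuming equations (2), (5), (11), (15) and (17) as they read in this section; all function names and the sample parameter values are hypothetical and chosen only for the example.

```python
from math import comb, factorial


def classical_nmr(n: int, r0: float) -> float:
    """Equation (2): at least n + 1 of the N = 2n + 1 units survive."""
    N = 2 * n + 1
    return sum(comb(N, i) * r0 ** i * (1 - r0) ** (N - i)
               for i in range(n + 1, N + 1))


def pr_comp(m: int, M: int, p1: float, p0: float, px: float) -> float:
    """Equation (5): Pr{m-comp-M}, at least m stuck-at-1 and m stuck-at-0."""
    total = 0.0
    for i1 in range(m, M - m + 1):
        for i0 in range(m, M - i1 + 1):
            ix = M - i1 - i0
            coeff = factorial(M) // (factorial(i1) * factorial(i0) * factorial(ix))
            total += coeff * p1 ** i1 * p0 ** i0 * px ** ix
    return total


def compensated_nmr(n: int, r0: float, p1: float, p0: float, px: float) -> float:
    """Equation (11): classical NMR plus the compensating events."""
    N = 2 * n + 1
    extra = sum(
        comb(N, i) * r0 ** i * (1 - r0) ** (N - i)
        * pr_comp(n + 1 - i, N - i, p1, p0, px)
        for i in range(1, n + 1)
    )
    return classical_nmr(n, r0) + extra


def tmr_crossover(p1: float, p0: float) -> float:
    """Equation (15): nontrivial root of (14); meaningful when p1*p0 < 1/6."""
    return (6 * p1 * p0 - 1) / (6 * p1 * p0 - 2)


def rv_min(n: int, r0: float, p1: float, p0: float, px: float) -> float:
    """Equation (17): minimum voter reliability giving a gain over simplex."""
    return r0 / compensated_nmr(n, r0, p1, p0, px)


if __name__ == "__main__":
    p = 1 / 3  # illustrative case p1 = p0 = px = 1/3
    print("TMR crossover:", tmr_crossover(p, p))
    for n in (1, 2, 3):
        print(2 * n + 1, compensated_nmr(n, 0.9, p, p, p), rv_min(n, 0.9, p, p, p))
```

Setting p1 = 0 (or p0 = 0) makes every compensation probability vanish, so `compensated_nmr` falls back to the classical value of (2); this mirrors the remark that the classical model corresponds to p1 = 0 or p0 = 0.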
Figure 2. COMPENSATED NMR systems with p1 = p0 = px = 1/3 (system reliability versus R (SIMPLEX); the crossover for all curves is at system reliability = R0 = 0.25).

Figure 3. COMPENSATED NMR, with p1 = p0 = 1/2, px = 0 (also NMR/SIMPLEX systems) (system reliability versus R (SIMPLEX); no crossover).

Figure 4. Rv(min) for COMPENSATED NMR systems with p1 = p0 = px = 1/3 (plotted versus R (SIMPLEX)).

Figure 5. Rv(min) for COMPENSATED TMR with several values of p1, p0 and px (plotted versus R (SIMPLEX)).

For a compensated TMR system: 1) if R0 < (6p1p0 - 1)/(6p1p0 - 2) then R(TMR) < R0, irrespective of Rv; 2) if Rv < (24p1p0 - 8)/(24p1p0 - 9) then R(TMR) < R0, irrespective of R0.

Therefore, the applicability constraints of a compensated TMR system become tighter when the product of the parameters p1 and p0 becomes smaller. The maximum value of that product is 1/4 (p1 = p0 = 1/2).

4.3 NMR/Simplex as a Particular Case of Compensated NMR

For p1 = p0 = 1/2, (13) becomes:

$$R(\mathrm{NMR})_{comp} = \sum_{\ell=1}^{N} \binom{N}{\ell} R_0^{\ell} \left( \frac{1 - R_0}{2} \right)^{N-\ell} \sum_{i=n+1-\ell}^{n} \binom{2n+1-\ell}{i} \tag{18}$$

where a binomial coefficient whose lower index falls outside the range 0 to 2n + 1 - ℓ is taken as zero.
Equations (10) and (18) yield the same results. Thus, the model of the NMR/Simplex system is a particular case of the more general Compensated NMR system where p1 = p0 = 1/2 and px = 0.

The behavior of NMR/Simplex systems is depicted in Figure 3 for several values of N. The system reliability always increases with N, because these curves do not have a crossover point other than 0 or 1.
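The stated equivalence can be checked numerically. The sketch below is an illustration only: `compensated_half` evaluates the compensated model for the special case p1 = p0 = 1/2, px = 0, written here from (5) and (11) by direct substitution, and `nmr_simplex` evaluates the closed form (10). Both function names are assumptions introduced for this example.

```python
from math import comb


def nmr_simplex(n: int, r0: float) -> float:
    """Closed form (10) for the NMR/Simplex system, N = 2n + 1."""
    a_n = (2 * n + 1) * comb(2 * n, n) / 2 ** (2 * n)
    return a_n * sum(
        comb(n, j) * (-1) ** j / (2 * j + 1) * r0 ** (2 * j + 1)
        for j in range(n + 1)
    )


def compensated_half(n: int, r0: float) -> float:
    """Compensated NMR with p1 = p0 = 1/2 and px = 0."""
    N = 2 * n + 1
    total = 0.0
    for k in range(1, N + 1):  # k = number of surviving units
        failed = N - k
        # Number of stuck-at-1 units i must leave at least n + 1 - k
        # stuck-at-1 and at least n + 1 - k stuck-at-0 among the failed units.
        lo = max(0, n + 1 - k)
        hi = min(n, failed)
        ways = sum(comb(failed, i) for i in range(lo, hi + 1))
        total += comb(N, k) * r0 ** k * ((1 - r0) / 2) ** failed * ways
    return total


if __name__ == "__main__":
    for n in range(3):
        for r0 in (0.1, 0.5, 0.9):
            diff = abs(nmr_simplex(n, r0) - compensated_half(n, r0))
            print(f"n = {n}, R0 = {r0}: |difference| = {diff:.2e}")
```

The differences printed for these small cases are at the level of floating-point rounding, which is consistent with the claim that the NMR/Simplex model is contained in the compensated model.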
ACKNOWLEDGMENT

The authors appreciate the many helpful suggestions of the Editor and a referee.

APPENDIX

Proof that NMR/Simplex Reliability Expression (10) is the solution for (9). The proof will be by induction on n.

Expression (10) can be written as:

$$R(\mathrm{NMR})_{sim} = \frac{2n+1}{2^{2n}} \binom{2n}{n} \sum_{j=0}^{n} \binom{n}{j} \frac{(-1)^j}{2j+1}\, R_0^{2j+1} \tag{A-1}$$

1. For n = 1, (10) yields:

$$R(\mathrm{TMR})_{sim} = 3/2\, (R_0 - 1/3\, R_0^3) \tag{A-2}$$

which is the same as (6).

2. For the induction step, assume (A-1) is true for (n - 1), i.e., for a (N - 2)MR system:

$$R((N-2)\mathrm{MR})_{sim} = \frac{n}{2^{2n-1}} \binom{2n}{n} \sum_{j=0}^{n-1} \binom{n-1}{j} \frac{(-1)^j}{2j+1}\, R_0^{2j+1} \tag{A-3}$$

Substituting (A-3) into (9):

$$R(\mathrm{NMR})_{sim} = R_0^N + N\lambda R_0^N\, \frac{n}{2^{2n-1}} \binom{2n}{n} \sum_{j=0}^{n-1} \binom{n-1}{j} \frac{(-1)^j}{2j+1} \int_0^T e^{\lambda t (N - 2j - 1)}\, dt$$

$$= R_0^N + A_n \sum_{j=0}^{n-1} B_j^n \left( R_0^{2j+1} - R_0^N \right) \tag{A-4}$$

A_n and B_j^n are defined in (10). Equation (A-4) reduces easily to (10), since

$$A_n \sum_{j=0}^{n} B_j^n = 1 \tag{A-5}$$

Equation (10) is still valid for n = 0; it yields the Simplex reliability R0 (N = 1). Q.E.D.

REFERENCES

[1] M. Ball and F.H. Hardie, "Architecture for Extended Mission Aerospace Computer", IBM Report 66-825-1753, Oswego, N.Y.
[2] J.L. Bricker, "A Unified Method for Analyzing Mission Reliability for Fault-Tolerant Computer System", IEEE Transactions on Reliability, Vol. R-22, pp. 72-77, June 1973.
[3] N.G. Dennis, "Reliability Analyses of Combined Voting and Standby Redundancies", IEEE Transactions on Reliability, Vol. R-23, pp. 66-75, June 1974.
[4] P.H. Giroux, Comments on "Estimates for Best Placement of Voters in a Triplicated Logic Network", IEEE Transactions on Electronic Computers, Vol. EC-15, p. 382, June 1966.
[5] P.G. Hoel, S.C. Port, and C.J. Stone, Introduction to Probability Theory, Houghton Mifflin Company, Boston, 1971.
[6] F.P. Mathur, "On Reliability Modeling and Analysis of Ultra-Reliable Fault-Tolerant Digital Systems", IEEE Transactions on Computers, Vol. C-20, pp. 1376-1382, Nov. 1971, (Special Issue on Fault-Tolerant Computing).
[7] F.P. Mathur, "Automation of Reliability Evaluation Procedures Through CARE-The Computer-Aided Reliability Estimation Program", AFIPS Conference Proceedings of the FJCC, Vol. 41, pp. 65-82, Anaheim, California, Dec. 1972.
[8] F.P. Mathur and A. Avizienis, "Reliability Analysis and Architecture of a Hybrid Redundant Digital System: Generalized Triple Modular Redundancy with Self-Repair", in 1970 Spring Joint Computer Conference, AFIPS Conference Proceedings, Vol. 36, Montvale, N.J., pp. 375-383, May 1970.
[9] D.S. Taylor, "A Reliability and Comparative Analysis of Two Standby System Configurations", IEEE Transactions on Reliability, Vol. R-22, pp. 13-19, April 1973.
[10] J. von Neumann, "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components", Automata Studies, C.E. Shannon and J. McCarthy, eds., Princeton University Press, Princeton, N.J., 1956, pp. 43-98.

Manuscript received March 11, 1974; revised August 10, 1974, and November 7, 1974.

Francis P. Mathur/239 Electrical Engineering/University of Missouri/Columbia, Missouri 65201 USA

Francis Parkash Mathur (M'65) was born in California, on October 2, 1940. He received the B.E.E. (honors) degree from the National University of Ireland, University College, Dublin in 1963, the M.S.E.E. from the University of California at Los Angeles (UCLA) in 1967, and the PhD with distinction in Computer Science also from UCLA in 1970.

From 1963 to 1966 he was an Industrial Engineer with Consolidated Electrodynamics Corp., Pasadena, California. In 1966 he joined the Jet Propulsion Laboratory, California Institute of Technology where he worked on the development of the Strapdown Electrostatic Aerospace Navigator system before joining the Self Test And Repair computer development project, with responsibilities in software development and reliability analysis. In 1969 he was a recipient of NASA's Apollo Achievement Award. He left JPL as a member of the technical staff in '72 to accept his current position as Associate Professor at the University of Missouri, College of Engineering, Bioengineering and Advanced Automation program.

He is an advisor to the Sri Aurobindo International Center of Education, Pondichery, India where he spent the fall of '70 and summers of '72, '73 and '74 assisting in the development of a Computing Center and in instituting a computer engineering department. He is a member of the ACM and Sigma Xi, and is a Distinguished Visitor of the IEEE.
Paulo T. de Sousa/Electrical Engineering Department/University of Luanda/Luanda, Angola

Paulo Teixeira de Sousa was born in Nova Lisboa, Angola on January 25, 1947. He received the "licenciatura" degree in Electrical Engineering from the University of Luanda, Angola in 1971 and the M.S. degree from the University of Missouri at Columbia in 1972.

After graduation in 1971, he joined the University of Luanda as an instructor in the Department of Electrical Engineering. He held a Research Assistantship in the Bioengineering Program, Electrical Engineering Department, at the University of Missouri, where he is a candidate for the Ph.D degree. His current research interest is Fault-Tolerant Computing.

Mr. de Sousa is a member of Tau Beta Pi, Eta Kappa Nu, and ACM and a past Rotary Foundation Fellow.
Abstracts of Reliability Dissertations and Theses
T. L. Regulinski
Senior Member IEEE

Educational Institutions are invited to submit copies of their students' Master's Degree Thesis or PhD dissertations which deal specifically with some aspect of Reliability. As a service to Transactions readers and the educational institutions, the dissertations and theses will be reviewed and their abstracts published.

Title: Integrated Circuit Reliability Prediction
Author: H.R. Goldenberg
Degree: Master of Science in Electrical Engineering
Thesis Advisor: Dr. M.L. Shooman

In this thesis a method is developed for calculating the average hazard rate for a catastrophic-failure test using the test duration, the number of failures, and the sample population. Author develops an equivalency criterion which allows a transformation between hazard model shapes. The model shapes examined in detail are the constant, the Weibull, and the piece-wise linear shapes. It is shown that significant errors in reliability can result when constant hazard is assumed.

The principal failure modes of digital integrated circuits are examined, and are related to the complexity of the circuit by qualitative arguments. Various proposed reliability and complexity models are examined to determine what measures of complexity have previously been used. These measures, and the additional ones formulated by the author, are compared in a pairwise regression analysis to determine the degree of correlation present. The analysis is performed for the TTL 7400 series, and the Schottky 74S00 series devices. Author shows that the hazard function of the integrated circuits is a Weibull function of time with a scale parameter proportional to the number of gates in the device.

Title: Reliability Analysis of Transmission Systems of Regular Distributed Structures
Author: R.J. Morgan
Degree: Dr. of Philosophy
School: University of New South Wales
Dissertation Directed by: Dr. W.H. Holmes, Associate Professor of Electrical Engineering

This two-part dissertation addresses the problem of reliability analysis of a redundant transmission distributed system whose structure is more general than a series-parallel scheme. In part I, the author treats the system as a four-state, homogeneous Markov chain. The main contribution developed here concerns the limit theorem of system reliability. The theorem states that in the limit as system length becomes infinite, the reliability of the distributed system approaches the reliability of the series system of equivalent elements. The significance of the theorem lies in the fact that it extends the Messinger-Shooman results to a more general class of structures. The theorem also contributes to the field of graph theory in that it describes a "weak" connectivity property of cascade connected bipartite graphs, under the constraint of system "growth" in a single dimension. In part II, the author treats the system
in continuous domain where system branches represent amplifiers, and
nodes perform as dual input repeaters. Corresponding to each amplifier
is a pdf describing the uncertainty associated with amplifier gains. The
problem considered is one of obtaining the best decision at each repeater in order to maximize reliability as length increases. It is shown
that the decision function derived for the series-parallel system is quite
different from the one for the distributed system due to the presence of
an additional failure mechanism in the latter case. The generation of the
optimum decision function is the major contribution contained in the
second part of this dissertation.
Portions of this work appear in Morgan & Holmes, "Reliability analysis
of regular chain structures," IEEE Trans. Reliability, vol. R-23, April
1974, pp. 11-16.
Title: Deteriorating Markov Processes Under Uncertainty
Author: D.B. Rosenfield
Degree: Dr. of Philosophy in Operations Research
School: Stanford University
Dissertation Directed by: Dr. Gerald Lieberman
A model is developed that represents a deteriorating Markov process
with imperfect or costly information. The process might be, for example, a deteriorating machine or inventory system with several states.
In the context of the machine example, at each time period, the machine operator has three possible actions to choose from. If repair is
chosen, an expected repair cost is incurred and the system reverts to the
best state. If inspection is chosen, an inspection cost and an expected
operating cost are incurred, and the operator determines exactly which
state the system will be in at the beginning of the next time period.
Finally, if no action is chosen, an expected operating cost is incurred,
and the operator obtains no new information about the state of the
process. Of course, such processes have been studied by others; however,
under the imperfect information assumption the results are incomplete.
Under the structure assumed, author characterizes a state space of the
process as observed by the operator. In observed state (i, k), the operator
knows that k time units ago, the underlying Markov process was in state
i, and that no new information has been gathered in k time units. Under
straightforward assumptions on costs and under the assumptions that
the Markov matrix P is IFR or stochastically increasing and Pij = 0 for
j < i, author shows that there are numbers k*(i) non-increasing in i such
that it is optimal to repair if k > k*(i) in state (i, k) and optimal not to
repair otherwise. That optimality holds for the n-period, infinite-horizon
(discounted), and average-cost criteria. Under the stronger assumption
that P is totally positive of order two (TP2), author further shows that,
under the latter two criteria, the interval k ∈ [0, k*(i) - 1] for state
(i, k) can be broken into at most three additional regions: a no-action
optimal region, an inspection-optimal region and a second no-action
optimal region.