Introduction To Operational Availability
1.1 Introduction
This handbook presents a practical overview of the concept of operational availability and
several supportability measures and their use in different phases of a system's¹ life cycle. It is
hoped that better understanding of the metrics involved and their derivation will provide insight to
program sponsors and acquisition managers as they develop and manage their programs.
Project Managers must be able to assess system performance readiness metrics during the
acquisition process, prior to initial operational capability (IOC), and throughout the deployment
cycle, providing feedback critical to ensuring that the user can affordably support the system. This
handbook is intended to be a practical guide; however, although several useful equations are
provided, it is not intended to be an exhaustive mathematical or engineering treatise.
This handbook is based on one initially developed by the Department of the Navy in the
mid-1980s to address the combined consideration of Ao and cost in all levels of systems acquisition and
design-related decision-making. This handbook generalizes and broadens the application of the
concepts, incorporates the tenets of acquisition reform and organizational realignment, and
provides additional clarity on the interaction between Ao and cost of ownership.
Common use of terms is essential in this kind of handbook. The DoD and defense industry
have defined material readiness as one of two prime Figures Of Merit (FOM) to be used for
acquisition program decision support. The first FOM is the equivalent of material readiness or
hardware availability. The second FOM is Total Ownership Cost (TOC) of the system or
equipment under consideration. TOC for purposes of this handbook is equivalent to cost of
ownership. Although many of the terms and initiatives discussed herein are unique to the military,
the basic concepts are also applicable to industrial and commercial products.
1.2 Understanding Ao
The next few paragraphs provide insight into availability and several other important metrics; a
more detailed treatise follows in later sections.
Availability can be generally defined as the probability that a system will be ready to perform
its mission or function under stated conditions when called upon to do so at a random time. It is a
term associated with systems that can be repaired or have other maintenance performed. As such,
availability is a function of how often the system fails (a function of reliability) and how long it
takes to restore the system to an operational condition after a failure occurs (a function of
maintenance and support). For systems for which no maintenance is possible or practical (not even
inspections or servicing), availability is equal to the system reliability. Reliability can be defined
as the probability that a system will perform its function(s) as required when used under stated
conditions for a given interval of time without failure.
¹ For convenience, the term "systems" is used in this handbook to include military weapons systems, industrial systems, and
commercial products.
Reliability Analysis Center (RAC) • 201 Mill Street, Rome, NY 13440-6916 • 1-888-RAC-USER
Operational Availability Handbook OPAH
When the general definition of availability is modified to assume ideal support (i.e., unlimited
spares, no delays, etc.) and only design- or manufacturing-related failures are considered, we have
inherent availability (Ai). Ai reflects the level of reliability and maintainability (R&M) achieved
in the design and realized through the manufacturing, assembly, and, in some cases, installation
processes.
When a realistic support environment is considered and all maintenance actions, even those
not required as a result of design- or manufacturing-related failures, are considered, we have
operational availability (Ao). Ao is a function of reliability, maintainability, and supportability.
Every effort should be made to explicitly consider each element of Ao in early development and
throughout the system's life cycle. As you use this handbook, keep two important things in mind.
First and foremost, operational availability is a key element in determining system readiness² and
a supportability goal. Second, the system design does not solely determine Ao, but dictates a
maximum level of availability based only on the designed-in levels of R&M. Reliability is often
expressed in terms of the Mean Time Between Failure (MTBF) and maintainability in terms of
Mean Time To Repair (MTTR).
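The distinction between Ai and Ao can be illustrated with a short calculation. The sketch below assumes the commonly used steady-state forms, Ai = MTBF / (MTBF + MTTR) and Ao = mean uptime / (mean uptime + mean downtime); these formulas, the function names, and all numeric values are illustrative assumptions rather than definitions taken from this section.

```python
def inherent_availability(mtbf_hours, mttr_hours):
    """Ai: ideal support, so only active repair time counts as downtime."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def operational_availability(mean_uptime_hours, mean_downtime_hours):
    """Ao: realistic support, so downtime also includes delays."""
    return mean_uptime_hours / (mean_uptime_hours + mean_downtime_hours)


# Hypothetical values, chosen only for illustration.
mtbf = 200.0             # mean time between failures, hours
mttr = 2.0               # mean time to repair, hours
delay_per_failure = 8.0  # logistics and administrative delay, hours

ai = inherent_availability(mtbf, mttr)
# Simplification: uptime is taken as the MTBF, i.e., preventive
# maintenance is ignored in this example.
ao = operational_availability(mtbf, mttr + delay_per_failure)

print(f"Ai = {ai:.3f}, Ao = {ao:.3f}")
```

With these assumed inputs, Ai is about 0.99 and Ao is about 0.95. The same design yields a lower Ao because support delays add to downtime, which is why the design sets only an upper bound on availability.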
Figure 1.2-1 helps us to better understand the difference between Ai and Ao. Note that no
matter how it is measured, availability can never be more than 100% (1.0) or less than 0.
² Readiness is a broader term that accounts for the number and level of training of operating personnel; command, control, and
communications; mobility; planning; strategy and tactics; and other factors.
OPAH Section 1: Introduction to Operational Availability (Ao) 3
will not fail to perform its function(s) when used under stated conditions over a defined time
period. When the times to failure (for non-repairable items) or times between failures (for repairable items) are exponentially
distributed, the equation for reliability is:
R(t) = e^{-\lambda t}
where λ is the failure rate and t is the defined time period.
Reliability, being a probability, can take on any value between 0 and 1. Often reliability is
expressed as MTBF. For the exponential distribution of failure times, the MTBF is the inverse of
the failure rate ( λ ). For example, if a system failure rate is 5 failures per thousand hours, it follows
that the MTBF is equal to 200 hours.
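The failure-rate example above can be checked numerically. The following is a minimal sketch that assumes exponentially distributed failure times; the function names and the 10-hour mission length are hypothetical and introduced only for illustration.

```python
import math


def mtbf_from_failure_rate(failure_rate_per_hour):
    """For exponential failure times, MTBF is the inverse of the failure rate."""
    return 1.0 / failure_rate_per_hour


def reliability(failure_rate_per_hour, hours):
    """R(t) = exp(-lambda * t): probability of no failure over `hours`."""
    return math.exp(-failure_rate_per_hour * hours)


failure_rate = 5 / 1000  # 5 failures per thousand hours (from the text)
mtbf = mtbf_from_failure_rate(failure_rate)

r_mission = reliability(failure_rate, 10.0)  # hypothetical 10-hour mission
r_at_mtbf = reliability(failure_rate, mtbf)  # reliability over one MTBF

print(f"MTBF = {mtbf:.0f} h")
print(f"R(10 h) = {r_mission:.3f}, R(MTBF) = {r_at_mtbf:.3f}")
```

Note that the reliability over an interval equal to the MTBF is e^{-1}, or roughly 0.37, so an MTBF of 200 hours does not mean the system is likely to operate for 200 hours without failure.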
Consolidating the ideas in the definitions found in various references and adding the idea of
economy yields the following definition:
Maintainability. The relative ease and economy of time and resources with which an item
can be retained in, or restored to, a specified condition when maintenance is performed by
personnel having specified skill levels, using prescribed procedures and resources, at
each prescribed level of maintenance and repair. In this context, it is a function of design.
As stated in the last sentence of the definition, maintainability is a design parameter. Although
other factors, such as highly trained people and a responsive supply system, can help keep
downtime to an absolute minimum, it is the inherent maintainability that determines this minimum.
Improving training or support cannot effectively compensate for the effect on availability of a
poorly designed (in terms of maintainability) product. Designing the product to be reliable and
maintainable is the best way to minimize the cost to support a product and maximize the
availability of that product.
Mean Preventive Maintenance Time (M̄pt). A composite value representing the arithmetic average of the
maintenance cycle times for the individual preventive maintenance actions (periodic inspection,
calibration, scheduled replacement, etc.) for a system.
Median Active Corrective Maintenance Time (M̃ct). That value of corrective maintenance time that divides all
downtime values for corrective maintenance such that 50% are equal to or more than the median and
50% are equal to or less than the median.
Mean Downtime (MDT). The mean or average time that a system is not operational due to
repair or preventive maintenance. Includes logistics and administrative delays.
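The three measures defined above can be computed directly from maintenance records. The sketch below uses small hypothetical data sets (all values in hours) and treats MDT as total downtime, including delays, divided by the number of maintenance actions; that treatment and all variable names are assumptions made for illustration.

```python
from statistics import mean, median

# Hypothetical maintenance records, in hours.
corrective_hours = [1.0, 2.0, 2.0, 3.0, 12.0]  # active corrective times
preventive_hours = [0.5, 1.0, 1.5]             # preventive action times
delay_hours = [0.0, 4.0, 0.0, 8.0, 0.0]        # logistics/admin delays

mean_pt = mean(preventive_hours)      # mean preventive maintenance time
median_ct = median(corrective_hours)  # median active corrective time
mean_ct = mean(corrective_hours)      # mean, shown for comparison

# Mean downtime: all repair and preventive time plus all delays,
# averaged over every maintenance action.
actions = len(corrective_hours) + len(preventive_hours)
total_downtime = sum(corrective_hours) + sum(preventive_hours) + sum(delay_hours)
mdt = total_downtime / actions

print(f"Mean preventive time = {mean_pt:.2f} h")
print(f"Median corrective time = {median_ct:.2f} h (mean = {mean_ct:.2f} h)")
print(f"MDT = {mdt:.3f} h")
```

In this data set a single 12-hour repair pulls the mean corrective time well above the median, which illustrates why both measures are reported.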
Design guides and analysis tools must be used rigorously to ensure a testable design. Not
doing so leads to greater costs in the development of manufacturing and field tests, as well as in
the development of test equipment. Tradeoffs must be made up front on the use of built-in-test
(BIT) versus other means of fault detection and isolation. Further, the expected percentage of
faults that can be detected and isolated to a specified or desired level of ambiguity must be
determined as an important input to the logistics analysis process. The consequences of poor
testability are higher manufacturing costs, higher support costs, and lower customer satisfaction.
Reliability and maintainability are often considered the foundation of availability. Both are
primarily determined during design. Once the equipment is designed and built, reliability and
maintainability can be modified only, with minor exceptions, by changing the physical design of
the equipment.
However, operational availability is not just a function of design but also of maintenance
policy, the logistics system, and other supportability factors. It can be improved by improving the
design, improving the support, or both.
costs. The following key points are provided as a preview of the major issues that will be
addressed:
• The Resource Sponsor, with assistance from the developing agency and others, must
document Ao as a Key Performance Parameter (KPP) in requirements documents.
• To understand and effectively evaluate Ao and cost during the systems acquisition
process, the resource sponsor and others must become familiar with the separate
components of the Ao index. These are reliability, maintainability, and supportability.
• Every effort should be made to explicitly consider each element of the Ao metric
throughout the system life cycle. The program team and the user must understand that
major changes to or deviations from the user requirements or the designated operational
scenario requirements may have an impact upon the observed Ao. In addition, if spares
availability is reduced for any reason (budget or supply chain), the cannibalization rate
will increase and the readiness, as observed by the user, will decrease.
The handbook is intended to be used to influence the design for readiness, supportability, and
life cycle affordability. Pure design-related analysis is left to other references. Systems are
described in terms of a number of important performance parameters in today's "performance
based business environment." Examples of many of these parameters are shown in Figure 1.3-1.
Some will be identified as KPPs for specific programs, but all are important in the systems
engineering program. This handbook concentrates on just three of these parameters: reliability,
maintainability, and certain aspects of the logistics support system. These three are the drivers of
Ao and TOC, and can be used to focus the design and management teams at all levels of program
decision-making.
SYSTEM EFFECTIVENESS
Ao and cost both satisfy the classic definition for a good Measure of Effectiveness/Figure of
Merit (MOE/FOM).
• They represent the viewpoint of the stakeholders, i.e., those who have the right and
responsibility for imposing the requirements on the solution.
• They assist in making the right choice by indicating "how well" a solution meets the
stakeholders' needs.
In his book, Logistics Engineering And Management,³ Dr. Benjamin Blanchard states: "The
use of an effectiveness FOM is particularly appropriate in the evaluation of two or more
alternatives when decisions involving design and/or logistics support are necessary. Each
alternative is evaluated in a consistent manner employing the same criteria for evaluation."
Ao is a major contributor to Systems Effectiveness (SE). Although the exact definition and
elements of SE can vary, Figure 1.3-1 shows some of the elements that may contribute to SE.
Figure 1.3-1 shows that there are many candidate trade-off parameters in the capability,
dependability, and availability areas.
Figure 1.3-1 also shows how these factors are related. Operational Capability (Co) refers to
the system's operating characteristics (range, payload, accuracy) and the resultant ability to counter
the threat, expressed in terms such as system performance and probability of kill.
Ao refers to the probability that the system will be ready to perform its specified
function, in its intended operational environment, when called for at a random point in time.
Operational Dependability (Do) refers to the probability that the system, if up at the initiation of
the mission, will remain up throughout the mission. Operational capability, operational
availability, and operational dependability must be defined relative to the specific operational
environment and operating scenario envisioned for a given system. Combined, they determine
system effectiveness (SE). The system effectiveness of a specific system determines in large
measure the effectiveness of the ship or aircraft platform on which it is installed.
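The text states only that Co, Ao, and Do combine to determine SE. One common way to combine them, assumed here purely for illustration, is to treat the three as independent probabilities and multiply them. The function name and input values below are hypothetical.

```python
def system_effectiveness(ao, do, co):
    """Assumed product form: SE = Ao * Do * Co, each a value in [0, 1]."""
    for name, value in (("Ao", ao), ("Do", do), ("Co", co)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return ao * do * co


# Hypothetical values for a single system in one operating scenario.
se = system_effectiveness(ao=0.90, do=0.95, co=0.80)
print(f"SE = {se:.3f}")
```

Under this assumption, SE can never exceed the smallest of the three factors, so a shortfall in availability cannot be fully offset by better capability or dependability.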
In addition to the following paragraphs, additional applicable terms, concepts, and acronyms
are defined in Appendix B.
For decades, effective logistics managers have used models as part of the Supportability
Analysis process. A model is a representation of a system, entity, phenomenon, or process. Two
models are the Level of Repair Analysis (LORA) model, sometimes called the Repair Level
Analysis (RLA), and the Life Cycle Cost (LCC) model. In addition, simulation models are used
to assess achieved readiness. Many organizations have published guidance on the use of these
models. Each model has an extensive user manual. In the following subparagraphs, the two
models are described in general terms. Appendix F provides some relevant web sites, both
commercial and government, which have information on models currently in use and new products
in development.
The purpose of the LORA model is to solve for the lowest life cycle cost repair level for each
of the repairable candidates in a subsystem work breakdown structure (WBS). A LORA model is
normally run at the subsystem level such as a radar set or propulsion system.
Inputs to the model include the system reliability, maintainability, weight, cube, volume, etc.
They also include data concerning the logistics element resources needed to repair each of the candidates at each
of the three levels of maintenance traditionally used for many systems. These levels are
Organizational (O), Intermediate (I), and Depot (D). The model then goes through the following
steps:
1. It first assumes that all candidates are non-repairable and are discarded upon failure at the
O-level. Considering failure rates and the time to obtain a replenishment spare from the
source, the model calculates how many assemblies must be kept at each O-level site to
satisfy requisitions. The model stores all costs for each repairable candidate.
2. The model next assumes all repairable candidates are sent to the D-level for repair. The
model calculates all logistics elements required for repair of each candidate. The model
again stores all of these costs by repairable candidate. This includes the reduced number
of spares now needed at the O-level.
3. Next, the model assumes all repairable candidates are repaired at the I-level, with
sub-assemblies and repair parts going to the depot for repair. All of these costs are stored by
repairable candidate and by Integrated Logistics Support (ILS) element.
4. Finally, the model optimizes the repair level by comparing the relative costs for each
repairable candidate for each of the options (i.e., discard at O-level, repair at D-level, or
repair at I-level), and selects the least-cost option for each repairable candidate.
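The final selection step can be sketched in a few lines. The example below assumes the life cycle cost of each option has already been computed and stored per candidate, as in steps 1 through 3; the candidate names, cost figures, and function name are hypothetical, and a real LORA model considers far more detail than a single cost per option.

```python
# The three options compared for every repairable candidate.
OPTIONS = ("discard_at_O", "repair_at_D", "repair_at_I")


def select_repair_levels(stored_costs):
    """Return the least-cost option for each repairable candidate.

    stored_costs maps candidate name -> {option name -> life cycle cost}.
    """
    selection = {}
    for candidate, option_costs in stored_costs.items():
        selection[candidate] = min(OPTIONS, key=lambda opt: option_costs[opt])
    return selection


# Hypothetical stored costs (dollars) from the three model passes.
stored_costs = {
    "power_supply":  {"discard_at_O": 120_000, "repair_at_D": 90_000, "repair_at_I": 95_000},
    "circuit_card":  {"discard_at_O": 40_000, "repair_at_D": 55_000, "repair_at_I": 60_000},
    "antenna_drive": {"discard_at_O": 500_000, "repair_at_D": 300_000, "repair_at_I": 210_000},
}

selection = select_repair_levels(stored_costs)
for candidate, option in selection.items():
    print(f"{candidate}: {option}")
```

Because the choice is made candidate by candidate, inexpensive items can be discarded while costly assemblies in the same subsystem are repaired at the I-level or D-level.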
The model provides a comprehensive report for consideration by the analyst and lead
logistician. The model assists the logistician in assigning a Source, Maintenance, and
Recoverability (SM&R) code that defines where an item is removed and replaced (R&R) and
where it is repaired. This key information is published in planning documents to guide logistics
planners and also becomes input data for LCC and Ao models.
The main purpose of an LCC model is to estimate the total costs associated with developing,
acquiring, operating, supporting, and, at the end of its useful life, disposing of a system. A
significant part of the LCC of any military system consists of the costs of the initial logistics
elements, which are procured with acquisition dollars, and the annual and total Operating and
Support (O&S) costs. In order for a complete LCC report to be produced, the LCC model must
have the capability to capture R&D costs as inputs. Although the elements of LCC can be
categorized in different ways, Figure 1.5-1 depicts a typical categorization of LCC elements. Note
that not all of the cost elements shown in the figure will be applicable to all systems and products.
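At its simplest, an LCC estimate is a sum over cost categories. The sketch below is a deliberately coarse illustration: the category names loosely follow the description above, all amounts are hypothetical, and costs are in constant dollars with no discounting or inflation, which a real LCC model would normally address.

```python
def life_cycle_cost(research_and_development, acquisition,
                    initial_logistics_support, annual_operating_and_support,
                    service_years, disposal):
    """Undiscounted LCC: one-time costs plus annual O&S over the service life."""
    total_operating_and_support = annual_operating_and_support * service_years
    return (research_and_development + acquisition
            + initial_logistics_support + total_operating_and_support
            + disposal)


# Hypothetical program, costs in millions of dollars.
lcc = life_cycle_cost(
    research_and_development=10.0,
    acquisition=50.0,
    initial_logistics_support=5.0,
    annual_operating_and_support=4.0,
    service_years=20,
    disposal=1.0,
)
print(f"LCC = ${lcc:.1f}M")
```

In this assumed case, O&S costs over a 20-year service life (80 of 146) exceed the acquisition cost, which is why support decisions made early in design can dominate total ownership cost.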
[Figure not reproduced. Recoverable labels: Initial Logistics Support; Other Sustainment Costs; Technical Data; Other Research and Development Costs]
Figure 1.5-1. Typical Categorization of LCC Elements
Ao affects operations at the organizational level. It is a measure of the percent of time that an
operational system is up and ready for use at any random point in time. When the system
experiences a failure, the maintenance personnel must isolate the cause of the failure, remove and
replace the failed item (or repair in place), and retest the system to verify that proper operation has
been restored. The rapidity with which maintenance can be performed is a function of the R&M
of the system and the efficiency and responsiveness of the support system. One key to
responsiveness is having the right number of "spares" available when needed.
A model for sparing to sustain a given level of Ao needs essentially the same input data as LCC
and LORA models. Operational needs, logistics infrastructure, and hardware information are fed
into the model. The sparing-to-availability model calculates the number of each type of spare part
to be kept at each maintenance level site to satisfy an Ao target value.
The model essentially divides the spares budget target by the failure rate for each spare part
candidate. This process creates an index representing readiness per dollar spent for each part. The
part with the highest index is selected. The calculations and selections are repeated until the Ao
target is reached, constrained by the spares budget target.
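The iterative selection can be illustrated with a generic greedy marginal-analysis sketch. This is not the handbook's model: it substitutes a simple readiness proxy (the probability that no part is out of stock, assuming Poisson demand during resupply) and ranks each candidate by the gain in that proxy per dollar. The part data, class name, and function names are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Part:
    name: str
    pipeline_demand: float  # expected demands during resupply (rate x time)
    unit_cost: float        # dollars per spare


def fill_probability(part, spares):
    """P(demand during resupply <= spares), assuming Poisson demand."""
    m = part.pipeline_demand
    return sum(math.exp(-m) * m**k / math.factorial(k) for k in range(spares + 1))


def readiness(parts, stock):
    """Readiness proxy: probability that no part type is out of stock."""
    return math.prod(fill_probability(p, stock[p.name]) for p in parts)


def spare_to_target(parts, target, budget):
    """Buy one spare at a time, always the best readiness gain per dollar."""
    stock = {p.name: 0 for p in parts}
    spent = 0.0
    while readiness(parts, stock) < target:
        affordable = [p for p in parts if spent + p.unit_cost <= budget]
        if not affordable:
            break  # budget constraint reached before the target

        def index(p):
            s = stock[p.name]
            gain = math.log(fill_probability(p, s + 1)) - math.log(fill_probability(p, s))
            return gain / p.unit_cost

        best = max(affordable, key=index)
        stock[best.name] += 1
        spent += best.unit_cost
    return stock, spent, readiness(parts, stock)


parts = [Part("part_a", 1.0, 1_000.0), Part("part_b", 0.5, 5_000.0)]
stock, spent, achieved = spare_to_target(parts, target=0.90, budget=20_000.0)
print(stock, spent, round(achieved, 3))
```

The loop stops as soon as the target is met or no further spare fits within the budget, mirroring the two stopping conditions described above.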
The milestones and program phases for military acquisition are illustrated in Figure 1.5-2. The
figure is an adaptation of the model published in the Department of Defense 5000 series in October
2000. This general framework will be used throughout this handbook. Generally, all complex
system acquisition programs will follow a similar sequence of design, production, deployment, and
sustainment phases.
Figure 1.5-2. Acquisition Model Based on DoD 5000 Series Dated October 2000 (Note: The
5000 series were in revision when this document was written. However, with some mapping, the
tasks and events described herein are still applicable.)