Apollo Program: Procedure for Failure Mode, Effects, and Criticality Analysis (FMECA)
APOLLO PROGRAM
PROCEDURE
FOR
FAILURE MODE, EFFECTS, AND
CRITICALITY ANALYSIS
(FMECA)
AUGUST 1966
REPRODUCED BY
NATIONAL TECHNICAL
INFORMATION SERVICE
U.S. DEPARTMENT OF COMMERCE
SPRINGFIELD, VA. 22161
THIS DOCUMENT HAS BEEN REPRODUCED
FROM THE BEST COPY FURNISHED US BY
THE SPONSORING AGENCY. ALTHOUGH IT
IS RECOGNIZED THAT CERTAIN PORTIONS
ARE ILLEGIBLE, IT IS BEING RELEASED
IN THE INTEREST OF MAKING AVAILABLE
AS MUCH INFORMATION AS POSSIBLE.
RA-006-013-1A
PROCEDURE FOR
FAILURE MODE, EFFECTS, AND CRITICALITY ANALYSIS
(FMECA)
August 1966
Prepared by
Apollo Reliability and Quality Assurance Office
National Aeronautics and Space Administration
Washington, D.C. 20546
PREFACE
This document is an official release of the Apollo Program Office. Many of the
procedures and methods are already being carried out. The extent to which this
guideline should be implemented at the present stage of program maturity should
be evaluated by comparing the benefits to be derived therefrom with the problems
of implementation, including cost.
The principal criterion in judging the value of applying all the procedures of this
guideline is the need for these procedures to accomplish identification and
ranking of potential failures critical to hardware performance and crew safety. Other
considerations, such as design/development testing, noncriticality of the
equipment to system operational success, past experience, and reliability analyses,
may preclude the need to perform all the procedures of this guideline.
Willoughby
Acting Director
SECTION 1-INTRODUCTION
INTRODUCTION
1.1 PURPOSE
1.2 SCOPE
This document is applicable to all NASA activities with cognizance over design,
development, and test of Apollo flight, ground, and related equipment which have
major impact on mission success. It may be invoked in equipment contracts in
whole or in part, where design or development is involved, as a portion of the
reliability engineering and as the guideline for carrying out the activity, predicated
on budget considerations, equipment criticality, schedules, and other factors.
The ground rules for the use of FMECA may call for substitute overstress tests
on structural parts or for other design/development tests of the system in place
of the FMECA, or these rules may not require an FMECA on those parts of the
system that are established by preliminary FMECA to be noncritical to system
operational success.
1.6 FMECA RELATION TO THE RELIABILITY PREDICTION, ASSESSMENT
AND CREW SAFETY MODELS
FMECA provides quick visibility of the more obvious reliability problems ranked
according to their importance to system operational success. Changes made in
the system to remove or reduce these more obvious reliability problems will
usually restructure major parts of the system. This will make the more detailed
analysis of the reliability models an inefficient process for upgrading system re-
liability during the early stages of design when changes are being made rapidly;
hence, the FMECA is particularly appropriate during this period. The FMECA
should be reviewed by the designer on a timely basis.
FMECA is performed in two basic steps: (1) Failure Mode and Effects Analysis
(FMEA) and (2) Criticality Analysis (CA). The combination of these two steps
provides (3) Failure Mode, Effects, and Criticality Analysis (FMECA). Section 2
provides step-by-step procedures for FMEA; Section 3 provides step-by-step
procedures for CA; and Section 4 combines the FMEA and CA into the FMECA.
SECTION 2
2.1.1 ACCOMPLISHMENT
The FMEA assumes that only the failure under consideration has
occurred. When redundancy or other means have been provided in
the system to prevent undesired effects of a particular failure, the
redundant element is considered operational and the failure effects
terminate at this point in the system. When the effects of a failure
propagate to the top level of a system and cause the system to fail,
the failure is defined as a critical failure in the system.
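The single-failure assumption described above can be sketched in a few lines of Python. This is only an illustration, not part of the procedure: the `Block` class, the `required` count, and the pump and valve names are all hypothetical. Each block works if at least `required` of its children work, so a redundant pair (`required=1`) stops a failure from propagating, while a series arrangement (`required` equal to the number of children) lets it through to the top level.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A node in a hypothetical reliability logic block diagram."""
    name: str
    # Blocks this block depends on; empty for a leaf component.
    children: list = field(default_factory=list)
    # How many children must work for this block to work.
    # len(children) models a series chain; 1 models full redundancy.
    required: int = 0


def works(block, failed):
    """True if `block` still functions when only the leaf `failed` is down."""
    if not block.children:
        return block.name != failed
    working = sum(works(child, failed) for child in block.children)
    return working >= block.required


def leaves(block):
    """Names of all leaf components under `block`."""
    if not block.children:
        return [block.name]
    names = []
    for child in block.children:
        names.extend(leaves(child))
    return names


def critical_failures(system):
    """Leaf failures whose effects propagate to the top level of `system`."""
    return [name for name in leaves(system) if not works(system, name)]


# Hypothetical stage: two redundant pumps in series with one valve.
pumps = Block("pumps", [Block("pump_a"), Block("pump_b")], required=1)
stage = Block("stage", [pumps, Block("valve")], required=2)
print(critical_failures(stage))  # ['valve']
```

Either pump can fail alone without failing the stage, so only the valve is reported as a critical failure under the single-failure assumption.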
2.1.2 DOCUMENTATION
To define what constitutes and contributes to the various types of system failure,
the technical development plans for the system should be studied. The plans will
normally state the system objectives and specify design requirements for
operations, maintenance, test, and activation. Detailed information in the plans will
normally provide a mission or operational profile and a functional flow block
diagram showing the gross functions that the system must perform. Time diagrams
and charts used to describe system functional sequence will aid the analyst to
determine the time feasibility of various means of failure detection and correction
in the operating system. Also required is a definition of the operational and
environmental stresses that the system is expected to undergo and a list of the
acceptable conditions of functional failure under these stresses.
To determine the possible and more probable failure modes and causes in the
system, trade-off study reports should identify the areas of marginal design and
should explain the design compromises and operating conditions agreed upon.
The descriptions and specifications of the system's internal and interface
functions, starting at the highest system level and progressing to the lowest level of
system development to be analyzed, are required for construction of the FMEA
reliability logic block diagrams. A reliability logic block diagram as used in the
FMEA and as described in paragraph 2.1.1.6 shows the functional
interdependencies within the system and permits the effects of a failure to be traced. System
descriptions and specifications usually include either or both functional and
equipment block diagrams that facilitate the construction of the reliability logic block
diagrams required for the FMEA. In addition, the system descriptions and
specifications give the limits of acceptable performance under specified operating
and environmental conditions.
2.1.2.4 Equipment Design Data and Drawings
Equipment design data and drawings identify the equipment configuration perform-
ing each of the system functions.
Tests run on the specific equipment under the identical conditions of use are
desired. When such test data are not available, the analyst should collect and analyze
the data obtained from studies and tests performed during current and past
programs on equipment similar to those in the system and under similar use conditions.
The next step of the FMEA procedure is the construction of a reliability logic
block diagram of the system to be analyzed. The general reliability logic block
diagram scheme for a system is shown in Figure 2-1. This example system is
for a space vehicle stage, and the notes given explain the functional dependencies
of the stage components.
A system component at any level in the stage system may be treated as a system
and may be diagrammed in like manner for failure mode and effects analysis.
The results of the component's FMEA would define the failure modes critical to
the component's operation, i.e., those that cause loss of component inputs or
outputs. These failure modes will then be used to accomplish the FMEA at the
next higher system level. This procedure ultimately leads to an FMEA for the
stage, the space vehicle, and space system.
All system redundancies or other means for preventing failure effects are shown
in the reliability logic block diagram. This is because in single failure analysis,
when a means exists to prevent the effects of a failure, the failure cannot be
critical above the system level where the preventive means is effective.
The FMEA and its documentation are the next steps of the procedure. These are
accomplished by completing the columns of an FMEA format similar to that given
in Figure 2-2 as follows:
Column
Number   Explanation or Description of Entries
SECTION 3
a. Identify critical failure modes of all components in the FMEA for each
equipment configuration in accordance with the categories listed in
paragraph 3.2. For FMEA's of lower level systems where the effect
of failure modes on mission success or crew safety cannot be
determined, the critical failure modes will be those that cause failure of
one or more of the system's inputs or outputs.
b. Compute Criticality Numbers (Cr) for each system component with
critical failure modes. The method is given in paragraph 3.3, and a
format for the data is shown in Figure 3-1.
The first step of CA is the identification of critical failure modes from the FMEA's
on the system.
Critical failure modes at higher levels in the overall space system should be
identified according to approved nonambiguous loss statements. The follow-
ing categories, according to Reference 5, Appendix A, paragraph 3.3.3, may
be used:
Category 1-Hardware, failure of which results in loss of life of any crew
member. This includes normally passive systems, i.e., emergency
detection system, launch escape system, etc.
Category 2-Hardware, failure of which results in abort of mission but does not
cause loss of life.
Category 3-Hardware, failure of which will not result in abort of mission nor
cause loss of life.
Category B-Hardware, failure of which results in abort of mission but does not
cause loss of life.
Category C-Hardware, failure of which will not result in abort of mission nor
cause loss of life.
At the lower system level where it is not possible to identify critical failure modes
according to loss statements under the six categories above, approved loss
statements based upon loss of system inputs or outputs should be used (see
paragraph 3.1a). Kennedy Space Center loss statements can be found in Reference 9
of Appendix A. Marshall Space Flight Center loss statements can be found in
Reference 8 of Appendix A.
The loss statement used to identify a critical failure mode in a system should be
prefixed with the word "actual," "probable," "possible," or "none," which
represents the analyst's judgment as to the conditional probability that the loss will
occur given that the failure mode has occurred.
The second step of the CA procedure is the calculation of Criticality Numbers (Cr)
for the system components with critical failure modes.
For a particular loss statement and mission phase, the Cr for a system
component with critical failure modes is calculated with the following formula:
$$C_r = \sum_{n=1}^{j} \left(\beta \alpha K_E K_A \lambda_G t \cdot 10^6\right)_n, \qquad n = 1, 2, 3, \ldots, j$$
where:
KE = Environmental factor which adjusts λG for difference between
environmental stresses when λG was measured and the environmental
stresses under which the component is going to be used.
The expression (β α KE KA λG t · 10^6) is the portion of Cr for the component due to
one of its critical failure modes under a particular loss statement. After
calculation of the part of Cr due to each of the component's critical failure modes under
the loss statement, these parts are summed for all critical failure modes, as
indicated by the summation from n = 1 to j.
A failure mode failure rate is represented in the formula for Cr by the product of
the terms α, KE, KA, and λG. These terms should be replaced by actual failure
mode failure rates determined from the test program as they become available.
A sample calculation is given on the following page.
3.3.1 Cr CALCULATION EXAMPLE
Given: System component with λG = 0.05 failures per 10^6 operating hours,
KA = 10, KE = 50,
α = 0.30 for one critical failure mode under loss statement, and
α = 0.20 for the second critical failure mode under the same loss
statement.
Solution:
j = 2 and
$$C_r = \sum_{n=1}^{2} \left(\beta \alpha K_E K_A \lambda_G t \cdot 10^6\right)_n = 38 + 25 = 63$$
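The calculation above can be checked with a short Python sketch. The function name and argument layout are illustrative only. The values of β and t are not stated in the given data here, so the sketch assumes β = 1.0 and t = 5 hours purely because a product β·t = 5 reproduces the two stated terms (37.5, which rounds to 38, and 25); any other pair with the same product gives the same result.

```python
def criticality_number(modes, k_e, k_a, lambda_g, t):
    """Sum beta * alpha * K_E * K_A * lambda_G * t * 1e6 over critical failure modes.

    modes    -- list of (beta, alpha) pairs, one per critical failure mode
    lambda_g -- generic failure rate in failures per hour
    t        -- operating time in hours
    """
    return sum(beta * alpha * k_e * k_a * lambda_g * t * 1e6
               for beta, alpha in modes)


lambda_g = 0.05e-6  # 0.05 failures per 10^6 operating hours (from the example)
k_a, k_e = 10, 50   # from the example
beta, t = 1.0, 5.0  # ASSUMED: not given in the text; chosen so that beta * t = 5

parts = [criticality_number([(beta, alpha)], k_e, k_a, lambda_g, t)
         for alpha in (0.30, 0.20)]
total = criticality_number([(beta, 0.30), (beta, 0.20)], k_e, k_a, lambda_g, t)
print(parts, total)
```

Under these assumptions the two portions come to about 37.5 and 25, for a total of about 62.5, which agrees with the rounded figures 38 + 25 = 63 in the example.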
The columns of the format for Cr calculations shown in Figure 3-1 should be filled
out as follows:
Column
Number   Explanation or Description of Entries
(10) - (16) Enter the information required for the calculation of the
portion of the component's criticality number due to
each of its critical failure modes.
The procedure is a method for combining the criticality values by mission phase
to develop an overall summary.
The FMECA summary is developed from the FMEA and CA analyses
discussed in Sections 2 and 3 and is prepared by completing a form
similar to that given in Figure 4-1. Instructions for completing the form are
given below.
Column
Number   Explanation or Description of Entries
(10) - (12) Where the critical failure mode has an effect during
Phase 2 of the mission, columns (10)-(12) are completed
in the same manner as in columns (7)-(9). This format
should be extended to include all mission phases.
The last step of the FMECA is the preparation of the criticality list. Critical
system components are grouped according to loss statement and are listed in the
groups in descending order according to magnitude of their total criticality number.
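The grouping and ordering of the criticality list can be sketched as follows. This is a minimal illustration, not a prescribed format: the component names, loss statements, and Cr values are hypothetical, and the sketch assumes that a component's total criticality number is the sum of its Cr values over all mission phases.

```python
from collections import defaultdict


def criticality_list(entries):
    """Group components by loss statement, ordered by descending total Cr.

    entries -- iterable of (component, loss_statement, cr_by_phase), where
               cr_by_phase maps a mission phase number to that phase's Cr.
    """
    groups = defaultdict(list)
    for component, loss_statement, cr_by_phase in entries:
        # ASSUMED: total criticality number = sum of Cr over mission phases.
        total = sum(cr_by_phase.values())
        groups[loss_statement].append((component, total))
    return {
        loss: sorted(items, key=lambda item: item[1], reverse=True)
        for loss, items in groups.items()
    }


# Hypothetical data for illustration only.
entries = [
    ("valve", "actual loss of stage thrust", {1: 10.0, 2: 5.0}),
    ("pump", "actual loss of stage thrust", {1: 40.0, 2: 23.0}),
    ("sensor", "probable loss of telemetry", {1: 2.0}),
]
result = criticality_list(entries)
for loss, items in result.items():
    print(loss, items)
```

Within the "actual loss of stage thrust" group the pump (total 63.0) is listed ahead of the valve (total 15.0), matching the descending-order rule.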
REFERENCE DOCUMENTS
2. NASA Quality Publication NPC 200-2, April 1962, "Quality Program
Provisions for Space System Contractors," paragraph 4.3.1.
4. NASA Publication NHB 5300.1, October 1965, "Apollo Reliability and
Quality Assurance Program Plan," paragraphs 2.2.3.d, 2.2.4.8, 4.1.a,
4.2.b.(5), 4.7, 5.2.2, 5.2.4, 5.3.1, 5.4, and 5.5.
10. NASA Kennedy Space Center document TR-4-49-3-D, Revised 1 July 1964,
"Determination of Criticality Numbers for Saturn I, Block Vehicle Ground
Equipment (Launch Complex 37B)."
DEFINITIONS
APOLLO-A term generally used to describe the NASA Manned Lunar Landing
Program but specifically used to describe the effort devoted to the development,
test, and operation of the space vehicle for long duration, Earth orbit,
circumlunar, and lunar landing flights.
launch of a large or complicated rocket vehicle, or in leading up to a captive test,
a readiness firing, a mock firing, or other firing test. 2. The act of counting
inversely during this process.
In sense 2, the countdown ends with T-time; thus, T minus 60 minutes indicates
there are 60 minutes left except for holds and recycling. The countdown may be
hours, minutes, or seconds. At the end, it narrows down to seconds, 4-3-2-1-0.
CREW-A group of ground and flight specialists who perform simultaneous and se-
quential duties and tasks involved in the accomplishment of an assigned operation.
CREW SAFETY-Safe return of all three flight crew members whether or not the
mission is completed.
CRITICAL DEFECT-A defect that judgment and experience indicate could result
in hazardous or unsafe conditions for individuals using or maintaining the product
or could result in failure in accomplishment of the ultimate objective.
CRITICALITY PARTS LIST-A listing of those parts whose failure would cause a
degradation in mission success or crew safety.
DESIGN REVIEW-A progressive review, starting after the design study and con-
tinuing through the prototype stage. Provides an assessment of reliability and
reliability trends by use of applicable tests and prediction techniques.
FLIGHT CREW-The Apollo flight crew consists of three men who are
cross-trained to be capable of manning any of the Command Module (CM) duty stations.
The three crewmen are designated commander, navigator, and systems manager.
The CM commander is also the Lunar Excursion Module (LEM) commander.
The GSE is not considered to include land or buildings; nor does it include the
guidance-station equipment itself, but it does include the test and checkout
equipment required for operation of the guidance-station equipment.
HOLD-During a countdown, to stop counting and to wait until an impediment has
been removed so that the countdown can be resumed, as in T minus 40 and holding.
MAINTENANCE-The function of retaining material in or restoring it to a
serviceable condition.
PART-1. One of the constituents into which a thing may be divided. Applicable
to a major assembly, subassembly, or the smallest individual piece in a given
thing. 2. Restrictive. The least subdivision of a thing; a piece that functions
in interaction with other elements of a thing but is itself not ordinarily subject
to disassembly.
PRELAUNCH-The phase of operations, beginning with the arrival of space vehicle
elements at the launch site and ending with the start of the launch countdown.
Parallel redundancy applies to systems where both means are working at the same
time to accomplish the task and when either of the systems is capable of handling
the job itself in case of failure of the other system. Standby redundancy applies to
a system where there is an alternative means of accomplishing the task that is
switched in by a malfunction sensing device when the primary system fails.
RELIABILITY-Of a piece of equipment or a system, the probability of specified
performance for a given period of time when used in the specified manner.
SYSTEM-1. Any organized arrangement in which each component part acts,
reacts, or interacts in accordance with an overall design inherent in the
arrangement. 2. Specifically, a major component of a given vehicle such as a propulsion
system or a guidance system. Usually called a major system to distinguish it
from the systems subordinate or auxiliary to it.
In sense 2, the system embraces all its own subsystems including checkout
equipment, servicing equipment, and associated technicians and attendants. When the
term is preceded by such designating nouns as propulsion or guidance, it clearly
refers to a major component of the missile. Without the designating noun, the
term may become ambiguous. When modified by the word major, however, it
loses its ambiguity and refers to a major component of the missile.