Evaluating The Effects of Aging On Electronic Instrument and Control Circuit Boards and Components in Nuclear Power Plants
Technical Report
1011709
Electric Power Research Institute • 3412 Hillview Avenue, Palo Alto, California 94304 • PO Box 10412, Palo Alto, California 94303 • USA
800.313.3774 • 650.855.2121 • askepri@epri.com • www.epri.com
DISCLAIMER OF WARRANTIES AND LIMITATION OF LIABILITIES
THIS DOCUMENT WAS PREPARED BY THE ORGANIZATION(S) NAMED BELOW AS AN
ACCOUNT OF WORK SPONSORED OR COSPONSORED BY THE ELECTRIC POWER RESEARCH
INSTITUTE, INC. (EPRI). NEITHER EPRI, ANY MEMBER OF EPRI, ANY COSPONSOR, THE
ORGANIZATION(S) BELOW, NOR ANY PERSON ACTING ON BEHALF OF ANY OF THEM:
ORDERING INFORMATION
Requests for copies of this report should be directed to EPRI Orders and Conferences, 1355 Willow
Way, Suite 278, Concord, CA 94520, (800) 313-3774, press 2 or internally x5379, (925) 609-9169,
(925) 609-1310 (fax).
Electric Power Research Institute and EPRI are registered service marks of the Electric Power
Research Institute, Inc.
Copyright © 2005 Electric Power Research Institute, Inc. All rights reserved.
CITATIONS
This report describes research sponsored by the Electric Power Research Institute (EPRI) and the
U.S. Department of Energy (Award No. DE-FC07-03ID14536, Task #6).
The report is a corporate document that should be cited in the literature in the following manner:
Evaluating the Effects of Aging on Electronic Instrument and Control Circuit Boards and
Components in Nuclear Power Plants. EPRI, Palo Alto, CA, and U.S. Department of Energy,
Washington, DC: 2005. 1011709.
REPORT SUMMARY
Circuit boards used in the electronic instrument and control (I&C) systems of nuclear power
plants may suffer from aging failures that can cause a plant trip or unavailability of plant
systems. The overall objective of this study was to determine how precursors of failures in I&C
circuit boards can be measured and how these measures can be used to estimate the probability
of failure during the next operational period within a statistical confidence level. The study
provides a framework for the identification of techniques that can be used to monitor circuit
board component aging failure modes that could lead to a failure of the circuit.
Background
The nuclear power industry is currently facing increasing obsolescence issues with original
equipment installed for instrumentation, control, and safety system applications. These systems,
frequently more than thirty years old, are experiencing aging-induced failures in electronic
boards and components. These failures can cause plant trips and reduce the reliability and
availability of systems. Most plants adopt a policy of run-to-failure and/or periodic
replacement, frequently without a good technical basis. Both of these approaches can be very
costly. The industry needs a better understanding of the aging mechanisms and observable
precursors to failure along with more cost-effective aging inspection, mitigation, and other aging
management technologies.
Objectives
• To classify the failures of electronic boards and components used in I&C systems according
to the type of measurable failure mode conditions
• To apply reliability criteria to the components to understand the likelihood of failure using
existing data sources
• To propose potential measurement tools and models of failure for input into a condition
monitoring and operational assessment process for predicting I&C board failures.
Approach
The project team reviewed the failure modes identified in EPRI reports 1003568 “Collected
Field Data on Electronic Part Failures and Aging in Nuclear Power Plant I&C Systems” and
1008166 “Guidelines for the Monitoring of I&C Electronic Components,” as supplemented by
technical papers, IEEE reliability meetings, and contacts with utility personnel. The team used data
on failure modes from Military Handbook 217F to define the likelihood of failure for I&C
electronic components. The team defined monitoring methods and techniques based on methods
currently in use and on methods that laboratories have applied to circuit board testing,
computer hardware testing, software verification, and database integrity assurance.
Results
The report describes potentially useful techniques for monitoring the aging of I&C boards. The
techniques have been grouped into six methods: periodic testing, reliability modeling, resistance
measures, signal comparison, external (passive) measures, and internal (active) measures, each
representing distinct theoretical approaches to detection and evaluation. Each technique has
significant advantages and disadvantages. The design of hardware and software monitoring
systems increases in complexity as the methods become more precise in their ability to measure
aging factors, but the technical tools that can be applied to monitoring within the methods have
also clearly improved within the last few years as computers and networks have been enhanced
to rapidly process large amounts of data.
The report provides a decision process for selecting those circuits and components that could
benefit from an upgraded approach for monitoring the effects of aging and highlights areas
where future R&D is needed to establish firm recommendations for I&C systems. The report
also assesses the relative costs and technical benefits of upgrading circuit-monitoring systems.
EPRI Perspective
As the nuclear power industry is facing increasing aging and obsolescence issues, one area that
needs attention is the aging of electronic boards and components used in I&C systems. Existing
methods of functional testing of I&C systems typically detect circuit failures after they occur,
whereas the new monitoring techniques provide indications of failure while the circuit is still
functional. This information will make it possible to maximize the operating life of components
without suffering circuit failure.
This report presents a number of specific techniques for improving the ability to monitor
aging-induced changes in circuits and board components that could lead to board failure. Some
promising techniques are discussed that have been used in applications outside of electronic
circuit board monitoring. Additional R&D efforts are needed to test, confirm, and demonstrate
the viability of circuit board monitoring techniques for use as a predictive tool to detect
aging-induced changes that can lead to circuit failure. Additional engineering studies need to be
completed to better quantify the implementation and operational costs and benefits of the viable
techniques and to provide sufficient justification for their implementation.
Keywords
Instrumentation and control systems
Electronic boards and components
Electronic board aging
Electronic components aging
Circuit boards
Aging management
Reliability
Predictive maintenance
ABSTRACT
This report addresses understanding the effects of aging on electronic instrument and control
(I&C) circuit boards in nuclear power plants. The issue is that circuit boards used in I&C
systems may suffer from aging failures that can cause a plant trip or unavailability of plant
systems. The overall objective is to determine how precursors of failures in I&C circuit boards
can be measured, and how these measures can be used to estimate the probability of failure
during the next operational period within a statistical confidence level. This initial study
provides a framework for identification of techniques that can be used to monitor circuit board
component aging failure modes that could lead to a failure of the circuit. This study has three
tasks:
1. Provide (1) a review of information on circuit board aging failure descriptions, and (2) an
identification of reliability data for I&C component failures.
2. Propose and review six uniquely different methods and associated techniques that could be
applied to measure and predict the effects of aging within I&C boards and circuits.
3. Provide a systematic framework for deciding how to select circuit boards as well as methods
for improving the process for monitoring aging effects of circuit boards in nuclear power
plants. The framework includes a relative cost-benefit assessment for each technique.
Many causes of I&C circuit board failure progress slowly. This opens the possibility for
measuring the impacts of aging progression prior to complete failure. Measures of changes in
electrical characteristics provide a basis for estimating the probability of failure during the next
operational period. Simulation of the aging process can be used to produce a statistical
confidence in the probability estimate. Such information can be used to support optimized
maintenance planning and decisions.
The ideal result of this project is to define a framework for selecting techniques for aging
management that can be applied to any circuit. The techniques should be easy to use and
account for various modes of circuit component aging. This is an initial study to examine a wide
range of issues and approaches for aging management in electronic systems. The existing state
of the I&C systems is that most nuclear power plants were designed using analog control circuits
and relay controlled safety circuits. Hardwired electric relays have become obsolete, as
electronic circuits rely on integrated circuits and software controls to accomplish the same
functions. Current technology permits software logic to replace relay logic and analog controls
to activate safety and control circuits. As a result, circuit boards can become obsolete in less than
a decade. As this older equipment fails, spare parts inventories become depleted and
failures cannot be easily repaired, so there is an increasing need for older-technology I&C
systems to be upgraded or replaced. Even new solid-state integrated circuits and systems can
become obsolete in a few years. Older nuclear power plants are in various stages of identifying
obsolescence, upgrading and replacing I&C systems. As a result the key control and protection
systems in nuclear plants use a range of technologies in different plants, and even within the
same plant. Thus, the state of I&C technology within plants may be considered as fragmented
with respect to replacement components and the amount of digital versus analog I&C systems.
In an absolute sense the rate of failure and replacement for electronic circuit boards may not be
considered high. However, from a regulatory point of view, and also relative to other major
plant systems, the I&C system repair and replacement rates are relatively high. For example,
individual circuit boards are typically repaired or replaced several times during the life of a plant.
Because boards are replaced this often, they are of low concern as an aging
issue in plant license extension, whereas insulation on the wires connecting the I&C systems is a
required aging issue that must be addressed in license extension applications.
Reviewing the descriptions of aging failures in circuit boards provides some very valuable
insights. For example, a conclusion from review of information from EPRI (EPRI 2002, EPRI
2003 and EPRI 2004) is that the failure end state of most electronic components is either an open
or short circuit. These findings help simplify the design of potential measurement systems by
permitting the monitoring of each board circuit to be treated as an equivalent circuit with
measurable electronic parameters such as voltage, impedance, resistance, current, and ground
resistance. Changes in these parameters become precursor indications of degradation that could
lead to a complete failure.
Aging-induced failures (due to temperature, operating stress, quality of components, corrosion,
and environment) progress slowly, and many intermediate states of partial failure exist. Changes in the
electrical parameter signals from the circuit can be measured before an inoperable condition is
reached. In the case of a rapid shock-induced failure (e.g., high voltage spikes, rapid corrosion, or
high temperature effects from fire), the time between the triggering condition and the
component failure would be too short for corrective action to be taken before a failure.
A variety of technology and software methods can be used to develop improved monitoring,
including continuous circuit monitoring, and active as well as passive testing approaches.
1. The EPRI reports developed by EdF (EPRI 2002 and EPRI 2004) show the impact of
component failures involving capacitance and inductance that are sensitive to frequency
variation tests.
2. Simple voltage tests could easily identify circuits that are drifting toward shorts or open
circuits.
Replacement of an aging component with a spare board may not always be possible, because of
obsolescence. If this is the case, then the ability to locate the failed component supports the
process of replacing individual components on a board when replacement circuit boards are not
available. This circuit board reconditioning can be enhanced by early identification of precursor
failures.
Considerable failure data are available for electronic components. These data are compiled as
failures per unit time for like components. The grouping process reduces the details of the
failure modes in MIL-HDBK-217F (i.e., descriptions of the failure modes typically found in event
data and root cause analysis are subsumed into the term “failure” so that the detail is lost). The
MIL-HDBK-217F data can be assigned with some judgment to the individual board components.
Since a board or circuit contains a combination of many components, board failure rates can be
approximated by a sum of failure rates for the components following the recommended
methodology in MIL-HDBK-217F.
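The summation described above can be sketched as follows. This is a minimal illustration that treats the board as a series system; the part types, quantities, and failure rates are hypothetical placeholders and are not values taken from MIL-HDBK-217F or from this report.

```python
# Parts-count sketch: a board is treated as a series system, so the board
# failure rate is approximated by the sum of its component failure rates.
# All numeric rates below are hypothetical placeholders, NOT handbook values.

HOURS_PER_YEAR = 8760.0


def board_failure_rate(parts):
    """Sum quantity * rate over all part types (failures per 1e6 hours)."""
    return sum(qty * rate for qty, rate in parts.values())


def mtbf_years(rate_per_million_hours):
    """MTBF in years for a constant failure rate given per 1e6 hours."""
    return 1.0e6 / rate_per_million_hours / HOURS_PER_YEAR


# part type -> (quantity on board, assumed failures per 1e6 hours per part)
example_parts = {
    "resistor": (40, 0.01),
    "electrolytic_capacitor": (10, 0.10),
    "integrated_circuit": (8, 0.20),
}

board_rate = board_failure_rate(example_parts)
print(f"board failure rate: {board_rate:.2f} per 1e6 h")
print(f"board MTBF: {mtbf_years(board_rate):.1f} years")
```

With these assumed inputs the few higher-rate part types dominate the board total, which is one reason component-level data can help focus monitoring on a small number of parts.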
It should be noted that information about aging failures of individual components of circuit
boards has been presented previously, and therefore is not repeated in this report. References
such as EPRI 2002 and EPRI 2004 identified the families of components that are of greatest
concern for aging and failure, described component technologies typically found in nuclear plant
circuit boards, and presented field data from a number of plants. These field data included
failure analysis reports and visual inspections of boards and components. Root causes of failures
are also presented, along with photos in many cases, to provide examples of various failure
types. Furthermore, the reports describe postulated aging mechanisms and identify early aging
indicators, which may be measured in a simple manner. Therefore, any reader who wishes to
review detailed examples of various types of board and component failures should refer to these
prior reports.
Potentially useful techniques for monitoring the aging of I&C boards are presented in Section 4.
The techniques have been grouped into six methods of periodic testing, reliability modeling,
resistance measures, signal comparison, external (passive) measures, and internal (active)
measures, representing unique theoretical approaches for detection and evaluation. The technical
tools that can be applied to monitoring within the methods have clearly improved within the last
few years as computers and networks have been enhanced to rapidly process large amounts of
data. Each technique has significant advantages and disadvantages. Human inspections can
detect a surprisingly large group of aging issues, but unfortunately, most likely after the failure
occurs. As the methods become more precise in their ability to measure aging factors, the design
of the hardware and software monitoring system increases in complexity.
Section 5 provides a decision process for selecting those circuits and components that could
benefit from an upgraded approach for monitoring the effects of aging. This investigation of
providing a systematic decision process has highlighted areas where responses to the decision
element questions are not yet precise enough to establish firm recommendations for systems.
Rather, technical questions that should be answered have been identified for future R&D. To
develop a cost effective program for upgrading the circuit monitoring systems, the relative costs
and technical benefits are assessed to help refine the recommendations for the next steps. These
are provided in Section 6.
I&C maintenance personnel and managers can benefit from this report. The business objective
for developing I&C board monitoring systems includes identification of circuit board aging
degradation before board failure, which could cause an unplanned outage or power reduction.
The value of upgrading the existing functional testing process, with techniques for monitoring
the aging effects of components within the circuit, is expected to become a higher priority with
increasing circuit failures due to aging. The existing methods typically detect circuit failures
after they occur whereas the new monitoring techniques provide indications of failure while the
circuit is still functional. This will give operators information upon which to maximize the
operating life of components without suffering circuit failure.
Most monitoring techniques have been applied to mechanical equipment or to a process, but not
to circuits that control the mechanical equipment or process. Thus, a need exists for integrating
the hardware and software components necessary to monitor aging of components within various
circuit types according to the use of a specific technique.
Once the system operation of one or more circuit monitoring techniques is developed and
demonstrated, it will find wide application in power plants for circuits that are classified as
important to plant operation.
ACRONYMS
PM—Preventive Maintenance
POF—Physics of failure
PSAs— Probabilistic Safety Assessments
PTH —Pin through Hole (PTH) solder joints
R&D —Research and Development
RAM —Random Access Memory
ROM —Read-Only Memory
RTD—Resistance Temperature Detectors
SCSI —Small Computer System Interface
SMT —Surface Mount Technology (SMT) solder joints
SRAM —Static RAM
SSCs— Structures Systems or Components
SSR—Solid-State Relay
US —United States
CONTENTS
1 INTRODUCTION
Bathtub Curve
Foundations for Addressing Aging
Application of a Periodic Test Interval
Reliability Modeling
Condition Monitoring and Operational Assessment
Continuous Monitoring
Framework for Improved Aging Monitoring
3 RELIABILITY DATA
Component Specific Failure Data
Event Data Sources
Failure Rate Equations
Example Application
Method 3 Measures of Resistance
Ground Resistance Testing
Leakage Current Testing
Advantages:
Disadvantages:
Method 4 Signal Comparison Measures
Advantages:
Disadvantages:
Method 5 Passive Measurement Systems
Circuit Parameter Measures
Environmental Parameter Measures
Advantages:
Disadvantages:
Method 6 Active Measurement System
Current or Voltage Change
Signature Analysis
Pattern Recognition
Frequency Analysis
Advantages
Disadvantages
Review of Methods and Techniques
Control Circuits
Protection Circuits
Decision Element 3: Detectability
Detection Methods
Monitoring Approaches
Decision Element 4: Predictability
Reliability and Physics of Failure Models
Statistical Correlations
Synopsis
Decision Element 5: Repairability
Existing Repair or Replacement Process
Improved Repair or Replacement Process
Decision Mapping Logic
Summary of Decision Process Considerations for Utilities
Importance
Observability
Detectability
Predictability
Repairability
Illustration of Decision Process
Resistance Components
Capacitor Components
Integrated Circuits
Signal Change Testing
Signal Comparison Measures
Signature Analysis Testing
Pattern Recognition and Frequency Analysis Testing
7 FINDINGS
Summary
Recommendations
8 REFERENCES
1
INTRODUCTION
An important metric commonly used to measure and specify the lifetime of electronic
components and circuit boards is the mean time between failures (MTBF), the average
operating time between failures for a population of like devices. The MTBF is a function
of the failure rate of the circuit board and the components on it.
Bathtub Curve
The failure rate for most modern electronic components has a distinctive "bathtub" curve that
represents their failure characteristics (Kumamoto and Henley 1996, Wowk 1991, and Ireson and
Coombs 1988). The bathtub curve in Figure 1-1 provides a means for discussion of the
characteristics of the statistical failure rate during three phases of the component life: burn-in,
useful life, and aging-dominated.
Figure 1-1
Example Bathtub Failure Rate Curve (failure rate versus time; labeled regions: constant failure rate, aging failure rate)
As shown in Figure 1-1, during the early life of the component (referred to as the burn-in phase),
it is more likely to fail due to initial manufacturing defects and damage introduced during
assembly and testing. The initial testing of electronic components uses high temperatures to act
as a time accelerator to verify limits of failure conditions and eliminate obvious defects in the
subsequent manufactured devices.
Once this initial burn-in phase is over, through factory tests and initial testing at the site, a
device's overall failure rate typically remains quite low for a number of years. This
useful life is expected to last more than ten years for electronic devices built in the 1980s and
operated within specified limits for the entire time period.
The useful life ends when the failure rate increases due to age-related failures. Examples of
age-related failures include insulation breakdown, increases in current leakage, loss of resistance and
loss of capacitance. Aging is impacted by long term stress from voltage differential, voltage
cycles on specific components, and other factors.
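The three phases described above are often illustrated by adding a decreasing hazard, a constant hazard, and an increasing hazard. The sketch below does this with two Weibull terms and a constant term. It is an illustration only: the functional form and every parameter value are assumptions chosen to produce a bathtub shape, not data from this report.

```python
# Illustrative bathtub hazard: sum of a decreasing Weibull hazard (burn-in),
# a constant hazard (useful life), and an increasing Weibull hazard (aging).
# Every parameter value here is an assumption chosen only to shape the curve.


def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def bathtub_hazard(t_years):
    """Failure rate (per year) at age t_years > 0."""
    burn_in = weibull_hazard(t_years, shape=0.5, scale=50.0)  # decreasing
    useful_life = 0.02                                        # constant
    aging = weibull_hazard(t_years, shape=5.0, scale=30.0)    # increasing
    return burn_in + useful_life + aging


for age in (0.1, 1.0, 10.0, 25.0, 35.0):
    print(f"age {age:5.1f} y  failure rate {bathtub_hazard(age):.4f} per year")
```

With these assumed parameters the printed rate is high at very early ages, lowest in mid-life, and rising again at late ages, which mirrors the qualitative shape of Figure 1-1.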
For the purposes of this document, it is assumed that the burn-in failure period has passed and
that components on the circuit boards are either in the constant failure rate period or transitioning
from the useful lifetime into a period of increasing failure rate due to aging. The impact of
failures during these periods is that circuits controlling the operation of equipment in the plant
can fail in ways that cause normally operating equipment to operate spuriously. For
example, controlled equipment such as a turbine generator could reduce power or go off
line without advance notice. In the case of safety systems, the actuation signal might become
unavailable due to aging failures on the electronic boards, or might operate spuriously. Such spurious
actions can result in significant costs to the plant in lost revenue from generation or in extended
down times for troubleshooting the cause of the failure. With these assumptions in mind, the next
step is to establish rational foundations for addressing aging circuit boards that could lead to the
identification of improved methods for limiting the impact of failures on the safety and operation
of plants.
1 DiSandro and Torok 2005 state that “There is no accepted methodology to quantify the failure probabilities of a
control operation operating within a multitasking operating system environment.”
2 Condition monitoring assessments use past performance to evaluate likelihood of failure. Operational assessments
define the failure probability over next operating cycle and may include statistical uncertainty bands.
Plant protection systems are subject to periodic surveillance that detects failed modules. This
application currently involves periodically testing the plant protection circuits during outages or
by testing one of the redundant circuits during operation. When a circuit is found to be out of
specification for its intended use, an action is taken such as recalibrating or replacing the
controlling circuit boards (if not obsolete). If replaced from stock, the failed printed circuit
board (PCB) is typically repaired and returned to stock (EPRI 2003). This rationale, in which the plant
waits until a failure is discovered and then restores the circuit, is also called “run to failure.”
In some cases, an aging failure can progress to a complete failure between the test intervals. The
test interval is usually found in the technical specifications. Because of the long life of many circuit
boards, the test interval in the technical specifications is on the order of 18 to 24 months to
coincide with the refueling interval. Hence, if there is no advance warning of failure, the only
question becomes, “Is the circuit operable as tested or is it failed?” Using this rationale, circuit
components that pass a functionality test despite significant aging degradation can be placed back
into service with no understanding of the degree of degradation.
Reliability Modeling
A second rationale for aging management of circuit boards is to use either the manufacturer's
recommended replacement schedule or calculate board-specific MTBFs to establish a generic
replacement schedule. Use of a replacement schedule defined by the circuit board MTBF relies
on failure rate data from a population of similar boards. Determination of an MTBF then relies
on generic information that might not apply to the actual operational conditions for the circuit
board components in the plant. Thus, a large uncertainty must be applied to the MTBF and, if a
90% confidence level is used, a board with a nominal MTBF of 15 years might be replaced at 3
or 4 years. This can cause the cost of maintenance for aging failures to be more than actually
needed.
Reliability models for electronics, including MIL-HDBK-217F data and models, can be used to
predict the life of many electronic systems, from computers and consumer goods, which may
have a short life of typically two to five years, to critical applications such as
power plants, where life may be 15 years or more (EPRI 2003). However, when using this
rationale, there is no warning time for a specific component failure on a specific board, and
failures during the next test interval can trigger unplanned plant outages.
In typical reliability models, component failures are assumed to be "random," i.e., exponentially
distributed. Such a model has no "infant mortality" or "wear out" region. Physics of
failure (POF) approaches can describe the "life" of electronic products where life is limited by
predictable physical mechanisms (Ireson and Coombs 1988). These models work well, e.g., for
the case of solder joint fatigue by thermal cycling. However, POF approaches are not adequate
for large systems, with disparate part types, where the environmental conditions are benign (i.e.,
there is no failure-forcing stress on the component).
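The "random" failure assumption can be illustrated with a short calculation. The sketch below is only an illustration, not part of the report's method: the function names are hypothetical, and the inputs are the 15-year nominal MTBF mentioned earlier and an assumed 18-month test interval. It shows that, under an exponential model, the probability of failure in the next interval is the same for a new board and for a board that has already survived 20 years, which is what the absence of a wear-out region means.

```python
import math


def failure_probability(mtbf_years: float, interval_years: float) -> float:
    """Probability of at least one failure within the interval,
    assuming a constant failure rate of 1/MTBF (exponential model)."""
    return 1.0 - math.exp(-interval_years / mtbf_years)


def conditional_failure_probability(mtbf_years: float, age_years: float,
                                    interval_years: float) -> float:
    """Probability of failure in the next interval, given survival to age_years."""
    survive_to_age = math.exp(-age_years / mtbf_years)
    survive_to_end = math.exp(-(age_years + interval_years) / mtbf_years)
    return 1.0 - survive_to_end / survive_to_age


# Illustrative inputs (assumptions): 15-year MTBF, 1.5-year test interval.
p_new = failure_probability(15.0, 1.5)
p_aged = conditional_failure_probability(15.0, 20.0, 1.5)  # board already 20 years old
print(f"new board: {p_new:.3f}, 20-year-old board: {p_aged:.3f}")
```

Because the two probabilities are equal, a model of this kind cannot by itself indicate that a particular aged board is closer to failure than a new one.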
In power plant applications, circuit boards are typically operated in a control room with carefully
controlled environmental conditions. So, EPRI (EPRI 2002 and EPRI 2004) conducted a large
measurement program on samples of aged control circuit boards to determine what components
age, and by how much, over 20 years or more. The results, in concert with additional analysis,
can be used to establish maintenance strategies for control systems using similar components,
which exhibit the same failure mode and rate characteristics.
A third foundational rationale for board replacement begins with the initial recommendation for
replacement and then adjusts the replacement interval, based on measures of a specific circuit's
condition and on models that use this information to predict the probability of board failure.
This type of proactive maintenance is normally practiced for mechanical components,
but no references to such proactive methods have been found in the technical literature for
electronic systems (Loman et al. 2003).
Condition monitoring relies on measurable test information from a circuit to develop
inputs into a model for calculating the probability of circuit failure. In the case of condition
monitoring, the probability that the circuit failure would have occurred over the
last cycle can be calculated. In the case of operational assessment, the probability of failure
during the next cycle can be estimated. Given this information, the board can be left in place or
replaced proactively.
This rationale requires measures on which to base predictions of the likelihood of failure in the next
operating cycle. The predictions rely on statistical models, which may be derived from reliability
models, but are adapted to a specific circuit through a correlation between the measured
information and inputs to the failure probability quantification. An important question is then:
how much warning time is given between the measurement of an aging failure condition and loss
of the circuit function? Improved circuit availability is possible if the warning time for circuit
replacement is short relative to the test interval based on refueling outages.³ The improved
warning time based on anomalous indications is a basis for replacement before failure.
Continuous Monitoring
A fourth rationale for determining the board replacement interval is to provide a means for
continuous monitoring so that the results of the condition monitoring calculation and the
operational assessment are the same. This occurs when the time interval between measurements
approaches zero. If rapid changes in current flow are observed, and these changes are interpreted
as precursor indications by trending, then this indicates that the maximum life of a component on
a specific board is approaching or has been reached.
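One simple way to picture trending for precursor indications is to compare each new current reading with a rolling baseline. The sketch below is a hypothetical illustration only: the function name, the five-sample window, and the 10% deviation threshold are assumptions made for the example and are not values taken from this report.

```python
from collections import deque


def flag_rapid_changes(samples, window=5, max_fraction=0.10):
    """Return the indices of samples that deviate from the mean of the
    previous `window` samples by more than `max_fraction` of that mean."""
    history = deque(maxlen=window)  # rolling baseline of recent readings
    flagged = []
    for index, value in enumerate(samples):
        if len(history) == window:
            baseline = sum(history) / window
            if baseline != 0 and abs(value - baseline) / abs(baseline) > max_fraction:
                flagged.append(index)
        history.append(value)
    return flagged


# Hypothetical board supply current readings in milliamperes.
readings = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 12.0]
print(flag_rapid_changes(readings))
```

A practical monitoring system would need thresholds derived from measured board behavior; the fixed fraction used here only shows the mechanics of the comparison.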
This rationale for board replacement has not been applied by utilities, but represents a more
idealized approach that could be established. It provides a warning of the aging impact long
before the failure would occur. It is compatible with a philosophy of replacement just in time
and yet allows the operators to maximize the useful circuit board life. It would require
specialized monitoring equipment, and may be cost effective when applied to key circuits.
³ Typical refueling intervals are in the range of 18 to 24 months in US nuclear power plants.
Use of continuous monitoring could be one reason to select industrial grade boards in the
important circuits. This would trade savings on board purchases against the increased cost of constant
monitoring. According to NEI (2001b), a board that meets nuclear standards costs about five times
as much as an industrial grade board that can perform the same function.
The industrial-grade electrical circuit board could be purchased for $1,160, compared with $5,700 for the
same circuit board under nuclear standards. “The main difference in cost is the extent of the
process used to verify the component’s performance capability. Commercial industrial standards
are entirely satisfactory for many applications with low safety-significance in nuclear power
plants. In fact, they already are widely used in these facilities. Their use could be expanded
substantially, and it simply makes sense to do so” (NEI 2001b).
A framework is needed for integrating the issues related to upgrading circuit board functional testing to
include aging monitoring. Figure 1-2 provides a framework for considering how new
techniques can be applied in existing plants and what type of monitoring
might be associated with each technique. It also includes deciding whether it is necessary to upgrade the
aging detection process.
[Figure 1-2: flow diagram. Select circuit (analog or digital, control or protection); assess importance; choose methods (methods 1, 2, 3, 4, 5, and 6 for protection circuits; cost effective methods 1 and 2 for non-critical circuits); process monitoring (prediction from direct signals, direct signal comparison, and repair based on direct measures; or circuit board component reliability data used to detect end state failure or schedule card repair on MTBF).]
Figure 1-2
Framework for Considering Improved Circuit Board Monitoring for Aging
In general, after a specific circuit is chosen, its importance is assessed in the next column, and
then, for the potential methods, factors such as observability and detectability of failure modes
are considered. Finally, in the process monitoring column, factors such as predicting failure and
the timing of restoring, repairing, or replacing are considered.
The following sections address these elements in more detail leading to example decision trees
and a relative cost benefit review. The next section reviews aging component failure modes on
circuit boards and the use of various methods for monitoring failure modes recognized in
electronic components by examination of used circuit boards (EPRI 2002 and EPRI 2004).
2
CIRCUIT BOARD AGING FAILURES
This section provides an important input to the framework. It begins with a listing of the
components that are found on circuit boards and are subject to aging failures. This section also
provides progressive failure mode descriptions for categories of component types.
Component Listing
The components analyzed in EPRI 2002 and EPRI 2004 are shown in Table 2-1. These were
identified as important by a survey of nuclear power plant I&C personnel. Definitions for these
components are provided in Appendix B. It is necessary to understand the details of each
component in order to assess how a method or technique for monitoring aging could be applied.
Table 2-1
Listing of Components Typically Found on Circuit Boards⁴
1 Capacitors
2 Relays
3 Potentiometers
4 Edge board connectors
5 Power diodes and transistors
6 Transformers and inductive devices
7 Thyristors
8 Integrated circuits
9 Printed circuits
10 Signal diodes and transistors
11 Regulators & other analog circuits
12 Optocouplers
13 On board connectors
14 Fixed resistors
15 LED
16 Pin through hole (PTH) solder joints
17 DC/DC converters
18 Surface mount technology (SMT) solder joints
19 LCD
20 Switches and keyboards
21 Quartz
⁴ Analyzed in EPRI reports (EPRI 2002 and EPRI 2004)
The next step was to identify detailed descriptions of effects of failure, root causes and failure
modes from EPRI reports and other sources for the components in Table 2-1.
Table 2-2
General Effects of Aging on Resistors, Capacitors, Transformers and Other Components on I&C Electronic Boards
Input voltage is out of spec. Output voltage out of spec. Loss of output voltage
Loss of heat sink (thermal resistance increase, thermal dissipation problems may occur) Long term reliability is affected (thyristors do not trigger properly) Loss of output voltage
Short circuits between all leads Cathode can turn around itself Anode to cathode short circuit
Table 2-3
General Effects of Aging on Integrated Circuits on Chips on I&C Electronic Boards
Weak supply current, Unstable supply pin loss of memory @ FF… Open circuit inside the integrated circuit
Supply current is too high, Memory cannot be reburned Short circuit between an output pin and circuit board
Leakage current of an output transistor is out of spec. Checksum does not agree with standard Loss of program instruction
High temperature operation Gates have unstable output Partial destruction of the wire bond component
Corrosion induced intermetallic layers growth at interfaces Output voltage of a gate oscillates Failure open or short circuit
Table 2-4
General Effects of Aging on Printed Circuit Electronic Boards
Due to cracks Resistance increase Open circuits on the printed circuit board
Due to vibration Inverse leakage current is out of spec (> 10 µA) Short or open circuit
Due to corrosion Shift in wave form input-output Short or open circuit
3
RELIABILITY DATA
An element of the framework is to apply reliability modeling. Thus, the next step is to establish
a reliability database that can be used for failure predictions on typical instrument and control
(I&C) boards. In Table 3-1 the collected circuit board failure rates from grouped data sources
are compared with the list of components in Table 2-1. The failure rates discussed in this section
generally apply to the useful life period as previously illustrated in the bathtub curve of Figure 1-1.
The data sources for the nuclear power plant electronic components listed in Table 3-1 come
from EPRI 2002 and EPRI 2004. They describe failure modes and root causes that have been
observed in I&C boards and electronic components. In order to bridge between the list of
components in Section 2 and component failure rate data, a clear description and function for
each component is needed. Expanded component definitions are provided in Appendix B. The
reliability data follow data that have been collected for military and other applications and
presented in the literature.
The estimates of electronic equipment failure rates can be obtained from the reliability data in
Table 3-1. These data represent the results of accelerated aging tests that use high temperatures,
excessive voltage stress, and many voltage, vibration and impact cycles to represent long term
aging.
The recommended environmental values for each type of component assume the following
conditions for power plant applications.
• The quality of the components is best commercial (quality)
• The operation is in a clean filtered air, fixed ground location (environment) with low
mechanical vibration.
• The components have been proven under use for at least two years in other applications
before installation (learning).
• The voltage is never more than 70% of the rated value (stress); for example, for a 50-volt
rating the operating voltage never exceeds 35 volts. In many circuit board operations the
basic source voltage is 3 volts or less.
• The operating temperature is less than or equal to 40°C (temperature).
The failure rate data must be adjusted for conditions that differ from the above assumptions. The
weakness in these data is that neither the impact of the failure nor the root cause is described in
detail. The data typically describe only the failure rate or number of failures per time period.
$$\lambda_p = \lambda_b \cdot \pi_Q \cdot \pi_E \cdot \pi_A \cdots$$
Where: $\lambda_b$ = the base failure rate, which is described by the Arrhenius equation, and
$\pi_Q, \pi_E, \pi_A, \ldots$ = factors related to component quality, environment, and application stress.
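The multiplication of a base rate by its adjustment factors can be checked against a few rows of Table 3-1. The sketch below is illustrative: the function name is hypothetical, and it assumes that the MIL-HDBK-217F generic rate in the table plays the role of the base rate and that the table's quality, environment, and learning factors are the multipliers.

```python
from math import prod


def adjusted_failure_rate(base_rate, pi_factors):
    """Multiply a base failure rate by its pi adjustment factors."""
    return base_rate * prod(pi_factors)


# (generic name, rate per 1E9 hours, (pi_Q, pi_E, pi_L)) as listed in Table 3-1
rows = [
    ("Conn IC socket", 5.4, (1, 2, 2)),
    ("DIODES Power Rectifier", 3.0, (0.7, 2, 2)),
    ("Bipolar ROM 1MB", 53.0, (1, 2, 2)),
]

adjusted = {name: adjusted_failure_rate(rate, factors) for name, rate, factors in rows}
for name, value in adjusted.items():
    print(f"{name}: {value:.1f} failures per 1E9 hours")
```

The computed values of 21.6, 8.4, and 212 agree with the adjusted rates that Table 3-1 lists in rounded form as 22, 8, and 212.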
The Arrhenius equation illustrates the relationship between insulation breakdown rate and
temperature for components. This application has been derived from the observed dependence of
chemical reactions on temperature changes.
$$R(t) = A\,e^{-E/(\kappa T)}$$
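The Arrhenius relationship is commonly used to compare reaction rates at two temperatures. The sketch below assumes the usual reading of the symbols, which the text does not define: E is an activation energy in electron volts, κ is Boltzmann's constant, and T is absolute temperature. The 0.7 eV activation energy and the 125°C test temperature are hypothetical values chosen for illustration; only the 40°C operating temperature comes from the assumptions listed above.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann's constant in eV/K


def arrhenius_rate(prefactor, activation_energy_ev, temp_kelvin):
    """Rate A * exp(-E / (kappa * T)) at absolute temperature T."""
    return prefactor * math.exp(-activation_energy_ev / (BOLTZMANN_EV_PER_K * temp_kelvin))


def acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of the rate at the stress temperature to the rate at the use temperature."""
    t_use = use_temp_c + 273.15
    t_stress = stress_temp_c + 273.15
    return (arrhenius_rate(1.0, activation_energy_ev, t_stress)
            / arrhenius_rate(1.0, activation_energy_ev, t_use))


# Hypothetical example: 0.7 eV activation energy, 40 C use, 125 C accelerated test.
factor = acceleration_factor(0.7, 40.0, 125.0)
print(f"acceleration factor: {factor:.0f}")
```

The prefactor A cancels in the ratio, so only the activation energy and the two temperatures affect the result. The strong sensitivity to the assumed activation energy is one reason why scaled accelerated test data carry large uncertainty.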
Table 3-1
Assigning Failure Rate Data for Electronic Components in MIL-HDBK-217F to EdF Components in Table 2-1

| EdF Field Data From Analyses & Inspections: Component | MIL-Handbook 217F Electronic Generic Names | MH217F Comp. λ (×1E-9 per hr) | ΠQ (Quality) | ΠE (Environ.) | ΠL (Learning) | Adj. Comp. λ (×1E-9 per hr) |
|---|---|---|---|---|---|---|
| Edge board connectors | Conn IC socket | 5.4 | 1 | 2 | 2 | 22 |
| Edge board connectors | Conn Clip terminators | 0.12 | 1 | 2 | 2 | 0 |
| Edge board connectors | Conn comp | 0.26 | 1 | 2 | 2 | 1 |
| Power diodes & transistors | DIODES Power Rectifier | 3 | 0.7 | 2 | 2 | 8 |
| Power diodes & transistors | DIODES General Purpose Analog | 3.8 | 0.7 | 2 | 2 | 11 |
| Integrated circuits programmable | Bipolar ROM 1MB | 53 | 1 | 2 | 2 | 212 |
| Integrated circuits programmable | Bipolar ROM 256K | 28 | 1 | 2 | 2 | 112 |
| Integrated circuits programmable | Bipolar ROM 64K | 17 | 1 | 2 | 2 | 68 |
| Integrated circuits programmable | MOS PROM 1MB | 12 | 1 | 2 | 2 | 48 |
| Integrated circuits programmable | MOS ROM 1MB | 11 | 1 | 2 | 2 | 44 |
| Integrated circuits programmable | Bipolar ROM 16K | 10 | 1 | 2 | 2 | 40 |
| Integrated circuits programmable | MOS PROM 256K | 7.2 | 1 | 2 | 2 | 29 |
| Integrated circuits programmable | MOS ROM 256K | 6.7 | 1 | 2 | 2 | 27 |
| Integrated circuits programmable | MOS PROM 64K | 6.1 | 1 | 2 | 2 | 24 |
| Integrated circuits programmable | MOS ROM 64K | 5.9 | 1 | 2 | 2 | 24 |
| Integrated circuits programmable | MOS PROM 16K | 4.9 | 1 | 2 | 2 | 20 |
| Integrated circuits programmable | MOS ROM 16K | 4.7 | 1 | 2 | 2 | 19 |
| Printed circuits | Electronic filter discrete LC comp | 120 | 1 | 2 | 2 | 480 |
4
METHODS FOR DETECTING I&C BOARD FAILURES
This section provides an overview of the framework element for selecting methods for
monitoring the aging of components on circuit boards. The first step is to define methods that
uniquely capture different categories of theoretical underpinning for aging detection, using the
four rationales from Section 1. As shown in Table 4-1, six broad method categories are defined
and presented in order of increasing technical complexity.
Table 4-1
Summary of Methods for Detecting and Monitoring Aging in Circuit Boards

| # | Method | Applicability | Defined periodic test interval | Reliability modeling | Condition monitoring and operational assessment | Continuous monitoring |
The basis for listing a technique is the expectation that within the theoretical method there can be
various means for monitoring circuit behavior that can provide a better way for deciding when to
repair or replace the circuit board. The six basic theoretical methods listed in Table 4-1 are
discussed below. Within each method, where appropriate, techniques are identified that provide
alternative technical approaches for monitoring aging.
There are two techniques for periodic inspections within this theoretical method: functional
testing and visual inspections. The theory for detecting aging in circuit board components with
periodic inspections is that the aging condition produces an observable measure during the test,
such as an increase in the time for circuit actuation or, in the case of visual inspection, a color
change somewhere on the printed circuit board (PCB). Most PCBs are operated under the
run-to-failure philosophy. With this philosophy, PCBs may be monitored for proper operation, but no
attempt is made to enhance the PCB life until the PCB fails to function. Since many PCBs
operate beyond their design life, critical or essential boards may need to be refurbished at least
once in a plant’s operating life. Detailed methods of troubleshooting and refurbishment are
discussed in EPRI 2003.
Functional Testing
Functional surveillance tests are generally specified as operability checks and calibrations in the
plant technical specifications and are considered to be an aging management technique (IAEA
2000). Such tests include circuit checks, and evaluation of the results is used to verify that the
entire circuit is capable of operating as it should. For example, some tests measure the time from
signal to device actuation as a measure of the overall circuit and mechanical actuation. If the
time interval exceeds a specified value, the system is examined to identify the problem
component. After a successful functional test, the circuit is assumed fit to return to
service even though degradation might exist. This is the most common method used in power
plants. To reduce the potential for returning a degraded circuit board to service, visual
inspections can also be employed.
Visual Inspections
The interval for visual inspections is typically included as part of the technical specification to
coincide with the refueling outage cycle. Visual inspections of circuit boards may involve the
use of aids for detecting anomalies, for example a magnifying glass, microscope, X-rays, or
ultraviolet light. Inspections performed with an unaided eye or a low power optical microscope
on I&C boards during manufacturing can detect surface imperfections such as burrs, voids,
nicks, scratches, and gouges (EPRI 2002). These imperfections can be quickly identified and compared to a
standard. Inspection of the solder mask material involves investigating blisters, delamination,
bubbles, and thickness. Some subsurface imperfections such as foreign inclusions, voids and
delamination can often be detected from the external visual inspection. The same type of results
can be expected in examination of aging boards.
The types of aging anomalies that can be detected by visual inspection include:
1. Solder connection aging anomalies on printed circuit boards which include: Solder residues,
solder lifted from the circuit board, insufficient solder in joint, cracks or separations of
solder, brown spots around solder joint, holes, loose or broken wires, and solder bridges.
2. Cracked coatings on components such as capacitors, transformers, resistors, memory chips
and processors.
3. Excessive dust or pollution on the board and components.
4. Traces of localized heating by color changes.
5. Traces of corrosion from moisture, chemicals, smoke, or atmospheric exposures.
6. Cleaning process negative results.
7. Laminar separation or bowed circuit boards.
8. Mechanically damaged parts (leads or body).
9. Damaged or missing connectors.
10. Repeated repairs on the same component as an indication of other problems.
Advantages
Aging anomalies can be observed on specific circuit boards without special tools or other costs
for new development.
Lists of observable anomalies are available from manufacturers, EPRI 2002, and others.
Disadvantages
The frequency of inspections is generally no more frequent than once per refueling cycle, which
can range from 18 to 24 months.
Boards must be removed for inspection, which could damage connectors or cause other handling
induced problems.
Detection by visual inspection is limited to grossly visible characteristics (i.e., the assumption is
that circuit aging conditions will leave an outer trace of damage such as changing the color of the
board in an overheated area).
Many precursor aging failure modes are not observable (e.g., an open circuit in part of the board
might not be detectable by visual inspection alone).
Judgment is required to assess the degree of degradation and whether to take corrective action.
The theory for reliability modeling is that statistical evaluations of the failures of components in
a large population under accelerated aging conditions can be used to generate failure rates that
can be applied to similar components in other applications.
Reliability models for circuit boards typically use statistical analysis of accelerated failure testing
as a starting point. These tests assume that the Arrhenius model can be used to scale the
accelerating conditions back to the conditions of actual operation for each component. The
reliability model for a board is then based on the sum of the failure rates of the parts on the circuit board. This is how
the parts count databases in MIL-HDBK-217F are established (DOD 1995). Electronics
reliability models, including MIL-HDBK-217F, can be used to predict the life of many electronic
systems, from computers and consumer goods that may have a short life, typically two to five
years, to critical applications such as power plants, where life may be 20 years or more.
The databases for electronics reliability models come from various kinds of accelerated testing.
Chemical corrosion impacts have been related to Weibull functions (Bhakta et al. 2002) by
modeling metal migration using diluted NaCl solutions, humidity, and temperature variations to
simulate the effects of contaminants. Valentin et al. (2003) used simulation of applied
temperature cycles to determine the mean time to failure of solder joint interconnects between
the package leads and printed wiring boards. The methods for analyzing circuit reliability have
been well established in the areas of collecting generic data, using accelerated failure data, and
quantifying the reliability of multiple components on a circuit board. To the authors' knowledge,
neither reliability nor physics of failure models have been linked to measured parameters from
circuit testing. Such a linkage would require the development of correlations between a
measured circuit test parameter and the failure rate, per mode of failure, which feeds into the
model.
If reliability models are not used, then the default is to replace the circuit board on a periodic
schedule for replacement before failure is expected to occur, or to wait until the I&C board
failure is discovered by periodic testing, spurious trip, or failure to activate a system when
needed.
Advantages
The method produces a probability of failure, if needed for operational assessment or condition
monitoring.
Evaluation of reliability for a specific I&C board can be completed with information about the
components on the board and a spreadsheet calculation model with associated data to make an
MTBF prediction. No additional measurement, signal processing equipment, or special software
is required.
The evaluation process does not interfere with operation of the I&C board.
The results can be adjusted or “tuned” to actual field data, and provide a starting point for
assigning an inspection, test, maintenance or calibration interval for I&C boards.
Disadvantages
The results assume that the specific I&C board is one of many identical boards. The input data
come from grouped failures of similar equipment, and assume that the manufacturing,
operational stress and temperatures are related to the MTBF by the Arrhenius model. Thus, the
results do not treat each board as an individual unique board with actual measures to verify the
operating condition.
The confidence level associated with the MTBF estimate contains large uncertainty and may not
address every issue. Additional judgment is needed to establish the best schedule for inspection
or replacement intervals.
The probability of failure produced applies to a grouped average of boards with similar
characteristics to the one considered for evaluation.
Example Application
The example board in Figure 4-1 is used to illustrate the fundamental process for
combining the individual failure rates of the components into a failure rate for the board, as
shown in Table 4-2. For this circuit board operating in a clean air-conditioned environment, the
total MTBF is expected to be about 68 years. This estimate does not
consider the cumulative effects of significant corrosion exposures, voltage spikes and voltage
cycling conditions that could reduce the lifetime to several years.
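The parts count arithmetic can be sketched in a few lines. The example below uses only four rows from Table 4-2, so it illustrates the summation and the conversion from failure rate to MTBF; it does not reproduce the 68-year estimate for the full board. The function names and the 8,760 hours-per-year conversion are assumptions made for the illustration.

```python
HOURS_PER_YEAR = 8760

# (part, quantity, failures per 1E9 hours, pi_Q, pi_L, pi_E) from Table 4-2
parts = [
    ("C:NDK, 18.432 PF, 10%, 50V", 1, 3.6, 1, 1.5, 0.5),
    ("C:C, X7R, 0.01 UF, 50V", 29, 3.6, 1, 1.5, 0.5),
    ("Zilog 8838 Processor", 1, 190, 1, 1.5, 0.5),
    ("RPK,51,1%,12 ISO,24QSOP", 14, 2.3, 1, 1.5, 0.5),
]


def board_failure_rate(part_rows):
    """Sum quantity * base rate * pi factors over all parts (per 1E9 hours)."""
    return sum(qty * rate * pi_q * pi_l * pi_e
               for _, qty, rate, pi_q, pi_l, pi_e in part_rows)


def mtbf_years(rate_per_1e9_hours):
    """Convert a constant failure rate per 1E9 hours into an MTBF in years."""
    return 1e9 / rate_per_1e9_hours / HOURS_PER_YEAR


subset_rate = board_failure_rate(parts)
print(f"subset rate: {subset_rate:.2f} per 1E9 hours")
print(f"subset MTBF: {mtbf_years(subset_rate):.0f} years")
```

Working the conversion in the other direction, an MTBF of 68 years corresponds to a total board failure rate of roughly 1,680 failures per 1E9 hours.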
Front Back
Figure 4-1
Example of a Circuit Board With Various Components
Table 4-2
I&C Board Materials, Part Descriptions, and Failure Rate Estimate

| Part Description | Qty | Generic Name | Failures per 1E9 Hours | ΠQ | ΠL | ΠE | Tot Board Failure Rate |
|---|---|---|---|---|---|---|---|
| C:NDK, 18.432 PF, 10%, 50V | 1 | Capacitors, Discrete, Fixed, Ceramic | 3.6 | 1 | 1.5 | 0.5 | 3 |
| C:C, X7R, 0.01 UF, 50V | 29 | Capacitors, Discrete, Fixed, Ceramic | 3.6 | 1 | 1.5 | 0.5 | 78 |
| Zilog 8838 Processor | 1 | MOS MicroProc 32 Bit | 190 | 1 | 1.5 | 0.5 | 143 |
| RPK,51,1%,12 ISO,24QSOP | 14 | Resistors, Networks, Thick or Thin Film | 2.3 | 1 | 1.5 | 0.5 | 24 |
| RECP, 20P, RA, 3.05MM | 16 | Connectors, Multi-pin, 20 pins | #N/A | #N/A | 1.5 | 0.5 | |
| RNET,10K,8 BUS,10P | 18 | Resistors, Networks, Thick or Thin Film | 2.3 | 1 | 1.5 | 0.5 | 16 |
| RES,1206, 220, 5% | 40 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| FER 0805,1K,DCR = 0.3 | 36 | Inductive Devices, Coil, RF, Fixed | #N/A | #N/A | 1.5 | 0.5 | |
| FER,SM,0805,200MA,2200 | 36 | Inductive Devices, Coil, RF, Fixed | #N/A | #N/A | 1.5 | 0.5 | |
| RES,0603, 5.1, 5% | 36 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| LED,3MM,RA,SQ,Y/G,SM | 17 | Other Optical Devices, LED/LCD Display | #N/A | #N/A | 1.5 | 0.5 | |
| RES,0603, 4.7K, 5% | 33 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| FER BD,SM,1206,3000M | 32 | Inductive Devices, Coil, RF, Fixed | #N/A | #N/A | 1.5 | 0.5 | |
| RES,0603, 59.0,1% | 32 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES,0603,15,1 %,1/16 | 32 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| IC,S08,3 TERM ADJ CUR | 4 | MOS LinearLA 100 Trans | 9.5 | 1 | 1.5 | 0.5 | 10 |
| RESARAY 16P 8R,33 | 5 | Resistors, Networks, Thick or Thin Film | 2.3 | 1 | 1.5 | 0.5 | 4 |
| RES,0603, 1.0K, 5% | 25 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES,0603, 10K, 5% | 24 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| C:,0603,10PF, 10%, 50V | 10 | Capacitors, Discrete, Fixed, Ceramic | 3.6 | 1 | 1.5 | 0.5 | 27 |
| LED, SMT, 1206, GRN | 12 | Other Optical Devices, LED/LCD Display | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0805, 270, 5% | 20 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| FER 5A SMT, MAT 43 | 18 | Inductive Devices, Coil, RF, Fixed | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 150, 1% | 16 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES,0603 5%, 33 | 12 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| IC, PCF8574,I/O EXPDR | 3 | Bipolar DLA 100 Gates | 3.6 | 1 | 1.5 | 0.5 | 3 |
| RES, 0805, 49.9, 1% | 8 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES,2512,62 OHMS,5% | 8 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| REG, LTC1649, 3.3V | 1 | Bipolar LinearLA 100 Trans | 9.5 | 1 | 1.5 | 0.5 | 3 |
| RES, 0603, 2.2K, 5% | 6 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0805, 75, 1% | 6 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| CLK DRV, 74FCT3807A | 1 | MOS DLA 100 Gates | 5.7 | 1 | 1.5 | 0.5 | 2 |
| IC RS232, MAX232, SMT | 1 | MOS DLA 100 Gates | 5.7 | 1 | 1.5 | 0.5 | 2 |
| IC,DS1307, I2C TIMEKPR | 1 | MOS DLA 100 Gates | 5.7 | 1 | 1.5 | 0.5 | 2 |
| FER,120,200MA,0603 SM | 5 | Inductive Devices, Coil, RF, Fixed | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 47, 5% | 5 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| C: 0603, NP0, 22PF, 10% | 2 | Capacitors, Discrete, Fixed, Ceramic | 3.6 | 1 | 1.5 | 0.5 | 5 |
| HDR,24P,R/A,PCB MT, BMI | 2 | Connectors, Multi-Pin, 10 Pin | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 549, 1% | 4 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 210, 1% | 4 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 475, 1% | 4 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0805, 56, 5% | 4 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 0, 5% | 3 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES,1206, 5%, 100 | 3 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| HDR, .025 SQ, 5X2, SMT | 1 | Connectors, Multi-Pin, 10 Pin | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 1.5K, 5% | 2 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES,0603, 100K, 5% | 2 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| CN RJ45 SHLD W/O FERRITE | 1 | Connectors, Multi-Pin, 9 Pin | #N/A | #N/A | 1.5 | 0.5 | |
| HDR 10 P,RA,PCB MT,BMI | 1 | Connectors, Multi-Pin, 9 Pin | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 11K, 1% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 18K, 5% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 2.7K, 5% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 4.87K, 1% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0603, 69.8K, 1% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0805, 100, 1% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 0805, 22, 5% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES, 2010,100, 5% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
| RES,0603, 1 MEG, 5% | 1 | Resistors, Fixed, Film, <1M | #N/A | #N/A | 1.5 | 0.5 | |
Voltage stress and temperature adjustments have not been made in the example model of Table
4-2. Such adjustments can be applied on a component basis (e.g., to account for high operating
temperatures within the board) or globally for all components (e.g., to account for environment
temperature).
It can be seen from this example that the standard estimate, which assumes ideal operating
conditions, produces a very optimistic MTBF. Data to support changes in the model
are necessary if the model is to be used to monitor changes to the MTBF as components in the
circuit age.
The theory for resistance measures is that, over time, slow chemical reactions within the insulating
material lead to lower resistance pathways between the conductor and ground (see Appendix
A). This change is measurable through simple resistance measurements or by noting the increase in
leakage current.
Monitoring the change in resistance to ground over time can identify the onset of a circuit ground
condition as insulation breaks down. The processes for measuring changes in resistance to
ground include measuring leakage current from the isolated circuit while applying a
higher-than-normal voltage source, or measuring the current to the board during
normal operation. Insulation resistance testing is an established technique for troubleshooting
many electrical systems (Fink and Beaty 1993), and it is adaptable to circuit boards.
The ASTM definition of dielectric breakdown voltage is: the potential difference at which
dielectric failure occurs under prescribed conditions in an electrical insulating material located
between two electrodes. This is permanent breakdown and is not recoverable. ASTM goes on to
state that the results obtained by this test can seldom be used directly to determine the dielectric
behavior of a material in an actual application. This is not a test for “fit for use” in the
application, as is the “Proof Test,” which is used for detection of fabrication and material defects
in the dielectric insulation.
Hence judgment in interpreting the results of resistance to ground is required for monitoring
aging. An example would be to plot periodic measures to form a trend line. Then when a critical
resistance is reached, actions should be taken such as cleaning and resealing the board.
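
A trend line of this kind can be sketched as below. The example assumes a straight-line fit to periodic resistance-to-ground readings and extrapolates to a critical resistance; real insulation degradation need not be linear, and the readings and the critical value are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def time_to_threshold(xs, ys, critical):
    """Time at which the fitted line reaches `critical`, or None if not falling."""
    slope, intercept = fit_line(xs, ys)
    if slope >= 0:
        return None  # no downward trend to extrapolate
    return (critical - intercept) / slope


# Hypothetical periodic measurements: months since baseline, megohms to ground.
months = [0, 6, 12, 18, 24]
megohms = [100.0, 95.0, 90.0, 85.0, 80.0]
print(time_to_threshold(months, megohms, critical=50.0))
```

The projected crossing time is only as good as the linear assumption, so it supports, rather than replaces, the judgment called for above.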
Increased leakage current due to component aging or ground insulation deterioration can cause a
shift in the output voltage and cause overheating. This could create intermittent operation.
Measures of the leakage current on a periodic basis and plotted in a trend line would provide an
indication of the need to take action for board replacement. Bench testing equipment has been
used to measure such conditions in circuit boards (EPRI 2004); this method could be adapted to
in situ circuit boards.
Not all boards test alike due to the variety of dielectric types, thickness and board layouts. All
insulated metal substrates closely resemble a parallel plate capacitor during high voltage testing.
The capacitance is equal to:
$$C = \varepsilon \frac{A}{d},$$
where ε is permittivity (dielectric constant), A is surface area, and d is the distance (dielectric
thickness). The capacitance values change with different configurations of materials and board
layouts. Thus, when the actual capacitance changes enough to impact circuit functionality, the
board should be changed out.
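
As a worked illustration of the parallel-plate relation, the sketch below evaluates the capacitance for one hypothetical geometry. It assumes the permittivity is written as the vacuum permittivity times a relative permittivity; the relative permittivity, area, and thickness are illustrative values only.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def parallel_plate_capacitance(rel_permittivity, area_m2, thickness_m):
    """C = epsilon * A / d, with epsilon = EPSILON_0 * rel_permittivity (assumed)."""
    return EPSILON_0 * rel_permittivity * area_m2 / thickness_m


# Hypothetical board: relative permittivity 4.5, 100 cm^2 area, 100 um dielectric.
c = parallel_plate_capacitance(4.5, 0.01, 100e-6)
print(f"{c * 1e9:.2f} nF")
```

Because capacitance scales inversely with dielectric thickness, halving the thickness in this sketch doubles the computed value.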
Advantages:
The trending of leakage current is a good indicator of insulation quality. The aging of insulating
materials is typically exacerbated by increased temperature and corrosion. Rapid changes in
leakage current may indicate that corrosion effects are occurring. Likewise increased
temperature tends to increase leakage current.
By providing a periodic DC voltage on a segment of the circuit the location of a leakage problem
can be related to a specific circuit segment.
Disadvantages:
Insulation testing requires isolation of the circuit to measure ground current leakage at a specific
voltage. The voltage is generally set to a fixed value (e.g., 700 or 1200 volts), which is typically far
above the ratings of electronic components. Electronic component ratings might be on the order
of 50 to 100 volts.
Typical applications require that the circuit be isolated and not in use. This results in circuit
down time for an active leakage current measurement.
For continuous operational measures very sensitive measurement circuits would be required to
determine the difference between the current used in the normal circuit and current leaking to
ground.
The process must be calibrated to each specific circuit to develop historical trends.
No direct correlation between insulation resistance reduction or leakage current increase and
circuit failure has been developed; therefore, experience and judgment must be used.
The theory for signal comparison is simply that if two measures are supposed to be the same and
they are not, then something has changed such as aging of a circuit component.
Signal comparison techniques have been applied in many nuclear power plants (IAEA 2000).
The setups for measuring have been both temporary and fixed. For I&C boards that perform
voting logic based on the input from three or more sensing circuits there is a possibility of using
a comparison process to identify deviations from the normal signal processing within the board
(Kim 2002). Such systems normally protect against false actuations from failures in a single
sensor leading to the voting system. A failure on the I&C board could lead to a false signal to
actuate a function such as reactor trip. The other failure of concern is failure to pass a valid trip
signal due to a failure of a board component. By comparing signals from one channel with
another, significant deviations can identify problems with components on the board upstream
from the sensor signal to the logic voter.
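
The channel comparison described above can be sketched as follows. This is a minimal illustration that assumes the median of the redundant channels is used as the reference value and that a fixed tolerance is acceptable; the channel names, readings, and tolerance are hypothetical.

```python
from statistics import median


def deviating_channels(readings, tolerance):
    """Names of channels whose reading differs from the median by more than tolerance."""
    mid = median(readings.values())
    return sorted(name for name, value in readings.items() if abs(value - mid) > tolerance)


# Hypothetical readings from three redundant sensing channels.
readings = {"A": 10.02, "B": 9.98, "C": 10.61}
flagged = deviating_channels(readings, tolerance=0.25)
print(flagged)
```

A comparison like this cannot flag a shift that affects all channels equally, since the median moves with them.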
Advantages:
The comparison process provides a good indicator of a system change in one of the parallel
circuits. In the case of aging, trending of the circuit measures can be used to identify the
degraded circuit.
Monitoring and comparison of redundant systems can also identify miscalibrated circuits.
Changes in a noise signal can be used to identify aging effects (see footnote 5). The increase of noise on the
signal is an indicator of increased leakage to ground, capacitor shorting or inductor failure.
Monitoring signals can be taken from board inputs and intermediate outputs for passive
evaluation of the board condition.
Disadvantages:
Something other than aging in the measured circuit components could be causing the difference
in the signals, such as changes in the ground connection circuit characteristics or induced currents
from electromagnetic radiation from nearby circuits, cell phones, wireless networks, radio
waves, etc.
Footnote 5: An internally generated noise signal can be found from the output signal by subtracting the
average reading from the instantaneous reading.
Signal comparison is only good up to the voting circuitry; other means are needed to address the
voting logic and output components.
Signal comparison requires engineering evaluation and software programs to identify a degraded
circuit.
There is no correlation between signal comparison variance and the probability of failure.
Signal comparison does not detect situations where the redundant circuits are affected by the same
global stressor or other common cause.
The theory for this method is that aging degradation within the system will cause electrical
“vibrations” like a mechanical system that is out of balance.
This method is derived from the approaches used to analyze the vibrations in machinery where
an indication of degradation can be detected by perturbations of the normal operating conditions
(Wowk 1991). The process can include both periodic and continuous measurements. There are
two major types of techniques in this method. These are passive measures of the circuit
parameters and of the environment around the circuit.
In this case the normal current to the circuit can be continuously measured. When changes in the
current flow are detected for a given operating state (e.g., increasing current indicates a ground
or short circuit, and decreasing current indicates degradation leading to an open circuit),
precursors to circuit failure can be identified.
Moreover, if an internal noise signal is generated by circuit operation, and if it can be isolated
from process noise, its increase in intensity at various frequencies can be used as a passive
indicator of component aging trends. The process for creating a circuit noise signal involves
filtering noise in the process signal and then calculating the difference between the average
signal and the instantaneous variations above and below the average signal. This is referred to as
a passive measure of noise from the circuit.
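
The subtraction step of this passive noise measure can be sketched as below. The example omits the filtering of process noise and any frequency breakdown, and it uses the RMS of the residual as a single intensity figure, which is an assumption made for illustration. The sample values are hypothetical.

```python
from math import sqrt


def noise_signal(samples):
    """Instantaneous reading minus the average reading, sample by sample."""
    avg = sum(samples) / len(samples)
    return [s - avg for s in samples]


def rms(values):
    """Root-mean-square value, used here as a simple noise intensity measure."""
    return sqrt(sum(v * v for v in values) / len(values))


# Hypothetical output readings from the same circuit when new and after aging.
baseline = [5.00, 5.01, 4.99, 5.00, 5.01, 4.99]
aged = [5.00, 5.05, 4.95, 5.00, 5.05, 4.95]
ratio = rms(noise_signal(aged)) / rms(noise_signal(baseline))
print(ratio)
```

Trending this ratio over time gives one number per measurement period that can be plotted in the same way as a resistance or leakage current trend.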
A mechanical analogy for passive circuit measures would be the increase in noise and vibration
from the aging of prestressed cabling in a containment vessel. A method for continuously
monitoring large prestressed structures to detect wire failures has been developed (Elliott 2000).
The technique involves placement of broadband accelerometers (sensors) on the surface of a
prestressed concrete structure. The sensors detect anomalous acoustic emissions from the
structure, including those generated by the energy released by wire failures. An on-site data
acquisition system reviews data collected by the system and transmits relevant data to a
processing computer. Analytical software is used to classify and locate significant events. In
this manner, a complete record of wire break activity can be collected over the duration of the
monitoring program. Absence of wire break activity provides assurance of continuing structural
integrity.
The second technique is the use of environmental measures such as temperature, vibration, and
air quality (e.g., smell of insulation burning) near the board as a measure of the aging factors.
Then the impact of the measure is treated like the case of reliability modeling discussed in
method 2. If a damaging condition is identified then additional inspection is needed to assess the
need for a board repair or replacement.
Advantages:
There is no interference between the measuring process and circuits on the board itself. This
means that any failure mode is not introduced by additional complication from the monitoring
system, which is independent of the board itself.
Independent sensors can measure heat, vibration, etc. to detect change from normal operation.
May permit automation of the signal analysis using a process and system to convert measured
aging data and trends into a condition monitoring or operational assessment analysis.
Disadvantages:
There is a need for a correlation between a measurable parameter and the failure probability
(e.g., temperature of exhaust cooling air and the degree of degradation or aging that is expected).
Correlations or models can be established to estimate the probability of circuit failure due to an
aging failure mode before the next test interval. There is an expense in correlating measured
environmental data with estimates of aging degradation. This introduces two uncertainties: one
in the environmental measure (e.g., temperature) and one in the degree of degradation (e.g., reduced
lifetime at elevated temperature). The rapid end-of-life burnout failure for an incipient failure
might be more appropriately measured by a rate of change in the current increase, which is not
easy to collect as a database or to correlate mathematically. Hence, there is significant reliance
on human judgment as to replacing a board on this measure alone.
An interrogation test signal can be sent to a specific board to measure the I&C board transfer
function. It can be as simple as an externally generated pulse to check continuity and timing, or
it can be an internally generated signal within the I&C board. Internally generated signals could
be produced intentionally by circuits on the board, or could be the product of the noise introduced into
the signal processed by the circuit board. Boards with analog-to-digital transfers may tend to
eliminate the ability to monitor internally generated noise signals.
The use of interrogation signals driven by software command is a common diagnostic technique
applied throughout the computer industry (e.g., personal computers, servers, web sites, and
software programs) to test hardware such as memory locations, hard drives, and disk drives for
failure, and to test the software of an operating system for unwanted changes to code or to information in a
database. All that is required is to run a program that sends test signals to the appropriate location
and evaluates the results. This technique could possibly be adapted to specific circuit boards.
The key indication from a current measuring device in an I&C board would be a slight increase
or decrease in the current flow in a circuit. If the current flow increases or decreases to a critical
point, a warning signal could be triggered. The indications and trigger points for action could be
based on a significant current deviation from an initial allocation analysis.
Signature Analysis
The warning from an interrogation pulse could be as simple as a comparison of the voltage level of
the return pulse with a standard return pulse. A warning signal could be triggered if the return
pulse falls outside the pulse height, duration, or response time specifications when new, or when
a trend shows that, for example, the pulse height is trending downward and the pulse duration is
increasing. Evaluation of such trends is referred to as signature analysis (e.g., Wowk 1991).
Once the circuit precursor failure condition is identified, further investigation of the boards in the
circuit can be performed.
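
The two warning conditions for a return pulse, an out-of-specification measurement and an adverse trend, can be sketched as follows. The parameter names, limits, and measurement histories are hypothetical, and the trend test shown (strictly rising or falling across successive tests) is a deliberately simple assumption.

```python
def out_of_spec(measured, spec):
    """Names of pulse parameters that fall outside their (low, high) limits."""
    return sorted(k for k, (low, high) in spec.items() if not low <= measured[k] <= high)


def strictly_trending(values, increasing):
    """True if every successive value moves in the stated direction."""
    if len(values) < 2:
        return False
    pairs = list(zip(values, values[1:]))
    if increasing:
        return all(later > earlier for earlier, later in pairs)
    return all(later < earlier for earlier, later in pairs)


# Hypothetical as-new specification and one return-pulse measurement.
spec = {"height_v": (4.5, 5.5), "duration_us": (9.0, 11.0), "response_us": (0.0, 2.0)}
measured = {"height_v": 4.4, "duration_us": 10.5, "response_us": 1.2}
print(out_of_spec(measured, spec))

# Hypothetical histories from successive periodic tests.
heights = [5.0, 4.9, 4.8, 4.6]
durations = [10.0, 10.2, 10.3, 10.6]
print(strictly_trending(heights, increasing=False) and strictly_trending(durations, increasing=True))
```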
An example of this technique that has been used for a long time period in nuclear plants (but not
in circuit boards) is the examination of steam generator tubes for changes in wall thickness using
an eddy current probe. The probe produces a signal at a given frequency which when linked
magnetically with the tube induces perturbations in the measurement circuit when flaws are
encountered near the probe (EPRI 1992). The methods used can include simple signals up to
very complex signal analysis.
Pattern Recognition
For more complex interrogation signals, such as existing white noise or a pseudo-random input
signal, the result can be displayed in the frequency domain using a spectrum analyzer to view
amplitude and phase angle versus frequency. The analysis of the frequency response spectrum
can decompose the response signal into its frequency components, so that it is possible to
evaluate the aging impact on specific components within the circuit or I&C board that impact
that frequency range. Aging induced changes in capacitors and transformers on the boards will
shift frequency peaks and valleys in the frequency spectrum when filtered through the system
response characteristics for typical capacitor circuits as shown in EPRI 2002 and EPRI 2003.
Degradation warnings can be given when the frequency display pattern differs from a standard
pattern for the I&C board or the amplitude at a frequency becomes too low or high. This
approach is referred to as pattern recognition (Fukunaga 1990). It has found many successful
applications in the measurement and correction of vibrations in mechanical systems (Wowk
1991). The same principles could possibly be applied to circuit boards.
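
A comparison of a measured spectrum against a standard pattern can be sketched as below. The example uses a direct discrete Fourier transform on synthetic sine waves; the signals, the shift of the peak from one frequency bin to the next, and the tolerance are hypothetical stand-ins for a board response before and after aging.

```python
import cmath
import math


def amplitude_spectrum(samples):
    """Normalized DFT magnitudes for bins 0..N/2 (direct O(N^2) transform)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


def peak_bin(spectrum):
    """Index of the largest non-DC bin."""
    return max(range(1, len(spectrum)), key=lambda k: spectrum[k])


def deviating_bins(spectrum, reference, tolerance):
    """Bins whose amplitude differs from the reference pattern by more than tolerance."""
    return [k for k, (a, b) in enumerate(zip(spectrum, reference)) if abs(a - b) > tolerance]


N = 64
# Hypothetical standard pattern (peak at bin 4) and aged response (peak at bin 5).
reference = amplitude_spectrum([math.sin(2 * math.pi * 4 * i / N) for i in range(N)])
aged = amplitude_spectrum([math.sin(2 * math.pi * 5 * i / N) for i in range(N)])
print(peak_bin(reference), peak_bin(aged), deviating_bins(aged, reference, 0.1))
```

The deviating bins point to the frequency range that changed, which is the information needed to relate a shift to particular components.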
Frequency Analysis
Active frequency analysis uses the amplitude and phase angle versus frequency response to an
input signal (Uhrig 1970). The input signals can be white noise, pseudo random noise, or pulses.
Ideal white noise includes equal amplitude inputs for all frequencies. Pseudo-random noise can
be generated using a series of pulses with various durations to represent a white noise signal. In
this method models of the physical parameters of the device being tested are compared with the
peaks in the frequency profile to identify causes of the peaks, causes of changes in the peaks, and
changes in the spread of a frequency peak or valley. In this way specific components on the
board within the circuit that need replacement can possibly be identified prior to board removal.
This technique is similar to the others in this method except that the development of the test
signal and analysis of the results is far more complex. With the success of the other active
measurement techniques and software development, this technique could be adapted to circuit
board analysis.
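
One common way to produce a pseudo-random pulse train of the kind mentioned above is a linear feedback shift register. The report does not specify a generator, so the sketch below is an assumption: it uses the widely published 7-bit sequence with feedback polynomial x^7 + x^6 + 1, which repeats every 127 bits, and maps the bits to pulses of +1 and -1.

```python
def prbs7(seed=0x7F, length=127):
    """Bits of a 7-bit LFSR sequence (polynomial x^7 + x^6 + 1, period 127)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero")
    bits = []
    for _ in range(length):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        bits.append(new_bit)
        state = ((state << 1) | new_bit) & 0x7F
    return bits


def to_pulses(bits, amplitude=1.0):
    """Map bits to a two-level pulse train; runs of equal bits form longer pulses."""
    return [amplitude if b else -amplitude for b in bits]


bits = prbs7()
pulses = to_pulses(bits)
print(len(bits), sum(bits))
```

Runs of identical bits give pulses of differing durations, which is what spreads the signal energy across many frequencies.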
Advantages
The trend in circuit degradation can be monitored on a continuous basis or queried manually by
an analyst.
Under the correct setup of hardware and software an analyst can query the data during operation.
The monitoring process can be set up for automated warnings using software and preset warning
points.
As experience is gained, the patterns can be catalogued for future use, which enhances the ability
to diagnose specific component failure modes.
May permit automation of the signal analysis using a process and system to convert measured
aging data and trends into a condition monitoring or operational assessment analysis.
Correlations or models need to be established to estimate the probability of circuit failure due to
an aging failure mode before the next test interval to support probability estimates.
Disadvantages
This method requires an external diagnostic signal interaction with an operating circuit when
using pulsing or pseudo random interrogation signals to measure the circuit transfer function.
Care must be exercised to ensure that the interrogation signal does not interfere with the normal
function of the I&C board.
The use of an internally generated noise signal for monitoring requires an engineering model of
the circuit suitable for comparison and trending to fully interpret the aging induced changes in
circuit operation.
As can be seen from the tables, very few advanced techniques have been demonstrated for
detecting precursors of failure. However, all techniques except for visual inspection and
reliability modeling are likely to directly detect a failed circuit when applied correctly.
This technical review assumes the following process for addressing a circuit board precursor of
failure.
1. Appropriate resources are applied to make the detection system workable for detecting
anomalies and identifying failure trends at a circuit level.
2. Upon detection of a circuit anomaly by measures at the circuit level, additional investigation
would be applied to pinpoint the board with a failing component.
3. It is assumed that successful repair of a circuit can be performed while the plant is operating
(e.g., in control circuits the feedback loop can be replaced by a manual adjustment while the
circuit is under repair, in protection circuits, a 2 of 3 redundant signal arrangement can be
temporarily replaced by a 2 of 2 or 1 of 2).
4. An individual component failure on a circuit board can be pinpointed via bench testing, and
a decision about repair of the component on the board or replacement of the board can be
made.
5. Circuit restoration can be performed on line by replacing the board with a new board or with
a repaired original board.
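
The temporary change in voting arrangement mentioned in step 3 can be sketched as a k-of-n vote. This is a simplified illustration only; the channel states are hypothetical, and an actual protection system involves far more than this logic.

```python
def k_of_n_trip(channel_trips, k):
    """True when at least k of the supplied channels are demanding a trip."""
    return sum(bool(t) for t in channel_trips) >= k


# Normal arrangement: 2 of 3 channels must agree.
print(k_of_n_trip([True, False, True], 2))

# One channel removed for repair: the remaining two channels can vote 1 of 2 or 2 of 2.
remaining = [True, False]
print(k_of_n_trip(remaining, 1), k_of_n_trip(remaining, 2))
```

With one channel out, 1 of 2 favors actuation on a single demand while 2 of 2 favors avoiding false actuation, which is the trade-off behind the choice of temporary arrangement.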
Table 4-3
Review of the Potential for Techniques to Identify Precursors of Aging Failure in Electronic Components
Techniques rated against each precursor condition:
Method | Description | Detection measurement technique | What is detected
1 | Periodic inspections | Functional testing | Detects a circuit malfunction
1 | Periodic inspections | Visual inspections | Observes card damage
2 | Reliability modeling | MTBF calculation | Statistics from similar components
3 | Measures of resistance | Resistance testing | Resistance change to ground in entire circuit
3 | Measures of resistance | Leakage current testing | Leakage current increase in entire circuit
4 | Signal comparison measures | Auto sampling testing | Difference in response between redundant circuits
5 | Passive measurement systems | Circuit parameter testing | Changes in passive measures of circuit outputs
5 | Passive measurement systems | Environmental parameter testing | Changes in environmental conditions of circuit card
6 | Active measurement systems | Signal change testing | Measures of change to a simple current or voltage pulse
6 | Active measurement systems | Signature analysis testing | Evaluation of test signal input to output changes
6 | Active measurement systems | Pattern recognition testing | Comparison to a spectral standard
6 | Active measurement systems | Frequency analysis testing | Trend analysis of spectral response to a complex signal
Table 4-4
Review of the Potential for Techniques to Identify Precursors of Aging Failure in Integrated Circuits and Chips
Techniques rated against each precursor condition:
Method | Description | Detection measurement technique | What is detected
1 | Periodic inspections | Functional testing | Detects a circuit malfunction
1 | Periodic inspections | Visual inspections | Observes card damage
2 | Reliability modeling | MTBF calculation | Statistics from similar components
3 | Measures of resistance | Resistance testing | Resistance change to ground in entire circuit
3 | Measures of resistance | Leakage current testing | Leakage current increase in entire circuit
4 | Signal comparison measures | Auto sampling testing | Difference in response between redundant circuits
5 | Passive measurement systems | Circuit parameter testing | Changes in passive measures of circuit outputs
5 | Passive measurement systems | Environmental parameter testing | Changes in environmental conditions of circuit card
6 | Active measurement systems | Signal change testing | Measures of change to a simple current or voltage pulse
6 | Active measurement systems | Signature analysis testing | Evaluation of test signal input to output changes
6 | Active measurement systems | Pattern recognition testing | Comparison to a spectral standard
6 | Active measurement systems | Frequency analysis testing | Trend analysis of spectral response to complex signal
Ratings by technique:
Precursor condition | | Functional | Visual | MTBF | Resistance | Leakage current | Auto sampling | Circuit parameter | Environmental parameter | Signal change | Signature analysis | Pattern recognition | Frequency analysis
Weak supply current, loss of R, L or C) | Unstable supply pin loss of memory @ FF… | u | n | n | u | u | u | n | n | y1 | y1 | y1 | y1
Supply current is too high, failure of junction boundary | Memory cannot be reburned | u | n | n | u | u | u | n | n | y1 | y1 | y1 | y1
Leakage current of an output transistor is out of spec. | Checksum does not agree with standard | u | n | n | u | u | u | n | n | y1 | y1 | y1 | y1
High temperature operation | Gates have unstable output | u | n | n | u | u | y1 | n | n | u | y1 | y1 | y1
Corrosion induced intermetallic layers growth at interfaces | Output voltage of a gate oscillates | u | n | n | u | u | y1 | n | n | u | y1 | u | y1
Table 4-5
Review of the Potential for Techniques to Identify Precursors of Aging Failure on Circuit Boards
Techniques rated against each precursor condition:
Method | Detection measurement technique | What is detected
1 | Functional testing | Detects a circuit malfunction
1 | Visual inspections | Observes card damage
2 | MTBF calculation | Statistics from similar components
3 | Resistance testing | Resistance change to ground in entire circuit
3 | Leakage current testing | Leakage current increase in entire circuit
4 | Auto sampling testing | Difference in response between redundant circuits
5 | Circuit parameter testing | Changes in passive measures of circuit outputs
5 | Environmental parameter testing | Changes in environmental conditions of circuit card
6 | Signal change testing | Measures change to a simple current or voltage pulse
6 | Signature analysis testing | Evaluation of test signal input to output changes
6 | Pattern recognition testing | Comparison to a spectral standard
6 | Frequency analysis testing | Trend analysis of spectral response to a complex signal
Ratings by technique:
Precursor condition | | Functional | Visual | MTBF | Resistance | Leakage current | Auto sampling | Circuit parameter | Environmental parameter | Signal change | Signature analysis | Pattern recognition | Frequency analysis
Cracked coatings | Resistance increase or decrease | n | n | n | u | u | y1 | u | n | u | u | u | y1
Separated or bowed boards, cleaning process faults | Resistance change | n | n | n | u | u | y1 | u | n | y1 | y1 | y1 | y1
Excessive dust or pollution on board and components | Resistance fluctuation | n | n | n | u | u | y1 | u | n | y1 | y1 | y1 | y1
5
SELECTING A METHOD FOR MONITORING CIRCUIT BOARD AGING
This section describes the framework element process presented in Section 1 for selecting
methods to detect specific aging failure modes in circuit boards. Such decision-making
processes are applied not when you know exactly what to do, but when you don’t know what to
do. The process of balancing conflicting issues and areas of uncertainty in selecting a path to
pursue is the point of true decision-making. This section addresses decision making in the
framework for systematic management of aging circuit boards in power plant applications by
defining key decision elements, discussing the elements, and putting them into logic pathways
that lead to selection of a method or technique that can meet desired circuit monitoring
objectives.
The emergence of new electronic tools for detecting circuit changes presents an opportunity to
upgrade the present approaches for managing aging of electronic circuit boards. The twelve
detection techniques discussed in Section 4 are listed in Table 5-1.
Table 5-1
Techniques for Detecting Precursors and Progression of Aging Failures in Circuits
Each method provides a unique basis for upgrading the detection of aging conditions in
electronic circuits. Also, the improved detection systems can provide more precise predictions.
Development of a relationship between the measurable parameters and probability of failure is
needed to support more quantitative condition monitoring and operational assessment processes,
and the potential for automated aging monitoring.
To establish decision-based concepts and questions for selecting improved techniques, a set of
key considerations for the decision needs to be established. The primary question to answer is
“Should an improved process be used to predict circuit board failure because of aging?” To
answer this question, the following set of considerations was developed to support the
decision-making process.
• What is the importance of a specific circuit containing an I&C board to plant operation?
• Are precursors to failure on the circuit board, due to component aging, easily observable
under existing operational conditions?
• Can use of a new detection method technically improve on existing approaches for
monitoring board-aging trends?
• Does the existing detection method support evaluation of failure probability to use in setting
next inspection or repair interval (e.g., condition monitoring to assess the existing remaining
life and operational assessment to evaluate failure probability over the next operational
period)?
• Does the existing method support a workable process for repairing or replacing boards?
Existing methods generally use periodic testing and inspections to identify aging failures. The
circuit test interval is usually based on the refueling outage schedule. The interval between tests
has been changed based on evaluations of the MTBF using reliability models. This is a
combination of methods 1 and 2 listed in Table 5-1. This combination is clearly acceptable for
most circuits during the operating period. The use of reliability modeling to establish a better
test interval is a first step in developing an improved method for monitoring and predicting aging
failures.
In order to improve the prediction of aging induced failures on circuit boards, it is necessary to
use models of aging and the measurement of failure mode precursors to characterize the
condition of the circuit board. Evaluation of the measurements to predict failure requires some
type of model or correlation. Models of the failure probability given measures can include
engineering judgments, statistical evaluations, POF models, trending of measures, reliability
models supported by the measured variables, combinations of trending and correlations of
measures, etc. An objective of the modeling is to produce a probability of failure estimate for
the past interval between tests and a projection into the next test interval. These processes are
called condition monitoring and operational assessments. They permit a quantitative evaluation
of the probability of circuit board failure based on specific performance measures of the circuit
board and components.
Selecting an improved measurement process for identifying and repairing circuit boards that
suffer from aging failures requires a decision process. Considerations in such decisions should
include: (1) an assessment of the importance of the I&C board to plant operation, (2) an assessment
of the ability to observe the impact of component aging failure modes on the circuit, (3) an
assessment of the technical viability of the method to improve on contemporary approaches, (4) its
ability to evaluate the existing condition (see footnote 6) based on objective measures, and (5) support for
replacing boards on a technical basis, which might use a correlation database. These decision
process elements are discussed below.
What is the importance of a specific circuit containing an I&C board to plant operation?
Determining the importance of a circuit is the first step in the decision tree framework. The
consideration of which monitoring method to use assumes that the importance of the circuit to
plant operation and safety can be determined. Several approaches have been developed for
assessing the importance of Structures, Systems, or Components (SSCs). Two approaches are
summarized here as examples: the maintenance rule (NRC 2000) questions and the AP-913
approach (INPO 2003).
The first importance assessment approach is derived from language in the maintenance rule
(NRC 2000). The maintenance rule classification system includes five class boundaries (Q1 to
Q5), which can be assessed as yes/no conditions for each system. The supporting circuits and
circuit boards should have the same classification. Based on the I&C board importance class, a
method for measurement and evaluation of aging failures can be selected. Table 5-2 provides a
listing of questions where a yes/no helps identify circuits that may be considered important to
plant operation.
Footnote 6: This supports condition monitoring in assessing the failure probability and the associated
uncertainty, as well as operational assessment to evaluate the probability of failure before the next test.
Table 5-2
Maintenance Rule Questions to Assess Circuit Importance
Class | Structures, Systems, or Components (SSCs) | Typical Circuit Failure Logic Models in Probabilistic Safety Assessments (PSAs)
Q1 | Is the circuit a safety-related structure, system, or component? | Circuit functions modeled in logic combining basic events in PSAs
Q2 | Is the circuit a non-safety related SSC that is relied upon to mitigate accidents or transients? | Circuit failure impacts modeled as part of SSC failure rate
Q3 | Is the circuit a non-safety SSC that is used in plant emergency operating procedures (EOPs)? | Circuit failure impacts modeled as part of Human Reliability Assessment (HRA) failure probability
Q4 | Is the circuit a non-safety related SSC whose failure could prevent safety-related SSCs from fulfilling their safety related function? | Circuit failure impacts modeled as part of SSC failure rate in PSAs
Q5 | Is the circuit a non-safety related SSC whose failure could cause a reactor scram or actuation of a safety related system? | Circuit failure impacts modeled by grouping into statistical estimates of trip, not enough detail for aging impact.
The second method, discussed in INPO 2003, uses classification elements to define SSCs as
critical, non-critical, and run-to-failure. For circuits whose failure has no impact on the plant, no
special monitoring is needed (i.e., run to failure). For circuits considered to be non-critical,
monitoring methods 1 and 2 can be used as currently employed in many plants. For critical
circuits, improved monitoring of circuits can be considered.
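
The three-way classification and the monitoring choice that follows from it can be sketched as below. The two yes/no inputs are a hypothetical simplification of the INPO 2003 classification elements, introduced only for illustration.

```python
MONITORING_BY_CLASS = {
    "run-to-failure": "no special monitoring",
    "non-critical": "monitoring methods 1 and 2",
    "critical": "improved monitoring can be considered",
}


def classify_circuit(failure_impacts_plant, defeats_important_function):
    """Simplified classification using two hypothetical yes/no inputs."""
    if not failure_impacts_plant:
        return "run-to-failure"
    return "critical" if defeats_important_function else "non-critical"


circuit_class = classify_circuit(failure_impacts_plant=True, defeats_important_function=False)
print(circuit_class, "->", MONITORING_BY_CLASS[circuit_class])
```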
Critical Circuits
With the maintenance rule as a base, a refined method for classification of critical SSCs has been
provided by INPO 2003. The classification process identifies and assesses SSCs that are
associated with the performance of a specific function. Active and passive elements of each SSC
are considered.
If a failure of the SSC defeats or degrades an important function or a function that is redundant to
an important function, then it is a critical component, and analysis should be continued. In
general, if a component failure within a circuit prevents the performance of an emergency
operating procedure, or prevents the mitigation of the consequences of accidents that could result
in potential off-site exposure in excess of 10 CFR 100 limits, or requires an operator workaround
to prevent any of the classification elements in Table 5-3 from performing any of the above
functions or procedures, then the circuit is considered a critical circuit.
Table 5-3
Classification of Circuit Types Following AP 913 (INPO 2003) Elements
Failure to control a critical safety function (e.g., reactor water level and pressure) | Control or protection circuits | Revealed component failure
Degraded capability to shut down the reactor and maintain it in a shutdown condition | Control or protection circuits | Unrevealed component failure discovered during testing
Run-to-Failure Circuits
A run-to-failure circuit is one for which the risks and consequences of failure are acceptable
without any predictive or repetitive maintenance being performed and there is not a simple, cost-
effective method to extend the useful life of the component. The circuit should be run until
corrective maintenance is required.
Non-Critical Circuits
In between the categories of critical and run-to-failure, there are a number of SSCs for which
cost-effective preventive maintenance makes sense. If failure of the SSC results in any of the
conditions listed below, then a non-critical analysis should be performed. Otherwise, the circuit
can be classified as run to failure.
1. Circuit failure creates an unacceptable increase in personnel, industrial, environmental or
radiological safety hazard (e.g., drift of radiation monitors).
2. The circuit has a history of unacceptably high repair, replacement, or operational cost.
3. Circuit failure represents an operator or maintenance burden (e.g., operator manually
operates the feedback control input).
4. The circuit is obsolete, in short supply, or very expensive to repair or replace.
5. There is a long lead-time for replacement parts, which prevents a required circuit from being
repaired in a timely fashion.
6. The circuit operation is necessary for work on critical equipment (for example, containment
isolation control).
7. Circuit failure promotes failure of other components (e.g., failure of control circuit causes
over torque on valve stem).
8. There is a potential for new risks from hazardous chemicals or environmental concerns (e.g.,
spurious operation of drain valve on a storage tank containing radioactive material).
9. Failure results in a power transient, sustained generation loss or reduction in the necessary
redundancy or defense-in-depth (e.g., primary control systems for reactor power).
10. Failure may lead to regulatory consequences.
11. Circuit failure will hamper or prevent timely repair of a critical component.
12. It is more cost-effective to maintain the circuit, as opposed to repair or replacement.
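The three-way classification described above can be sketched in a few lines. This is only an illustration of the screening order (critical first, then the twelve non-critical conditions, otherwise run to failure); the `Circuit` record, its field names, and the example circuits are hypothetical and do not come from AP-913.

```python
from dataclasses import dataclass, field


@dataclass
class Circuit:
    """Hypothetical screening record for one circuit."""
    name: str
    # True if failure defeats or degrades an important function
    # (or a function redundant to an important function).
    defeats_important_function: bool = False
    # Item numbers (1-12) from the non-critical list that apply.
    noncritical_conditions: set = field(default_factory=set)


def classify(circuit):
    """Return 'critical', 'non-critical', or 'run-to-failure'."""
    if circuit.defeats_important_function:
        return "critical"
    if circuit.noncritical_conditions:
        return "non-critical"
    return "run-to-failure"


# Illustrative circuits only; the names and flags are assumptions.
trip_logic = Circuit("trip logic", defeats_important_function=True)
rad_monitor = Circuit("radiation monitor", noncritical_conditions={1, 4})
spare_indicator = Circuit("spare indicator")
```

In this sketch a circuit that meets the critical test is never downgraded by the non-critical list, which mirrors the order in which the text applies the categories.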
Are precursors to failure on the circuit board, due to component aging, easily
observable under existing operational conditions?
The differences between control circuits and protection circuits have an impact on which method
might provide the most precise information for monitoring the aging processes. Specific types of
circuits may be better suited for one type of diagnostic procedure than another. In some control
circuits, changes in the way the plant operates can be detected by operator observation. In this
case operator observation could initiate a circuit inspection to correct for unusual behavior of the
plant responses. For important control and plant protection circuits, some advanced methods may
be useful in developing a more board-specific program that would help avoid the potential for
increased trips and power reductions. To determine the observability of aging in circuits, the
following sections address the types of circuits where circuit boards containing electronic
components are found.
Control Circuits
The control circuits for continuous feedback systems are either analog or digital. Most
continuous controllers can be built using analog electronics. Figure 5-1 shows typical analog
control circuit elements, with various I&C boards at points where signal comparison,
adjustment, conditioning, or combinations of these can be found. Feedback circuits that
contain older relay logic circuits can be replaced by software logic to perform the same function
(EPRI 2003). This introduces a new set of important failure modes for control circuits,
such as electromagnetic interference (e.g., from cell phones), silent failures, and conditions in
which a physical fail-safe state is not possible (DiSandro and Torok 2004).
[Block diagram. Labeled elements: manual; local; signal conditioner; plant computer; A/D converter; plant computer network; monitor; general circuit board.]
Figure 5-1
Example Analog Control Circuit With Circuit Boards
In Figure 5-1 the electronic elements shown can be on a single board or be distributed on several
boards. In many plants the control circuits are isolated by analog to digital converters. The
digital outputs are generally suitable for communication on computer networks. The signal
drawn from an analog control circuit is sent to a central processing computer that provides
monitoring signals to the control room and elsewhere.
Originally, US nuclear plants used analog circuits that were not computer controlled; however,
most new process control systems are now digitally controlled using computer systems. Some
specific circuits have been converted to digital control systems but have a limited computer
interface for controlling the plant. Computers are used mainly for monitoring functions. Figure
5-2 provides an example of how an analog control circuit could be modified to become a digital
system. In this case a digital controller that performs the same control task as the continuous
analog controller can replace the analog controller feedback loop. The basic difference between
an analog and digital controller is that the digital system operates on samples of the sensed signal
rather than on continuous signals. Recently many control systems in nuclear plants have been
converted to digital control. The control interface is typically through dedicated wiring rather
than through a computer network. This has an impact on how an aging monitoring system could
be established, on the quality of the signals used for monitoring, and on the frequency range of
the signal passed from the analog to the digital system.
[Block diagram. Labeled elements: manual; local; A/D converter; signal conditioner; clock; plant computer; A/D converter; monitor.]
Figure 5-2
Example Digital Upgrade Control Circuit
Protection Circuits
Protection circuits are characterized by redundant measures and voting logic to prevent one false
signal from triggering a safety action. They can be either signal interruption or signal sending
circuits. A unique feature of protection circuits is that for a portion of the circuit there are
redundant paths for redundant circuits. This feature allows for the possibility of comparing
signals in the redundant pathways as a means of detecting both internal circuit aging and changes
in the plant or sensors that are reflected as differences in the signals in the redundant trains.
Thus, from a monitoring viewpoint, redundant circuit boards can be compared to identify
differences in the associated circuit board measures of voltage and current as a measure of aging.
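A minimal sketch of this signal comparison idea follows. It flags any redundant train whose reading departs from the median of all trains by more than a tolerance. The train names, the readings, the tolerance, and the choice of the median as the reference are all assumptions made for illustration. As the text notes, a flagged difference may reflect a plant or sensor change rather than board aging, so a flag is a prompt for investigation, not a diagnosis.

```python
from statistics import median


def flag_divergent_trains(readings, tolerance):
    """Return the sorted names of trains whose reading differs from the
    median of all trains by more than `tolerance` (same units as readings)."""
    reference = median(readings.values())
    return sorted(
        name for name, value in readings.items()
        if abs(value - reference) > tolerance
    )


# Hypothetical voltages (V) measured at the same point in four redundant trains.
example = {"A": 5.01, "B": 4.99, "C": 5.00, "D": 5.20}
flagged = flag_divergent_trains(example, tolerance=0.05)
```

The median is used here because a single drifting train does not pull it far from the healthy trains; with only two trains, the comparison can show a difference but cannot by itself indicate which train has drifted.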
Figure 5-3 shows the key elements of a protection circuit. The lower portion shows an example
circuit train with identified integrated circuits that could be used in a digital voting system (Kim
2002). The elements can be on one circuit board or distributed to locations along the circuit
pathway.
[Block diagram. Labeled elements: two sensor/transformer inputs, each with a signal conditioner; note "Functions may be on one circuit board"; signal conditioner; voting circuit; signal conditioner; trip signal.]
Figure 5-3
Example Plant Protection Logic Circuit With Train Components
Thus, knowledge of the circuit details is needed to determine the feasibility of a monitoring
upgrade, i.e., which elements of a circuit can be monitored and observed, and which method or
technique could be applied. As with control circuit inputs, the older relay logic circuits
used in many protection systems can be replaced by software logic to perform the same function
(EPRI 2003). This introduces a new set of important failure modes for protection circuits,
such as silent failures and conditions in which a physical fail-safe state is not possible
(DiSandro and Torok 2004).
Can use of a new detection method technically improve on existing approaches for
monitoring board-aging trends?
This question is made more specific in the logic trees to answer whether aging failure modes of
components are detectable by existing test methods (i.e., 1 & 2). If the answer is no, then Table
5-4 can be used to provide a link between existing and advanced methods of detection and the
approaches for monitoring the aging impacts.
Table 5-4
Relationship Between the Detection Method, Technique and Monitoring Category
Method 4 | Technique 6 | Signal comparison measures | Predicts failure based on signal differences in redundant portion of circuits
Method 5 | | Passive measurement systems | Monitor for change in circuit characteristics to predict aging
Method 5 | Technique 7 | Passive parameter monitoring | Predicts failure based on passive measures of circuit outputs such as current and voltage
Method 5 | Technique 8 | Environmental parameter testing | Predicts failure based on environmental conditions of circuit card
Method 6 | | Active measurement systems | Monitoring for detection of aging at circuit level (change in output indicates failure of capacitors and transformers)
Method 6 | Technique 9 | Signal change testing | Predicts failure of circuits based on measures of change to a simple current or voltage pulse
Method 6 | Technique 10 | Signature analysis testing | Predicts failure based on evaluation of test signal input output characteristic
Method 6 | Technique 11 | Pattern recognition testing | Predicts failure based on comparison with a spectral standard for the specific circuit or card
Method 6 | Technique 12 | Frequency analysis testing | Predicts failure based on analysis of the spectral response to an active signal
The technical capability to address various aging failure modes was discussed in Section 4. In
general, as the measuring process becomes more sophisticated the monitoring results can become
more precise in predicting both the timing and accuracy of a result such as the MTBF.
Detection Methods
The first issue to address for this decision element is whether a method can detect the aging
failure modes associated with the components (e.g., resistors, transformers, and capacitors) on the
board. For resistors, methods 3 to 6 can detect changes that would directly identify aging
failure modes. For transformers, methods 3 to 6 can also be used; methods 3 to 5 require
a little finesse to detect gradual aging failures, while method 6 can be more precise.
The most difficult component is the capacitor. Use of methods 3 to 5 will require a lot
of finesse on the technician's part. Method 6 should be able to detect capacitor problems by
tracking the shift in frequency response of the overall circuit. The technical underpinning for
system response changes is discussed in EPRI 2004 Appendix A.
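To illustrate why a frequency response shift can reveal capacitor aging, the sketch below uses a single-pole RC low-pass stage, whose cutoff frequency is 1/(2πRC). The single-pole model and the component values are assumptions chosen for illustration; real circuit boards have more complex responses, and this is not the analysis in EPRI 2004 Appendix A.

```python
import math


def cutoff_hz(resistance_ohm, capacitance_farad):
    """Cutoff (-3 dB) frequency of a single-pole RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_farad)


def cutoff_shift_fraction(resistance_ohm, c_nominal, c_aged):
    """Fractional change in cutoff frequency when capacitance ages
    from c_nominal to c_aged (both in farads)."""
    f_nominal = cutoff_hz(resistance_ohm, c_nominal)
    f_aged = cutoff_hz(resistance_ohm, c_aged)
    return (f_aged - f_nominal) / f_nominal


# Hypothetical stage: 10 kilohm with a 1 microfarad capacitor that has lost 20%.
shift = cutoff_shift_fraction(10e3, 1.0e-6, 0.8e-6)
```

Because the cutoff varies as 1/C, a 20% loss of capacitance raises the cutoff of this idealized stage by 25%, a change that a swept or multi-tone test signal could in principle track over time.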
Monitoring Approaches
Any reasonable detection system will need to monitor the circuit as a whole. In the case of
control circuits, the output response to a repeatable input test signal can be compared with
previous test results saved in a database. In principle, as the circuit ages, trends can be detected
before a complete failure by comparing recent outputs with previous results. When using electronic
signals as in methods 3 to 6, the sampling frequency can be increased so that the monitoring of
aging could be considered almost continuous. This can be done when the test signal does not
interfere with the normal function of the circuit.
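The comparison of recent results with stored results can be sketched as a simple trend projection. The example fits a least-squares line to a stored test history and estimates when the measured response would cross an acceptance limit. The linear drift model, the data, and the limit are assumptions for illustration; an actual aging trend need not be linear.

```python
def linear_trend(times, values):
    """Ordinary least-squares slope and intercept of values versus times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def time_to_limit(times, values, limit):
    """Projected time at which the fitted trend reaches `limit`, or None
    if the trend is flat or the crossing is not after the last test."""
    slope, intercept = linear_trend(times, values)
    if slope == 0:
        return None
    t_cross = (limit - intercept) / slope
    return t_cross if t_cross > times[-1] else None


# Hypothetical history: test index versus measured response time (ms).
history_t = [0, 1, 2, 3]
history_v = [10.0, 10.5, 11.0, 11.5]
projected = time_to_limit(history_t, history_v, limit=13.0)
```

A projected crossing time that falls before the next scheduled test would be one indication that the test interval or the replacement schedule deserves review.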
Even if the monitoring of the circuit is continuous, it will not detect rapid stressor-induced
failures such as a high voltage spike or a high-current-induced fault. If, however, the stress impact
leaves the system operable with some degradation, the detection system should identify changes
in the circuit response by changes in the measurable factors.
The ability to detect, monitor and interpret the changes can be formally addressed using a
condition monitoring approach. This uses test and inspection results to determine if the circuit is
suffering from aging effects. The condition monitoring approach can be supported with
statistical correlation models, reliability models or by judgment of the plant personnel to relate
the changes in response output to the aging failure mode and associated circuit board and
component on the board.
Does the existing detection method support evaluation of failure probability to use in
setting the next inspection interval?
Improved circuit aging evaluations require clarity of concepts to become more precise in
predicting failure based on the measurement of precursor and progressive failure conditions. The
definitions developed here can support probabilistic formulations that will permit the evaluation
of a failure probability and its uncertainty bounds.
• Condition monitoring – A process for evaluating the existing condition (e.g., probability of
failure) of components or systems based upon measures of operating performance.
• Operational assessment – A process for predicting the probability of failure over a future
time period based on measures of operating performance.
The important issue here is to understand how the measures that are collected and saved can be
converted into a probability of failure that can be used to decide what to do with the circuit board
and component that is suffering from aging indications. To answer the question a process needs
to be defined that relates the measured parameter(s) to a failure probability function. This is
where the existing reliability and POF models can provide a good starting point. If the detection
measure can be related to a failure rate, then reliability and POF models provide a
straightforward conversion to probability. In these cases the uncertainty in the measurement
parameter and conversion to failure rate can be used to evaluate the confidence level associated
with failure probability. The models can be set up for Monte Carlo simulations that can extend
the measured trends to provide the probability of failure and justification for the replacement
schedule based on either or both condition monitoring and operational assessment.
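A minimal Monte Carlo sketch of this conversion is given below. It assumes that the measured parameter drifts linearly, that the measurement error and the drift rate are normally distributed, and that failure corresponds to the parameter reaching a fixed limit. None of these modeling choices, nor the numerical values, come from this report; they are placeholders for whatever reliability or POF model applies to the circuit.

```python
import random


def prob_exceeds_limit(current, meas_sd, drift_mean, drift_sd,
                       limit, horizon, trials=20000, seed=0):
    """Estimate the probability that the parameter is at or above `limit`
    after `horizon` time units, sampling measurement and drift uncertainty.
    horizon = 0 corresponds to condition monitoring (the present state);
    horizon > 0 corresponds to operational assessment (a future period)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        true_value = rng.gauss(current, meas_sd)   # uncertain present value
        rate = rng.gauss(drift_mean, drift_sd)     # uncertain drift rate
        if true_value + rate * horizon >= limit:
            failures += 1
    return failures / trials


# Hypothetical inputs: present reading 10.0, limit 11.0, drift 0.5 per cycle.
p_now = prob_exceeds_limit(10.0, 0.05, 0.5, 0.1, limit=11.0, horizon=0)
p_next = prob_exceeds_limit(10.0, 0.05, 0.5, 0.1, limit=11.0, horizon=2)
```

In this sketch the spread of the sampled inputs carries through to the failure probability, which is the sense in which measurement uncertainty can be used to judge the confidence in the result.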
Statistical Correlations
If the measures and models are not compatible with the level of measurement in a method, then a
statistical approach can be used. In this case tests of circuit failure versus a measured parameter,
or combinations of parameters can be used to construct a correlation with statistical uncertainty
bounds. Then trends in the measured parameters can be compared with the correlation to
estimate the aging induced failure potential. The comparison can be performed graphically to
find best estimate values, or simulations can be used to establish the statistical confidence
bounds. The correlations can be set up for Monte Carlo simulations that can bridge between the
measured trends and the correlation to provide the probability of failure and justification for the
replacement schedule. Correlations that address both condition monitoring and operational
assessment requirements can be established.
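One simple way to build such a correlation with uncertainty bounds is to group test results by ranges of the measured parameter and compute the observed failure fraction in each range with a binomial confidence interval. The sketch below uses the Wilson score interval; the choice of interval, the bin labels, and the counts are assumptions for illustration and are not data from this project.

```python
import math


def wilson_interval(failures, tests, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 gives
    an approximately 95% interval)."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    p = failures / tests
    denom = 1.0 + z * z / tests
    center = (p + z * z / (2.0 * tests)) / denom
    half = z * math.sqrt(p * (1.0 - p) / tests
                         + z * z / (4.0 * tests * tests)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def correlation_table(bins):
    """bins: list of (label, failures, tests).
    Returns a list of (label, failure fraction, lower bound, upper bound)."""
    rows = []
    for label, failures, tests in bins:
        low, high = wilson_interval(failures, tests)
        rows.append((label, failures / tests, low, high))
    return rows


# Hypothetical bench-test counts grouped by measured parameter drift.
table = correlation_table([
    ("drift < 2%", 0, 10),
    ("drift 2-5%", 2, 10),
    ("drift > 5%", 5, 10),
])
```

A trended parameter value can then be looked up against the table to read off a best estimate and bounds for the aging-induced failure potential.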
Synopsis
Once data from the detection system are gathered and stored, at least two kinds of formal analysis
can be used to evaluate the probability of failure as it exists in the most recent test and then to
predict failure prior to the next test. Such analyses use existing models and statistical
correlations that are set up to perform either condition monitoring, to assess the existing
remaining life, or operational assessment, to evaluate failure probability over the next
operational period.
Is the combination of circuit type and method for repair or replacement non-intrusive to
plant operation?
In this decision element aging identification is assumed to be performed at the total circuit level.
Once circuit degradation or failure is identified then a search for the specific board within the
circuit begins. The most limiting case is when the plant must be shut down to repair the circuit.
In the least limiting case the electronic feedback circuit is replaced by a manual input while the
circuit is being repaired. In this case there is no plant outage required, but operators must be
extra vigilant and act as the feedback circuit.
Because existing diagnostic tools are imprecise in identifying the specific failed component, a
well-known process is to replace boards in the circuit one at a time until circuit operation is
fully restored. Degraded or failed components on the replaced boards can then be identified
through inspection and bench testing. Thus, the existing process is assumed to be that once a
problem is identified at the circuit level, troubleshooting is performed to identify the specific
board, and the board is replaced and sent to a lab where the failed component can be identified
and repaired, if this is deemed cost effective.
In some cases a series of boards might be replaced until the circuit is restored to full operation.
Good components are often replaced during this diagnostic procedure when the identification of
the circuit problem is imprecise and all the boards in the circuit are replaced. If the defective
board can be identified among the good boards through testing, and with a little finesse the
degraded components are pinpointed, they can be replaced and the circuit board restored and
placed in standby for future use. Because this process is labor intensive and diagnostic tools are
not precise in identifying degraded circuits, many utilities just replace the defective boards. This
is generally an accepted practice as long as the boards are not obsolete.
The hope is that advanced measurement systems can more precisely identify the board or
component that is undergoing aging degradation. With a more precise identification of a
component degradation mode, it might be possible to isolate the degraded part of the circuit
without having to shut the plant down. This would prevent a trip or power reduction during the
repair process.
Through more precise diagnostic methods, the aging degradation on a specific board or
component could be identified. This would significantly reduce reliance on troubleshooting
finesse to correct degraded circuit operation. This type of improvement can be measured as time
saved in troubleshooting during which the plant is exposed to an increased potential for trip.
Thus, when assessing the impact of this decision element, the existing repair process can be
compared with the future process.
The next section provides an integration of these decision elements into pathways. These
pathways help identify which method and approach can be applied for aging management in
electronic circuits. The most appropriate technique can then be associated with an upgrade for
aging monitoring in a specific circuit.
The decision process elements discussed above have been converted into decision mapping logic
to help utility plant personnel select a method for monitoring aging of I&C circuit boards. Since
slightly different methods apply to control circuits and plant protection circuits, two figures,
Figures 5-4 and 5-5 have been prepared. The difference is in the application of potential
methods for some of the logic pathways (F, G, and H). Moreover, the details of applying a
method can vary depending on whether the circuit is analog or digital, safety or control, or
software or signal controlled. The decision tree has been constructed with event elements so that
an integrated set of questions can be answered with yes or no depending on the current situation
for each circuit that is being considered for upgraded monitoring.
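[Decision tree diagram. Path end points: B, method 2 (reliability); C, methods 1 and 2 (reliability and inspection); D, method 3 (simple measurement); E, method 3 or 5 (passive measurement); F, method 5 or 6 (measurement); G, method 6 (active measurement); H, method 6 (active measurement); I, no monitoring.]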
Figure 5-4
Control Circuits’ Progressive Aging Decision Tree
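[Decision tree diagram. Path end points: B, method 2 (reliability); C, methods 1 and 2 (reliability and inspection); D, method 3 (simple measurement); E, method 3 or 5 (passive measurement); F, method 4, 5 or 6 (measurement); G, method 4 or 6 (active measurement); H, method 4 or 6 (active measurement); I, no monitoring.]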
Figure 5-5
Protection Circuits’ Progressive Aging Decision Tree
A decision to become more sophisticated in the prediction of aging failures can be made by
asking questions about each circuit, identifying the associated logic path, and then noting the
method recommended. There are several techniques within each monitoring method that can be
used to improve the detection of circuit aging.
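Once a logic path has been identified for a circuit, noting the recommended method amounts to a table lookup. The sketch below records only the path end points labeled B through I in Figures 5-4 and 5-5; it does not encode the yes/no questions that lead to each path, and the function and dictionary names are hypothetical.

```python
# Method numbers at each path end point, transcribed from Figures 5-4 and 5-5.
# An empty tuple means no monitoring.
RECOMMENDED_METHODS = {
    "control": {
        "B": (2,), "C": (1, 2), "D": (3,), "E": (3, 5),
        "F": (5, 6), "G": (6,), "H": (6,), "I": (),
    },
    "protection": {
        "B": (2,), "C": (1, 2), "D": (3,), "E": (3, 5),
        "F": (4, 5, 6), "G": (4, 6), "H": (4, 6), "I": (),
    },
}


def recommended_methods(circuit_type, path):
    """Return the candidate monitoring methods for a circuit type
    ('control' or 'protection') and a decision tree path letter."""
    return RECOMMENDED_METHODS[circuit_type][path.upper()]


def paths_that_differ():
    """Path letters whose candidate methods depend on the circuit type."""
    control = RECOMMENDED_METHODS["control"]
    protection = RECOMMENDED_METHODS["protection"]
    return sorted(p for p in control if control[p] != protection[p])
```

The lookup makes the difference between the two figures explicit: only paths F, G, and H change, and in each case the protection circuit gains method 4 as a candidate.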
Importance
The first decision element is to assess the importance of the I&C board to plant operation. A
circuit’s critical importance to the plant can be incorporated into the identification of a
monitoring technique by applying the logic described in Figures 5-4 and 5-5. In general, “yes”
applies to circuits whose failure has a high potential for causing trip or safety system
unavailability. Other circuits that are classified as non-critical following AP-913 (INPO 2003)
can be placed on the “yes” path, if the circuits are important to operation or have a history of
causing repeated replacement problems. Such non-critical circuits lend themselves to potential
cost reductions in maintenance and repair activities.
Methods for assessing the importance of SSCs can be extended to circuits. This will help identify
the priority of circuits that could be upgraded for condition monitoring and operational
assessments. Then circuit boards and components of a few circuits can be identified for
improved aging management. Processes for identifying the importance of equipment in the plant
have been developed by both industry (INPO 2001) and the NRC (NRC 2000), and some
utilities have implemented them to include electronic circuits.
Observability
The second decision element is to determine if the existing ability to observe the impact of
component aging failure modes on the circuit is adequate for the operation and safety aspects of
the plant. In some cases the operators can observe circuit problems and switch to manual under the
existing observability of circuit degradation. This ability can vary widely depending on analog
or digital circuit types, location, environment, and board function such as control or protection.
Existing methods for noting circuit aging rely on observing failure in one circuit, which is
possible through functional testing and then extending this to similar circuits via reliability
modeling. Examples of measurement-based systems can be developed to help operators
recognize early indications of aging failure modes before they turn into full circuit failures.
The existence of observable precursors to an event can be identified by collecting and analyzing
events involving circuit failures that lead to spurious trips, system unavailability, and failure to
initiate a backup system. A listing of observable circuit features can be developed using
existing monitoring instruments, and tools for more precise and dependable observations.
Detectability
The third decision element is an assessment of the technical ability of a method to detect aging
conditions more effectively than the existing approach: does the proposed improved
method actually have the ability to detect component-aging failures? Tables 4-3, 4-4
and 4-5 in Section 4 provide preliminary judgments about the technical capability of a technique
to detect various hardware aging failure modes. This is probably the most difficult question to
answer precisely, since all the detection systems will require some trending and judgment to
pinpoint the exact component with the age-related degradation.
To detect aging within a circuit, detection methods that can measure changes in typical circuit
boards and hardware such as chips and circuits (A/D converters, isolators, etc.) can be identified
and tested. One way to start is to survey utilities for applications already in use related to
methods 3 to 6. Some of these systems may have been applied in monitoring mechanical and
electrical components other than electronic circuits.
A next step is to set up and test measurement systems by integration of existing equipment for
specific application in methods 3 to 6. This involves integration of signal generation and
measurement hardware with software to build a system capable of measuring circuit changes.
The equipment should be able to generate test signals that do not interfere with the circuit
function, use networking software, capture and store data, evaluate trends, and generate
automatic warnings for aging issues. Integration of such components can be used to build
systems that apply to methods 3 to 6.
Predictability
The fourth decision element is an assessment of the ability to improve on the accuracy of the
current failure prediction. This generally requires an evaluation of the failure probability due to
aging at a point in time. Two techniques have been proposed to support condition monitoring
and operational assessments: reliability models, and statistical correlations that relate a
measured parameter to the likelihood of failure.
If circuit measures can be obtained, stored, and trended, the next step is to develop models
and/or correlations that can be used to translate between the measured parameters and the aging
impact. Such models or correlations between the aging extent and a component failure have been
used to establish precise repair strategies. The ability to predict is applied statistically in
method 2, whereas it is applied to specific circuits in methods 3 to 6.
Repairability
The fifth decision element is to determine if the selected method can improve the repair process.
In the case of run to failure, the consequence could be an unexpected loss of the circuit, which
could trip the plant or cause unrevealed system unavailability. In these cases the plant might
have to be shut down during the repair period. In the case of an aging detection system, the
advance warning and pinpointing of the circuit board(s) could support switching to manual circuit
operation during the replacement of the circuit boards. This would avoid a plant outage for
circuit repair. Any improved method should also improve the repair element.
A survey can be performed to determine whether utilities currently repair circuits on-line or
wait until a plant shutdown. This will provide a baseline repair process and produce information
for developing improvements for measurement, monitoring and evaluation systems.
The decision process outlined above can be applied to any circuit (or circuit board) in the plant.
Once clear answers to the questions are established, a method for addressing aging degradation
on the circuit board can be selected. The following example applications for resistors,
capacitors, and integrated circuits involve the basic components of any electrical circuit.
Resistance Components
Consider a circuit that has resistive components that are subject to aging failure, but plant
operators cannot easily observe this. If the board is in a critical safety protection circuit and
the aging is not detectable using methods 1 and 2, then the process of aging on the specific
circuit board is measurable by methods 3, 4, 5, and 6. At this point it is necessary to decide
whether the importance of the circuit calls for passive or active monitoring. Since the safety
circuit is to be isolated from other circuits, a passive method might be required. This would
point to methods 4 or 5. If the board is in the redundant portion of the circuit, signal
comparison can be an effective method; otherwise method 5 could be applied.
Capacitor Components
Consider an important circuit in the control system that is subject to aging failure of capacitors.
If the failure mode is capacitor shorting, the circuit might exhibit intermittent faulty operation.
This is typically not detectable by circuit functional testing (method 1) or reliability modeling (method
2). This leaves methods 3, 5, and 6. Since loss of capacitance is often difficult to detect, an
active measurement system should be considered. The impact of capacitor degradation usually
changes the shape and timing of pulses sent through the system, or shifts the frequency response
of the circuit. The simple pulse in method 3 could be used in conjunction with specialized
diagnostic interpretation equipment. The ability to measure and predict the failure increases with
methods 3, 5, and 6 because they use increasingly complex active inputs and output measures to
characterize the circuit.
Integrated Circuits
In the case of an integrated circuit, the aging failure modes can be loss of the ability to hold
internal voltage levels due to an increase in leakage current or a loss of capacitance. Computer
diagnostic software for testing computer circuit boards would fall into method 3, where a fixed
set of pulses (1, 0) is sent to each input location and a checksum on the output is then compared
with a standard. Application of this method would require network access to the control circuit
board to monitor the integrated circuit and verify that it holds its charge.
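The pattern-and-checksum test can be sketched as follows. A fixed set of bit patterns is applied to a device, the outputs are reduced to a checksum, and the checksum is compared with a stored standard. The device is simulated here by plain functions, and CRC-32 is used as the checksum only for convenience; the actual diagnostic software, patterns, and checksum algorithm are not specified in this report.

```python
import zlib

# Fixed test patterns: all zeros, all ones, and alternating bits (8-bit words).
TEST_PATTERNS = [0x00, 0xFF, 0xAA, 0x55]


def response_checksum(device, patterns=TEST_PATTERNS):
    """Apply each pattern to `device` (a callable taking and returning an
    8-bit integer) and return the CRC-32 of the concatenated outputs."""
    outputs = bytes(device(p) & 0xFF for p in patterns)
    return zlib.crc32(outputs)


def matches_standard(device, standard, patterns=TEST_PATTERNS):
    """True if the device response checksum equals the stored standard."""
    return response_checksum(device, patterns) == standard


def healthy_inverter(word):
    """Simulated healthy device: inverts all eight bits."""
    return word ^ 0xFF


def leaky_inverter(word):
    """Simulated degraded device: bit 0 can no longer hold a high level."""
    return (word ^ 0xFF) & 0xFE


standard = response_checksum(healthy_inverter)
```

A checksum comparison of this kind gives only a pass or fail result, so identifying which output has degraded would still require inspecting the individual responses.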
6
QUALITATIVE COST BENEFIT OF AGING MONITORING TECHNIQUES
Purpose
The purpose of this section is to summarize what has been learned during this project about the
various monitoring techniques with respect to costs and benefits. The principal benefit is the
potential for using the techniques to measure circuit board aging conditions and to use the
measures to predict the probability of failure before the next set of measures is taken. Because
of the difficulty in predicting failure rates or the MTBF based on measurements, the benefit of a
technique's ability to detect an aging failure mode can, for most components, only be assessed
qualitatively and on a relative basis at this time. The various techniques must be tested and
evaluated in greater detail before the benefits can be quantified.
Likewise, the costs cannot be fully assessed at this time; however, the relative cost (i.e., more
than or less than another technique) can be qualitatively assessed. This has been done for
three aspects of cost: R&D, plant implementation, and operation.
To provide guidance on selecting techniques in which to invest industry R&D,
it is reasonable to express the relative qualitative rankings as figures of merit. These figures of
merit are used to organize the techniques by their cost benefit ranking.
The relative technical benefit of a technique is inferred from how well it can monitor the range
of aging-induced degradation indications identified for circuit boards and components. As shown
in Tables 4-3 to 4-5 in Section 4, a qualitative review is provided, for each component failure
mode, of how well the technique could identify aging degradation before circuit failure occurs.
Four categories were used to classify each monitoring technique: (1) unlikely to detect (n),
(2) unknown (u), (3) likely to detect with no known demonstration (y1), and (4) likely to detect
with known demonstration (y2). A simple formulation was used to calculate a figure of merit to
represent the technical benefit. The formulation is
$$\beta_T = \alpha_1 \sum_i n_i + \alpha_2 \sum_i u_i + \alpha_3 \sum_i \mathrm{y1}_i + \alpha_4 \sum_i \mathrm{y2}_i ,$$
where βT is the figure of merit for technical capability; α1, α2, α3, α4 are the weighting
factors for each review category; the index i runs over the aging degradations addressed; and
n, u, y1 and y2 are defined in Section 4 and shown in Tables 4-3, 4-4 and 4-5. The values of the
weighting factors α1, α2, α3, α4 were selected to be 0, 0.3, 0.8 and 1.0, respectively. These
values are not absolute; several different weighting schemes were tried, and the relative order
of the techniques was nearly the same in each case. This set of weighting factors appears
reasonable based on the following judgments:
• A detection technique that has been demonstrated is weighted as 1.0.
• A detection technique that is believed workable but not demonstrated is weighted less, say
0.8.
• A detection technique whose ability to detect is unknown is weighted less still, say 0.3.
• A detection technique that is believed to be unable to detect change is weighted as 0.
The review data and results for technical capability are provided in Table 6-1.
Table 6-1
Qualitative Review for Detecting Aging Degradation With Technique
#   Detection technique     Σn   Σu   Σy1   Σy2   Figure of merit βT
3   Reliability Modeling    24   0    0     0     0
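The weighted sum behind the technical figure of merit can be sketched in a few lines of Python. This is only an illustration of the formulation above, not the authors' tool: the function and dictionary names are hypothetical, the weights are the values stated in the text (0, 0.3, 0.8, 1.0), the first input is the Reliability Modeling row of Table 6-1, and the second set of counts is invented purely to show a non-zero result.

```python
# Weighting factors stated in the text for the four review categories.
WEIGHTS = {"n": 0.0, "u": 0.3, "y1": 0.8, "y2": 1.0}


def technical_figure_of_merit(counts, weights=WEIGHTS):
    """Return beta_T: the sum over categories of weight times category count.

    `counts` maps each category (n, u, y1, y2) to the number of aging
    degradations that received that rating for one technique.
    """
    return sum(weights[cat] * counts.get(cat, 0) for cat in weights)


# Row 3 of Table 6-1 (Reliability Modeling): all 24 degradations rated "n".
reliability_modeling = {"n": 24, "u": 0, "y1": 0, "y2": 0}

# Hypothetical counts, for illustration only (not taken from the report).
hypothetical = {"n": 10, "u": 4, "y1": 6, "y2": 4}

print(technical_figure_of_merit(reliability_modeling))
print(round(technical_figure_of_merit(hypothetical), 2))
```

Because the weights are passed as a parameter, alternative weighting schemes of the kind the authors describe trying can be compared by calling the function with a different dictionary.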
Relative Cost
The R&D category considers the costs needed to prove the principle of detection, interpretation
of measures, clarity of defining the aging mode, and defining the system elements. In general
these costs can be shared through industry R&D efforts.
The implementation category involves applying the system elements to a specific circuit within
the plant. It is assumed that the system would be permanently installed, although temporary
system set-ups could be used during the initial testing stages. The cost per circuit is expected to
be higher than the R&D costs on a relative basis because of the assumption that the R&D costs
can be shared across the industry.
The operational costs involve the time spent in using the technique, in troubleshooting the issues
revealed, and in repairing the circuits. This would also include costs for spare parts and training.
Based on experience, the cost per circuit is expected to be lower than the R&D costs and much
lower than the implementation costs.
The relative costs for a specific technique are inferred by assuming that a typical circuit is
selected based on its importance and the existing testing uses periodic functional testing. To
upgrade testing to address aging degradation monitoring the relative amount of R&D effort is
judged using qualitative descriptions of low = 1, medium = 2, medium high = 3, high = 4, and
highest = 5. The same qualitative descriptions are used in each cost category. As the systems
become more complex, greater R&D effort is required; however, the complex systems are
expected to reduce the operational cost. In the case of implementation, if the technique can use
existing test circuits, then the cost is low, and if new systems need to be installed then the cost is
high. The result of this qualitative review is shown in Table 6-2.
The figure of merit for cost is simply the sum of the weighted rankings in each area. The
weighting is 1 for operation, 2 for R&D and 3 for implementation. This reflects the above
explanation that the per-circuit cost is judged to be highest for implementation, lower for R&D,
and lowest for operation after implementation is complete. Experience in some I&C upgrade
projects has been a higher-than-expected cost for implementation. Implementation includes
preparation work such as design, documentation, testing, training, installation and
commissioning (e.g., Walling and Michal 2005).
Table 6-2
Qualitative Review of Cost for a Technique to Detect Aging Degradation
#   Detection technique     R&D   Implementation   Operation   Figure of merit for cost
1   Functional testing      1     1                2           7
2   Visual Inspections      1     1                3           8
3   Reliability Modeling    1     3                1           12
4   Resistance testing      2     2                2           12
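The cost figure of merit can be checked against the rows of Table 6-2 with a short Python sketch. The weights (2 for R&D, 3 for implementation, 1 for operation) come from the text. The assignment of the three ranking columns to R&D, implementation and operation, in that order, is an assumption inferred here because it is the only ordering under which the stated weights reproduce all four listed totals. The function and variable names are hypothetical.

```python
# Weights stated in the text: 2 for R&D, 3 for implementation, 1 for operation.
COST_WEIGHTS = {"rd": 2, "implementation": 3, "operation": 1}


def cost_figure_of_merit(rd, implementation, operation, weights=COST_WEIGHTS):
    """Return the weighted sum of the three qualitative cost rankings (1 to 5)."""
    return (weights["rd"] * rd
            + weights["implementation"] * implementation
            + weights["operation"] * operation)


# Rows of Table 6-2: (technique, R&D, implementation, operation, listed total).
# The column order is an inferred assumption, as noted in the lead-in.
TABLE_6_2 = [
    ("Functional testing", 1, 1, 2, 7),
    ("Visual Inspections", 1, 1, 3, 8),
    ("Reliability Modeling", 1, 3, 1, 12),
    ("Resistance testing", 2, 2, 2, 12),
]

for name, rd, impl, op, listed in TABLE_6_2:
    computed = cost_figure_of_merit(rd, impl, op)
    status = "matches" if computed == listed else "differs from"
    print(f"{name}: computed {computed} {status} listed {listed}")
```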
Table 6-2 summarizes the preliminary qualitative assessments that have been made given the
current knowledge available to the authors. Specific assumptions had to be made about each
technique. For example:
• It was assumed that if a technique was selected, it would be pursued with R&D, implemented
and operated on its own.
• Sharing of R&D results between techniques was not assumed. However, it is likely that
incremental R&D would build improvements from the basic concept with the simplest
system to advanced systems with more features. Therefore, actual R&D costs for
advanced techniques could be reduced by starting with simpler techniques and then
expanding the technology to more advanced features.
• It is expected that there is considerable uncertainty in the estimates, because engineering
evaluations of where and how each technique could be implemented have not yet been
performed. This type of assessment would need to be performed before a more quantitative
cost assessment can even be performed. Therefore, at this point the authors have only
developed costs as a figure of merit.
• It was assumed that each technique can provide a valuable improvement over what is done
now.
• It is assumed that utilities performing classification of circuits, relative to their importance to
plant operation and safety, will need to pick one or more of the techniques for testing and
monitoring for aging and random failures.
• It was assumed that the implementation cost is a one time cost, if the technique results in a
fixed system.
• It was assumed that the operations costs are on-going, and if the technique is not a fixed
system, then the setup becomes part of the on-going operations costs since the system
elements of the technique must be re-established each time it is used.
From Table 6-2 it can be seen that there are techniques where higher R&D costs can be offset
with potentially lower operating costs. However, before the costs of implementing any of the
techniques can be justified, proof of the concept and suitability from a regulatory viewpoint
needs to be demonstrated. Therefore, it seems premature to attempt to provide any more
quantitative breakdown of the costs of each technique at this time. As more information
becomes available it is clear that cost benefit evaluations will need to be performed for each of
the cost categories. For example, at some point the R&D cost and the benefit will need to be
evaluated for each of the techniques. This would permit the allocation of R&D assets to those
techniques that appear to offer the best payback for R&D expenditures. Similar cost benefit
evaluations will also need to be performed for implementation costs and for operational costs. It
is likely that the industry might agree to proceed with R&D expenditures for several techniques
since not all utilities would necessarily agree on the same technique for implementation. This
could be followed by cost sharing on implementation issues by smaller utility groups that favor
certain techniques.
Furthermore, once the initial engineering evaluations of each technique are done specific cost
benefit quantifications should consider one time implementation costs for fixed systems versus
implementation for portable systems.
Finally, a cost benefit analysis for operational aspects can then be provided once test systems
have been demonstrated and are being used.
The initial qualitative costs depicted in Table 6-2 are only used to estimate what might be
expected for each technique in the areas of R&D, implementation and operation.
The objective of this comparison of relative technical performance with the relative cost is to
provide information to guide the allocation of limited R&D funds to those systems and
techniques with the potential to best address the objective of monitoring aging in circuit boards
and using the results to assess the probability of failure before the next test or inspection cycle.
The purpose of this initial qualitative cost benefit exercise is to provide initial thinking to support
utilities in making decisions as to which techniques might offer better solutions based on the
knowledge that is available at this time.
Figure 6-1 provides the figure of merit for cost versus the technical figure of merit. The
following interesting points are revealed about each technique.
Reliability Modeling
Reliability modeling by itself does not detect circuit specific aging degradation; however, the
models might prove important in interpreting the results of trending measures in other
techniques.
Visual Inspections
Visual inspections are somewhat costly for a relatively low potential for being able to monitor
degradation. While visual inspections can identify some types of degradation, trending on
subjective measures is very difficult.
In the case of environmental parameter testing the R&D costs are high because indirect measures
are used to infer aging degradation. This will require significant verification and development of
double correlations to trend for aging failure modes. Furthermore, only a few of the component
failure modes would be detectable. Its best advantage is that it has no interaction with a circuit,
and is thus a good candidate for monitoring safety circuits, which are typically isolated from
monitoring systems.
Functional Testing
Functional testing is the primary existing method for verifying that circuits are operational
during periodic tests. It is possible to expand the measures during functional testing to better
monitor the aging degradation; however the technical and cost evaluations here do not consider
use of the more advanced monitoring methods. The existing test circuits can be applied to the
advanced approaches described below.
Leakage current and resistance testing have essentially the same ranking. Both would require
some R&D and implementation to go beyond periodic tests using temporary setups in order to
become a fixed system that would provide better trending results. A weakness in these
techniques is that the number of aging degradation issues that can be addressed is limited and
almost no capacitor failures can be detected.
The passive parameter monitoring technique appears in the high range of the cost benefit review.
It has a good potential for monitoring of safety circuits, but still is limited to the signals from the
circuit or circuit board. The R&D for this technique could focus on software that can detect and
store measures for trending. This technique could benefit from the R&D supporting active
measurement methods.
Signal change testing is the first technique with a fairly broad ability to detect a wide range of
degradations. The R&D in this area would need to address sending a pulse signal through
specific circuits to measure the output. The aging trends could use changes in the pulse shape,
reductions (or increase) in the voltage level as indicators of aging. R&D using this technique
would also apply to the next techniques. Detection of capacitor aging is expected to be difficult
using this technique, whereas the next four techniques have a good chance of detecting capacitor
aging degradation.
Signal comparison techniques have been applied to redundant sensors and have detected both
circuit and process degradations. The limitation is that redundant signals are required, and thus
this technique applies only to protection circuits, and only to a portion of the circuit. Other
techniques are needed for non redundant control circuits.
The signature analysis technique is an extension of signal change testing. In this case the R&D
cost is greater because of the need to incorporate hardware that can separate frequency ranges in
the input and output of the measurement system and this most likely includes software. The
technical coverage is improved because this technique can detect capacitor aging degradation.
However, pinpointing the degraded component location within the circuit requires extensive
signature libraries and the ability to discriminate between changes to internal circuits and
changes in the external process.
Both pattern recognition and frequency analysis techniques build upon the R&D for signal
change testing and signature analysis techniques. The technical improvement is that a full
spectrum output is obtained. In the case of pattern recognition, the spectrum is not controlled by
the input function and, therefore, does not easily represent the circuit transfer function. The
frequency analysis technique provides a uniform active signal to represent uniform voltage input
over the entire frequency range. In this case a true transfer function can be obtained by
inspection. Then changes to the transfer function can be matched to specific circuit degradation
modes. This permits identification of the aging component through changes in break points and
shifts in phase angle in the overall circuit. These techniques offer the greatest ability to detect
capacitor failures.
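To make the idea of break-point and phase shifts concrete, the sketch below uses a first-order RC low-pass filter as a stand-in for an equivalent circuit. This is an illustrative assumption, not a circuit taken from the report: the resistance, the capacitance and the 20% capacitance loss attributed to aging are all hypothetical values. The break frequency is f = 1/(2πRC) and the phase of the transfer function 1/(1 + jωRC) is −arctan(ωRC), so a loss of capacitance raises the break frequency and reduces the phase lag measured at a fixed test frequency.

```python
import math


def break_frequency_hz(resistance_ohm, capacitance_f):
    """Break (corner) frequency of a first-order RC low-pass filter."""
    return 1.0 / (2.0 * math.pi * resistance_ohm * capacitance_f)


def phase_deg(freq_hz, resistance_ohm, capacitance_f):
    """Phase angle of H(jw) = 1 / (1 + j*w*R*C) at the given frequency."""
    omega_rc = 2.0 * math.pi * freq_hz * resistance_ohm * capacitance_f
    return -math.degrees(math.atan(omega_rc))


# Hypothetical component values, for illustration only.
R = 10_000.0            # ohms
C_NEW = 1.0e-6          # farads, as-installed capacitor
C_AGED = 0.8 * C_NEW    # assumed 20% capacitance loss from aging

f_new = break_frequency_hz(R, C_NEW)
f_aged = break_frequency_hz(R, C_AGED)

print(f"break frequency, new:  {f_new:.2f} Hz")
print(f"break frequency, aged: {f_aged:.2f} Hz")
print(f"phase at {f_new:.2f} Hz, new:  {phase_deg(f_new, R, C_NEW):.1f} deg")
print(f"phase at {f_new:.2f} Hz, aged: {phase_deg(f_new, R, C_AGED):.1f} deg")
```

In a trending application, the baseline break frequency and phase would be recorded when the board is known to be healthy, and later measurements would be compared against that baseline.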
Figure 6-1 provides a plot of the relative cost versus benefit for each technique as provided in
Tables 5-4 and 6-1 above.
[Figure 6-1 plot: "Relative cost versus benefit per circuit for indicated technique." Vertical
axis: relative cost for R&D, implementation and operation (scale 0 to 25). Horizontal axis:
relative technical benefit of techniques for identifying aging degradation before circuit failure
(scale 0 to 20). Techniques plotted: functional testing, visual inspections, reliability modeling,
leakage current testing, resistance testing, signal change testing, environmental parameter
testing, passive parameter monitoring, signal comparison measures, signature analysis, pattern
recognition, and frequency analysis.]
Figure 6-1
Comparison of Relative Cost Benefit Inferred for Each Technique as Applied to a Selected
Circuit
7
FINDINGS
Summary
This report develops a framework for determining if the existing method for circuit testing
should be upgraded so that the circuit can be monitored for aging degradation. Moreover, the
monitoring results from the improved techniques can be used for trending and evaluation of the
failure probability over the next test interval. This would permit replacement of a circuit board
on the basis of its specific condition rather than a statistically estimated replacement frequency.
The elements of the framework include theories relating the aging of components on circuit
boards to observed failure modes.
• The aging mechanisms in circuit board components can be observed both externally and
internally from circuit changes.
• Most aging mechanisms after the “burn-in period” proceed slowly; therefore, trends of aging
can be observed before the circuit becomes fully open or shorted. Thus, if the aging effect
can be measured and related to a failure condition, then trending the measures of aging
mechanisms provides a basis for quantifying the probability of failure over a time interval.
• If a method or technique for monitoring circuit behavior can be defined and implemented,
then utilities can make the decision to repair or replace specific circuit boards on the basis of
objective measures rather than replacing the circuit board after failure or on a scheduled
frequency.
The framework includes the use of reliability modeling to estimate circuit board MTBFs during
the operational period.
• The reliability of a circuit is a strong function of its environment and operating temperature.
Thus, when developing reliability-based estimates of MTBF, precise operating conditions of
the circuit are needed, but such measurements are typically not available.
The framework provides categories of methods and techniques that have the potential to detect
aging mechanisms before circuit failure.
• Within six broad methods, twelve techniques are defined to describe the technical advantages
and disadvantages for monitoring aging failure modes of the components on the circuit
board.
• The design of potential active and passive measurement systems can be simplified by
permitting the monitoring of each board circuit to be treated as an equivalent circuit with
measurable electronic parameters such as voltage, impedance, resistance, current, and ground
resistance. Changes in these parameters become precursor indications of degradation that
could lead to a complete failure.
The framework provides a decision process for selecting candidate circuits for upgrade from
functional testing to aging monitoring.
• A list of five decision elements and a decision process are provided for considering circuits
that are candidates for upgraded monitoring.
• The decision elements link the assignment of circuit importance developed by utilities to the
selection of methods for upgrading to aging monitoring.
The framework includes a qualitative process for inferring and comparing the cost benefit of
each monitoring technique.
• The techniques are compared from a relative technical merit of being able to detect a wider
range of the observed aging failure modes, and a relative cost for using the technique.
• Twelve techniques were defined in enough detail to provide relative rankings for cost and
technical capability to monitor different aging mechanisms.
• Features of some of the simple techniques can act as building blocks for more complex
techniques.
The following recommendations are provided to help improve the monitoring of circuits to
detect aging mechanisms before a circuit failure occurs.
Recommendations
1. Utilities should continue to systematically categorize the plant electronic circuits by
importance following a systematic process such as described in the maintenance rule (NRC
2000) or using AP-913 (INPO 2001). This provides the basis for utilities to select circuits for
implementation of improved methods for monitoring the degradation of aging mechanisms in
components within the circuits.
2. Initiate R&D programs to develop, test and demonstrate improved monitoring techniques, as
described in this report, for potential application to circuits that are considered to be
important. Utilities are already reviewing plant systems and circuits to establish their
importance. Therefore, R&D programs for improved monitoring methods need to be started
immediately in order for proven techniques to be available for implementation on systems
and circuits that are identified to be important. Based on the information in the previous
sections it is recommended that two R&D approaches would be appropriate at this time, one
for safety related protection circuits and another for control circuits.
a. For protective circuits conduct further R&D that builds upon the existing functional
testing using passive measures of the circuit conditions such as signal comparison if
redundancy exists, measurement of leakage current or resistance.
b. For control circuits conduct further R&D that includes the use of active measures
beginning with simple pulses that can be used for signal comparison and grows toward
more complex methods such as signature analysis, pattern recognition and frequency
analysis. In this way the total R&D effort can build upon the success in the previous
methods. This will significantly reduce the cost of R&D for the complex methods.
3. Initiate R&D to establish correlations between the aging measurement trends for selected
monitoring techniques and the likelihood of circuit failure due to aging effects. The biggest
technical difference between the techniques in Section 4 is the ability to precisely detect
aging trends. The ability to develop a mathematical relationship between the measured
parameters associated with the technique and the aging trend in the circuits is vital for
demonstrating the technique’s technical success as a predictive method. The ability to
develop correlations of this type for various failure modes should be part of the engineering
evaluation of any technique. The use of reliability models, the Arrhenius equation concept,
and physics of failure models is expected to provide a starting point for further R&D in this
area.
8
REFERENCES
Bhakta, S.D., S. Lundberg, G. Mortensen, 2002, “Accelerated tests to simulate metal migration
in hybrid circuits,” 2002 Proceedings Annual Reliability and Maintainability Symposium,
0-7803-7848-0/02 IEEE, IEEE Catalog Number: 02CH37318, pg. 319 to 324, Seattle Washington,
January 2002.
DiSandro, R. and R. Torok, 2004, “Managing I&C Reliability and Obsolescence,” American
Nuclear Society’s 4th International Topical Meeting on Nuclear Plant Instrumentation, Control and
Man-Machine Interface Technology, Columbus Ohio, September 19-22, 2004.
EPRI, 2002, “Collected Field Data on Electronic Part Failures and Aging in Nuclear Power Plant
I&C Electronic Boards and Systems;” EPRI, Palo Alto, CA, EDF R&D, France: 2002. TR-
1003568.
EPRI, 2003, “Printed Circuit Board Maintenance, Repair, and Testing Guide,” EPRI, Palo Alto,
CA: 2003. TR-1007916.
EPRI, 2004, “Guidelines for the Monitoring of Aging of I&C Electronic Boards and
Components,” EPRI, Palo Alto, CA, EDF R&D, France: 2004. TR-1008166.
Fink, D. C. and H. W. Beaty, 1993, “Standard Handbook for Electrical Engineers,” 13th edition
McGraw Hill, Inc., New York, 1993.
Fukunaga, K, 1990, “Statistical Pattern Recognition,” Second Edition, Academic Press, Inc. San
Diego, CA 1990
IAEA, 2000, “Management of ageing of I&C equipment in nuclear power plants,” TECDOC-
1147, Vienna, Austria 2000.
INPO, 2001, “Equipment Reliability Process Description,” AP-913 (Revision 1). Institute of
Nuclear Power Operations, Atlanta, GA November 2001.
Kim, H., H-J. Jeon, K. Lee, H. Lee, 2002, “The design and evaluation of all voting triple
modular redundancy system,” 2002 Proceedings Annual Reliability and Maintainability
Symposium, 0-7803-7848-0/02 IEEE, pg 439 to 444, Seattle Washington, January 2002.
Kumamoto, H. and E. J. Henley, 1996, “Probabilistic Risk Assessment for Engineers and
Scientists” Second Edition, IEEE press Institute of Electrical and Electronics Engineers, Inc.,
New York, 1996.
Lamarsh, J.R., 1966, “Introduction to Nuclear Reactor Theory,” Addison-Wesley Publishing
Company, Inc. Reading, MA, 1966.
Loman, J., A. Arrao, R. Wyrick, 2003, “Long term aging of electronics systems &
maintainability strategy for critical applications,” Reliability and Maintainability Symposium,
0-7803-7717-6/03 pg. 328-331, January 2003.
NEI, 1996, “Industry Guideline for Monitoring the Effectiveness of Maintenance at Nuclear
Power Plants,” NUMARC 93-01 (Revision 2). Nuclear Energy Institute, Washington, D.C. April
1996.
NEI, 2001a, “Industry Guideline for Implementing the Requirements of 10 CFR Part 54 the
License Renewal Rule,” NEI 95-10 (Revision 3). Nuclear Energy Institute, Washington, D.C.
2001.
NRC, 1993, Potential Deficiency of Certain Class 1E Instrumentation and Control Cables NRC
Information Notice 93-33: US Nuclear Regulatory Commission Office of Nuclear Reactor
Regulation Washington, D.C. April 1993.
NRC, 2000, “Maintenance Rule,” NRC Inspection Manual 62706. NRC, Washington, D.C.
December 2000.
NRC, 2001, “Generic Aging Lessons Learned (GALL) Report,” NUREG-1801 v1 and v2
USNRC Office of Nuclear Reactor Regulation Washington, D.C. 2001.
Riddle, J. B., 2004. “Semiconductor wear out at nuclear power plants,” San Onofre Nuclear
Generating Station Private Communication, 2004.
Uhrig, R. E. 1970, “Random Noise Techniques in Nuclear Reactor Systems,” Ronald Press
Company, New York, 1970.
Valentin, R., M. Osterman, B. Newman, 2003, “Remaining life assessment of aging electronics
in avionic applications,” Reliability and Maintainability Symposium, 0-7803-7717-6/03 pg.
313- 318, January 2003.
Walling Dale and Rick Michal, 2005, “Going Digital at Comanche Peak,” Instrumentation and
Controls Special Section, Nuclear News page 33, February 2005.
Wowk, Victor, 1991, “Machinery Vibration: Measurement and Analysis,” McGraw Hill, Inc.
1991.
A
APPLICATION OF AGING THEORY
For the last three decades, MIL-HDBK-217 has been widely used to predict product reliability.
In many cases the prediction model results are inaccurate when compared with subsequently
measured mean time between failures in the field. One reason for the difference is that the
aging theory used in MIL-HDBK-217 and similar approaches typically assumes a single
long-term aging process that is related to a reaction rate. The aging process model is based on
the original Arrhenius model, which is based on reaction rates in chemical mixtures. This model
has been extended to the breakdown of insulation in electrical systems as the number of
discharges increases over time. A temperature impact model has also been developed to address the increase
in insulation breakdown at elevated temperature. The temperature model has been used to
support accelerated reliability testing.
Purpose of MIL-HDBK-217
This military standard is used to estimate the inherent reliability of electronic equipment and
systems, based on component failure data. It consists of two basic prediction methods:
• Parts-Count Analysis—Requires relatively little information about the system and primarily
uses the number of parts in each category with consideration of part quality and
environments encountered. Generally, the method is applied in the early design phase, where
the detailed circuit design is unknown, to obtain a preliminary estimate of system reliability.
• Part-Stress Prediction—Updates the initial estimate with specific models for stress-analysis,
environmental conditions, quality applications, maximum ratings, complexity, temperature,
construction, and a number of other application-related factors. This method tends to be used
near the end of the design cycle, after the actual circuit design has been defined.
The general failure rate model for a part in MIL-HDBK-217 is of the form:
$$\lambda_p = \lambda_b \cdot \pi_Q \cdot \pi_E \cdot \pi_A \cdots$$
Where: λb = the base failure rate, which is described by the Arrhenius equation, and
πQ, πE, πA, … = factors related to component quality, environment, and application stress
The Arrhenius equation illustrates the relationship between insulation breakdown rate and
temperature for components. It can be derived from the observed dependence of chemical
reactions on temperature changes.
$$R(t) = A e^{-E/(\kappa T)}$$
The activation energy, E, can vary for each component; examples of activation energies are shown
in Table A-1. These are first order activation energies for failure mechanisms
applicable to microcircuits (Livingston 2000).
Table A-1
First Order Activation Energies
General Failure Mechanism Class    E, Activation Energy, Typical (eV)
Surface/Oxide                      1
Dielectric Breakdown
Metallization
Wafer fabrication
Chemical contamination             1
MIL-HDBK-217F provides measured data for each part type, such as microcircuits, transistors,
resistors, and connectors. The failure rates of components that are determined by accelerated
testing using a high temperature can be converted to an operational condition at a lower
temperature, by using the Arrhenius equation above to adjust the failure rate to a smaller value.
The Arrhenius relationship can be used to solve for the time to failure at one temperature, given
that the time to failure at another, measured temperature condition is known.
$$\ln\left(\frac{t_R}{t_i}\right) = \frac{E}{\kappa}\left[\left(\frac{1}{T_R}\right) - \left(\frac{1}{T_i}\right)\right]$$
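A short Python sketch shows how this relationship might be applied to convert an accelerated-test result to an operating temperature. Several things here are assumptions rather than statements from the report: κ is taken to be Boltzmann's constant in eV/K, temperatures are absolute (kelvin), the subscript R is read as the reference (operating) condition and the subscript i as the measured (test) condition, and the activation energy, temperatures and test time in the example are hypothetical. The function names are likewise illustrative.

```python
import math

# Assumption: kappa in the equation is Boltzmann's constant, in eV per kelvin.
BOLTZMANN_EV_PER_K = 8.617333262e-5


def celsius_to_kelvin(temp_c):
    """Convert a Celsius temperature to kelvin."""
    return temp_c + 273.15


def time_to_failure_ratio(activation_energy_ev, temp_ref_k, temp_test_k):
    """Return t_R / t_i = exp((E / kappa) * (1/T_R - 1/T_i))."""
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / temp_ref_k - 1.0 / temp_test_k
    )
    return math.exp(exponent)


# Hypothetical accelerated test: time to failure measured at 125 C,
# converted to an assumed 50 C operating temperature with E = 1 eV.
t_test_hours = 1000.0
ratio = time_to_failure_ratio(
    1.0, celsius_to_kelvin(50.0), celsius_to_kelvin(125.0)
)
print(f"estimated time to failure at 50 C: {t_test_hours * ratio:.3g} h")

# Effect of a 10 C reduction in operating temperature (60 C to 50 C), E = 1 eV.
ten_degree_ratio = time_to_failure_ratio(
    1.0, celsius_to_kelvin(50.0), celsius_to_kelvin(60.0)
)
print(f"life ratio for a 10 C reduction: {ten_degree_ratio:.2f}")
```

The result is sensitive to the assumed activation energy, so the ratio for a particular component would need the value appropriate to its failure mechanism.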
The prediction techniques described in MIL-HDBK-217 for estimating system reliability are
based on the Arrhenius equation, an exponentially temperature-dependent expression, which is a
good predictor of aging of components. It does not address failure modes caused by unique shocks
or environmental challenges outside the bounds of the aging model assumptions. For example,
mechanical vibration and shock, humidity, and power on/off cycling are all independent of
temperature. These are observed causes of failure. Even some temperature-related stresses, such
as temperature cycling and thermal shock, would cause failures that do not follow the Arrhenius
equation.
This application has been used to develop a rule of thumb for electrical insulation: if you can
reduce the operating temperature by 10 °C, the aging mean time between failures is increased
significantly.
References
1. Reliability Prediction of Electronic Equipment, MIL-HDBK-217F 1995, Revision F,
December 1991, Notice 1, 10 July 1992, Notice 2, 28 February 1995.
2. Jones, J. and J. Hayes, “A Comparison of Electronic-Reliability Prediction Models,” IEEE
Transactions on Reliability, Vol. 48, No. 2, June 1999, pp. 127-134.
3. Livingston, Henry, “SSB-1: Guideline for using Plastic Encapsulated Microcircuits and
Semiconductors in Military, Aerospace and Other Rugged Applications” Proceedings of
“DMSMS 2000”, August 24, 2000.
B
DEFINITIONS OF COMPONENTS
This appendix describes the components found on typical circuit boards used in electronic
systems. The definitions will help the reader understand the ways that systems can be
interrogated to monitor aging issues.
Capacitor – a device for storing an electrical charge. In its simplest form a capacitor consists of
two metal plates separated by a non-conducting layer called the dielectric. When one plate is
charged with electricity from a direct current or electrostatic source, the other plate will have
induced in it a charge of the opposite sign.
Circuit Board – a flat piece of nonconductive material on which computer microprocessors and
other electric components are mounted and electrically connected by thin strips of metal.
Diode – an electronic device that allows the passage of current in only one direction. The diodes
most commonly used in electronic circuits today are semiconductor diodes.
Dynamic RAM (DRAM) – a form of semiconductor random access memory (RAM). Dynamic
RAMs store information in integrated circuits that contain capacitors. Because capacitors lose
their charge over time, dynamic RAM boards must include logic to “refresh” (recharge) the
RAM chips continuously. While a dynamic RAM is being refreshed, the processor cannot read
it; if the processor must read the RAM while it is being refreshed, one or more wait states occur.
Because their internal circuitry is simple, dynamic RAMs are more commonly used than static
RAMs, even though they are slower. A dynamic RAM can hold approximately four times as
much data as a static RAM chip of the same complexity.
Integrated Circuit – a tiny electronic circuit used to perform a specific electronic function, such
as amplification; it is usually combined with other components to form a more complex system.
It is formed as a single unit by diffusing impurities into single-crystal silicon, which then serves
as a semiconductor material, or by etching the silicon by means of electron beams. Several
hundred identical integrated circuits (ICs) are made at a time on a thin wafer several centimeters
wide, and the wafer is subsequently sliced into individual ICs called chips.
Light-emitting diodes (LEDs) – semiconductor diodes in which a voltage applied to the junction
results in the emission of light energy. LEDs are used in numerical displays such as those on electronic digital
watches and pocket calculators.
Oscillators – devices that generally consist of an amplifier and some type of feedback. The
output signal is fed back to the input of the amplifier. The frequency-determining elements may
be a tuned inductance-capacitance circuit or a vibrating crystal. Crystal-controlled oscillators
offer the highest precision and stability.
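For the tuned inductance-capacitance case mentioned above, the oscillation frequency is set by the standard resonance relation for an ideal LC circuit. The symbols below (L for inductance, C for capacitance, f_0 for the resonant frequency) and the example values are illustrative assumptions, not taken from this report; a real oscillator will deviate somewhat because of losses and loading.

```latex
% Resonant frequency of an ideal LC tank circuit (assumes negligible resistance)
f_0 = \frac{1}{2\pi\sqrt{LC}}

% Illustrative values only: L = 1\,\mathrm{mH},\ C = 1\,\mu\mathrm{F}
f_0 = \frac{1}{2\pi\sqrt{(10^{-3})(10^{-6})}} \approx 5.03\ \mathrm{kHz}
```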
Potentiometers – resistors with adjustable resistance. These types of resistors are used in
applications where the current needs to be adjusted or the resistance needs to be varied, as
with lights that dim or adjustable generators.
Printed Circuit Board (PCB) – a flat board made of non-conducting material, such as plastic or
fiberglass, on which chips and other electronic components are mounted, usually in predrilled
holes designed to hold them. Predefined conductive metal pathways printed on the
surface of the board electrically connect the holes that hold the components.
The metal leads protruding from the electronic components are soldered to the conductive
metal pathways to form a connection. A printed circuit board should be held by the edges and
protected from dirt and static electricity to avoid damage.
PROM – programmable read-only memory. A type of read-only memory (ROM) that allows data to be written
into the device with hardware called a PROM programmer. After a PROM has been
programmed, it is dedicated to that data, and it cannot be reprogrammed. Because ROMs are
cost-effective only when produced in large volumes, PROMs are used during the prototyping
stage of the design. New PROMs can be created and discarded as needed until the design is
perfected.
RAM – random access memory. RAM is semiconductor-based memory that can be read and
written by a microprocessor or other hardware device. The storage locations can be accessed in
any order. Note that the various types of ROM are also capable of random access, but the term
RAM generally refers to memory that can be written as well as read.
Resistor – component of an electric circuit that resists the flow of direct or alternating electric
current. Resistors can limit or divide the current, reduce the voltage, protect an electric circuit, or
provide large amounts of heat or light.
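The current-limiting and voltage-dividing roles in the resistor definition follow from Ohm's law. The following is a minimal Python sketch, assuming two ideal resistors in series across a DC source with no load on the output; the function names and component values are hypothetical and do not come from this report.

```python
def series_current(v_in, r_top, r_bottom):
    """Current (A) through two ideal series resistors, by Ohm's law I = V / R."""
    total = r_top + r_bottom
    if r_top < 0 or r_bottom < 0 or total == 0:
        raise ValueError("resistances must be non-negative and not both zero")
    return v_in / total


def divider_output(v_in, r_top, r_bottom):
    """Voltage (V) across r_bottom in an unloaded two-resistor divider."""
    # The same current flows through both resistors, so the output is I * r_bottom.
    return series_current(v_in, r_top, r_bottom) * r_bottom


# Illustrative values: 10 V source, two 1 kilohm resistors.
current = series_current(10.0, 1000.0, 1000.0)
v_out = divider_output(10.0, 1000.0, 1000.0)
```

With equal resistors the source voltage is split in half, and doubling the total resistance halves the current, which is the sense in which a resistor "limits" current.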
SCSI – small computer system interface. A SCSI interface is used to connect microcomputers to
peripheral devices and to other computers and local area networks. Up to seven devices, such as
printers or control devices, not including the computer, can be attached through a single SCSI
connection (port) by connecting them sequentially. Each device has an address (priority number).
Only one device at a time can transmit through the port; priority is given to the device with the
highest address. A SCSI port is standard on higher-end computers. A SCSI interface can be installed
in IBM PC and compatible computers as an expansion board. A committee of the American
National Standards Institute (ANSI) defines the standards for this high-speed parallel interface.
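The priority rule in the SCSI definition (only one device transmits at a time, and the highest address wins) can be sketched in a few lines of Python. This is a simplified illustration, not the actual bus arbitration protocol: it assumes eight addresses numbered 0 through 7 (seven devices plus the computer), and the function name is hypothetical.

```python
MAX_ADDRESS = 7  # assumption: addresses 0..7, i.e. seven devices plus the computer


def arbitrate(requesting_addresses):
    """Return the address allowed to transmit, or None if nobody is requesting."""
    addresses = list(requesting_addresses)
    for address in addresses:
        if not 0 <= address <= MAX_ADDRESS:
            raise ValueError(f"address {address} is outside 0..{MAX_ADDRESS}")
    if not addresses:
        return None
    # Priority is given to the device with the highest address.
    return max(addresses)


winner = arbitrate([2, 5, 3])  # devices 2, 5, and 3 request the port together
```

Here device 5 would be granted the port, and devices 2 and 3 would have to wait and request again.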
Static RAM (SRAM) – a form of semiconductor memory (RAM). Static RAM storage is based
on a logic circuit known as a flip-flop, which retains the information stored in it as long as there
is enough power to run the device. A static RAM chip can store only about one-fourth as much
data as a dynamic RAM chip of the same complexity, but static RAM does not require refreshing
and is usually much faster than dynamic RAM. It is also more expensive. Static RAMs are
usually reserved for use in caches.
Switching and timing circuits – these logic circuits form the heart of any device where signals
must be selected or combined in a controlled manner. Applications of these circuits include
telephone switching, satellite transmissions, and digital computer operations.
Transformer – an electrical device usually consisting of two adjacent coils of wire wound
around a single core of magnetic material. A transformer is used to couple two or more AC
circuits by employing induction between the coils.
Transistor – a common name for a group of electronic devices based on n-p-n or p-n-p junctions
used as amplifiers or oscillators in communications, control, and computer systems. Groups of
transistors can be organized into integrated circuits.
Export Control Restrictions
Access to and use of EPRI Intellectual Property is granted
with the specific understanding and requirement that
responsibility for ensuring full compliance with all applicable
U.S. and foreign export laws and regulations is being
undertaken by you and your company. This includes an
obligation to ensure that any individual receiving access
hereunder who is not a U.S. citizen or permanent U.S.
resident is permitted access under applicable U.S. and foreign
export laws and regulations. In the event you are uncertain
whether you or your company may lawfully obtain access to
this EPRI Intellectual Property, you acknowledge that it is your
obligation to consult with your company’s legal counsel to
determine whether this access is lawful. Although EPRI may
make available on a case-by-case basis an informal assessment
of the applicable U.S. export classification for specific EPRI
Intellectual Property, you and your company acknowledge
that this assessment is solely for informational purposes and
not for reliance purposes. You and your company
acknowledge that it is still the obligation of you and your
company to make your own assessment of the applicable
U.S. export classification and ensure compliance accordingly.
You and your company understand and acknowledge your
obligations to make a prompt report to EPRI and the
appropriate authorities regarding any access to or use of
EPRI Intellectual Property hereunder that may be in violation
of applicable U.S. or foreign export laws or regulations.