
Track 1 - Cesar Malpica - Paper


SAFETY AND RELIABILITY: HOW CAN THE SAFETY AND RCM ANALYSES WORK TOGETHER?

César Malpica, CMRP, Staff Reliability Engineer, Chevron Energy Technology Company
Oswaldo Moreno, TUV, SIS Engineer, Chevron Energy Technology Company
Ernesto Primera, CMRP, Reliability Engineer, Chevron Latino America Business Unit

Introduction

Reliability Centered Maintenance (RCM) has been used in several industries (e.g. aeronautical,
nuclear, oil and gas) for a very long time, with documented success in providing facilities with
the most cost-effective asset strategies. Although there have been many cases in which RCM did
not fulfill the customer’s expectations (the customer is also referred to as the “user”) for
effectiveness and cost reduction, it still meets the definition of a best practice: a process that,
if repeated, will consistently provide the desired outcome. RCM has reached a high level of
maturity and has already gone through a learning curve that made its implementation either a
well-known common practice or a practice to avoid because of some very unsuccessful projects.
However, RCM has been adopted by the majority of users (e.g. airlines, nuclear power plants,
petrochemical complexes and O&G facilities), and it is fair to say that it is a best practice, even
if it is no longer a differentiator.
RCM implementation requires that many parties interact and address a broad range of asset types
and organizational processes. RCM is well suited to managing the collective knowledge of the
asset team through expert elicitation and performance review. The more successful you are at
identifying and mitigating RCM failure modes, the more successful you will be at managing a
safe, reliable and profitable asset.
In this paper, the authors share a modern approach to performing RCM in an integrated manner
with other industry standards and best practices related to safety, such as process hazard
analysis. The silo-based organization has the opportunity to promote a safer and more efficient
facility if reliability and safety-related activities are planned and performed in such a manner
that they are integrated with, and supported by, each other as much as possible.
Failure to implement reliability and safety programs (e.g. asset integrity) can lead to
very critical accidents like the ones registered in the Gulf of Mexico and other locations in
past years; these accidents damaged assets, injured people and heavily impacted the
environment. They happened due to a lack of implementation of reliability and asset integrity
programs. This “must” be avoided. It “can” be avoided. An “incident-free” operation is achievable.
Process Hazard Analysis (PHA) provides the engineering team with the identification of
critical hazard scenarios and shows whether or not each one has enough safeguards in place to
either mitigate or avoid its occurrence.
The Safety Integrity Level (SIL) analysis starts from the process hazard identification and the
validation of the safeguards to recommend what should be done in design, testing and

C. Malpica & O. Moreno & E. Primera Page 1 of 10


Chevron – October 2014
maintenance to credit such safeguards as one of the required “Independent Protection Layers”
for the analyzed hazard scenario.
While RCM has become one of the most recognized “Reliability Best Practices”, endorsed
by SAE JA1011 and ASTM, the PHA/SIL analysis has been endorsed by IEC61508/IEC61511,
API14C and other organizations as a “must-do” in process plants. Both RCM and
PHA/SIL are founded on the concept of becoming part of the life cycle of the assets (“living
program”).
Definitions
The following definitions will help as we begin our understanding of RCM:

Reliability-Centered Maintenance (RCM): A systematic process used to determine what must be
done to ensure that any physical asset continues to fulfill its intended functions in its present
operating context.
Failure Mode and Effects Analysis (FMEA): A design evaluation procedure used to identify
failure modes and to determine the effect of each on system performance; a procedure in which
each potential failure mode in every sub-item of an item is analyzed to determine its effect on
other sub-items and on the required function of the item.
Predictive Maintenance (PdM): The use of modern measurement and signal processing
techniques to accurately diagnose the condition of equipment (level of deterioration) during
operation. PdM is accomplished by the periodic measurement and trending of process or
machine parameters with the aim of predicting failures before they occur. The objective is to
predict or anticipate when maintenance is required through condition monitoring of equipment.
Examples are vibration monitoring, lubricant analysis and leak detection.
Preventive Maintenance (PM): Actions performed in an attempt to retain an item in specified
condition by providing systematic inspection and detection. The goal is to control potential
failures in their early stage and includes basic housekeeping, periodic/systematic inspection,
detection and daily routine maintenance (e.g., adjustments, replacements) to prevent
deterioration.
Failure-Finding Task (FF): A failure-finding task is defined as a scheduled task used to
determine whether or not an item is able to fulfill its intended function when demanded. It is
solely intended to reveal if a specific hidden failure has occurred. Failure-finding tasks usually
apply to protective devices that fail without notice.
Corrective Maintenance (CM): Unplanned maintenance tasks performed to restore the functional
capabilities of failed or malfunctioning equipment.
One-Time Task: It may be desirable to reduce the risk of failure by recommending redesign of
the asset or to propose modification of existing equipment, operating procedures, or spare supply
strategy. These recommendations are called “one-time tasks.”
Run-to-Failure: No scheduled maintenance and no effort to anticipate or prevent failure modes
that do not have an intolerable impact. The failure mode is simply allowed to occur and the item
is then repaired.

Criticality Analysis: An assessment of the impacts on safety, environment and production that
can occur if an asset fails. It is intended to determine the risk associated with such failure and
prioritize under a risk-cost-benefit concept the actions needed to either mitigate the impact or
reduce the probability of its occurrence to tolerable levels.
Availability: The ability of an item to be in a state to perform a required function under given
conditions at a given instant of time or over a given time interval, assuming that the required
external resources are provided.
Maintainability: Ability of an item under given conditions of use, to be retained in, or restored to,
a state in which it can perform a required function, when maintenance is performed under given
conditions and using stated procedures and resources.
Spare Parts Optimization: Risk-cost-benefit study performed to determine which parts and how
many of them should be stocked to assure the maintainability and optimum availability of
process plants.
Task Selection: Process followed during RCM for deciding which of the proactive tasks (if any)
are technically feasible in a given context and, if so, for deciding how often they should be done
and who should do them.
Task Analysis: Task analysis is the analysis of how a task is accomplished, including a detailed
description of both manual and mental activities, task and element durations, task frequency, task
allocation, task complexity, environmental conditions, necessary clothing and equipment, and
any other unique factors involved in or required for one or more people to perform a given task.
Reliability Centered Maintenance
Overview
The primary objective of any RCM process is to preserve system functions. A system function is
typically defined in terms of system output, throughput, safety and environment or even cost.
Preserving system function may then be reasonably defined as doing what is required to keep the
overall system operating to the level where it meets its performance targets (i.e., prescribed
output, throughput, asset integrity).
Maintenance strategies such as preventive maintenance (PM) and predictive maintenance (PdM),
among others, are developed for each asset within the facility only after evaluating its role
in preserving the system function. Once the maintenance strategies are developed, specific
maintenance tasks and monitoring practices are developed based upon documented and traceable
decisions about what should be done, why and by whom.
It is also possible that, based on a risk-cost-benefit analysis it is decided not to do any proactive
task to prevent the failure (run-to-failure strategy). This is an acceptable strategy if the cost of
doing the proactive maintenance to avoid the failure is higher than the impact if the failure
occurs.
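The run-to-failure decision described above can be sketched as a simple cost comparison. This is a minimal illustration, not the authors' method: the `FailureMode` fields, the example figures, and the simplifying assumption that the proactive task fully prevents the failure are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    failures_per_year: float        # expected frequency with no proactive task (assumed)
    cost_per_failure: float         # repair cost plus lost production per failure (assumed)
    proactive_cost_per_year: float  # yearly cost of the PM/PdM task (assumed)
    intolerable_impact: bool        # True if the safety/environmental impact is intolerable


def choose_strategy(fm: FailureMode) -> str:
    """Return 'proactive' or 'run-to-failure' for one failure mode."""
    # Run-to-failure is only considered when the impact is tolerable.
    if fm.intolerable_impact:
        return "proactive"
    expected_failure_cost = fm.failures_per_year * fm.cost_per_failure
    # Run-to-failure is acceptable when prevention costs more than the failure itself.
    if fm.proactive_cost_per_year > expected_failure_cost:
        return "run-to-failure"
    return "proactive"


if __name__ == "__main__":
    lamp = FailureMode("indicator lamp burns out", 0.5, 200.0, 1_000.0, False)
    seal = FailureMode("pump seal leak", 0.2, 50_000.0, 3_000.0, False)
    print(lamp.name, "->", choose_strategy(lamp))  # run-to-failure
    print(seal.name, "->", choose_strategy(seal))  # proactive
```

A real risk-cost-benefit study would also weigh partial task effectiveness and secondary damage, which this sketch leaves out.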
The RCM methodology can be applied in any upstream or downstream facility where operations,
maintenance and reliability are interested in starting an effort to assure the optimum availability
of processes and reliability of assets.

References
The following documents provide information and support the implementation of RCM
programs in the industry:
• CEI IEC 60300-3-11 International Standard, Dependability Management – Part 3–11: Application Guide – Reliability-Centered Maintenance, First Edition, 1999-03
• SAE JA1011, Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes
• SAE JA1012, A Guide to the Reliability-Centered Maintenance (RCM) Standard
• Moubray, John; Reliability-Centered Maintenance RCMII, 1997
Procedure
In order for the process to work properly, the answers to seven questions are required as input to
the process:
1. What are the functions and associated performance standards of the asset in its
present operating context?
2. In what ways does it fail to fulfill its functions?
3. What causes each functional failure?
4. What happens when each failure occurs? What is the effect of the failure?
5. In what way does each failure matter? Is it evident for the operator under normal
operating conditions?
6. What can be done to predict or prevent each failure?
7. What should be done if a suitable proactive task cannot be found?
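One way to keep the answers to the seven questions traceable is to store them in a small worksheet structure. The sketch below is an assumption about how such a record could look; the class and field names are hypothetical and do not come from SAE JA1011 or from the authors.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FailureModeRecord:
    cause: str                            # Q3: what causes the functional failure
    effect: str                           # Q4: what happens when the failure occurs
    consequence: str                      # Q5: in what way the failure matters
    evident_to_operator: bool             # Q5: evident under normal operating conditions?
    proactive_task: Optional[str] = None  # Q6: task to predict or prevent the failure
    default_action: Optional[str] = None  # Q7: what to do if no proactive task is suitable


@dataclass
class FunctionRecord:
    function: str                         # Q1: function and performance standard
    functional_failure: str               # Q2: way in which the function is not fulfilled
    failure_modes: List[FailureModeRecord] = field(default_factory=list)


def unanswered(record: FunctionRecord) -> List[str]:
    """Return the causes that have neither a Q6 nor a Q7 answer yet."""
    return [fm.cause for fm in record.failure_modes
            if fm.proactive_task is None and fm.default_action is None]


if __name__ == "__main__":
    rec = FunctionRecord(
        function="Relieve vessel pressure above the set point",
        functional_failure="Does not relieve pressure on demand",
        failure_modes=[
            FailureModeRecord("relief valve stuck closed", "no relief on demand",
                              "safety", False, default_action="failure-finding test"),
            FailureModeRecord("inlet line plugged", "no relief on demand",
                              "safety", False),
        ],
    )
    print(unanswered(rec))  # ['inlet line plugged']
```

The `unanswered` helper simply flags failure modes for which the team has not yet decided on a task, which is useful when reviewing a worksheet for completeness.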

Figure 1: RCM Procedure

Process Hazard Analysis (HAZOP)
Overview
Process Hazard Analysis (PHA) is a generic term used to describe qualitative risk assessments.
A PHA may use various methodologies, such as Hazard and Operability (HAZOP) or What-If/
Checklist reviews. PHAs are key components for implementing the Process Safety Management
process in the Oil & Gas industry.
A hazard and operability study (HAZOP) is a structured and systematic technique for examining a
defined system, with the objective of:
• Identifying potential hazards in the system
• Identifying potential operability problems with the system and in particular identifying
causes of operational disturbances and production deviations likely to lead to
non-conforming products.
An important benefit of HAZOP studies is that the resulting knowledge, obtained by identifying
potential hazards and operability problems in a systematic manner, is of great assistance in
determining appropriate remedial measures.
A characteristic feature of a HAZOP study is the “examination session” during which a multi-
disciplinary team under the guidance of a study leader systematically examines all relevant parts
of a design or system.
References
The following documents provide information and support the implementation of PHA / HAZOP
programs in the industry:
• CEI IEC 61882 Hazard and Operability Studies – Application guide, First Edition, 2001-05
• IEC/ISO 31010 Risk Assessment techniques, First edition, 2009-11
• IEC 61160 Design Review, Second edition, 2005-9
Procedure
The steps to conduct a HAZOP study can be listed as follows:
• Choose the study node
• Understand the design intent
• Brainstorm causes of deviations from normal operation
• Develop each cause to its global consequence(s)
• For those scenarios having consequences of interest, identify existing safeguards
• For those scenarios having consequences of interest, qualitatively assess the severity of
the predicted consequences and the likelihood that those consequences will develop, and
determine the Risk Priority Rating of the scenario
• If warranted, make recommendation(s) to reduce risk
• Repeat the process for each consequence identified and/or study section
Table 1 shows a typical form to gather information and document the results of a HAZOP.
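The qualitative step that combines severity and likelihood into a Risk Priority Rating can be illustrated with a small lookup. The four-level scales and the score thresholds below are hypothetical; each company defines its own risk matrix, and nothing here is taken from IEC 61882.

```python
# Assumed qualitative scales, ordered from lowest to highest.
SEVERITY = ["minor", "moderate", "major", "catastrophic"]
LIKELIHOOD = ["remote", "unlikely", "possible", "frequent"]


def risk_priority(severity: str, likelihood: str) -> str:
    """Combine severity and likelihood ranks into a coarse risk priority rating."""
    score = (SEVERITY.index(severity) + 1) * (LIKELIHOOD.index(likelihood) + 1)
    if score >= 9:        # assumed threshold for a recommendation being warranted
        return "high"
    if score >= 4:        # assumed threshold for further review
        return "medium"
    return "low"


if __name__ == "__main__":
    print(risk_priority("catastrophic", "possible"))  # high   (4 * 3 = 12)
    print(risk_priority("minor", "frequent"))         # medium (1 * 4 = 4)
    print(risk_priority("moderate", "remote"))        # low    (2 * 1 = 2)
```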

Relationship between HAZOP and RCM
HAZOP can either be complemented by, or performed in conjunction with, FMEA, which is also a
constituent part of RCM. The process followed to identify deviations / causes and assess
consequences may be understood as the one required by RCM for failure mode / cause and effect
/ consequence respectively. Figure 1 shows where HAZOP and RCM can interface to
meet their own objectives for design review and further asset strategy development.
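The correspondence between HAZOP and FMEA fields can be sketched as a record conversion. This is one possible reading of the mapping described above; the class names, field names and example row are hypothetical, and a real study team would still review each converted row.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HazopRow:
    node: str
    deviation: str
    cause: str
    consequence: str
    safeguards: List[str]


@dataclass
class FmeaRow:
    item: str
    failure_mode: str
    cause: str
    effect: str
    existing_controls: List[str]


def hazop_to_fmea(row: HazopRow) -> FmeaRow:
    """Seed an FMEA row from a HAZOP row (deviation -> failure mode, consequence -> effect)."""
    return FmeaRow(
        item=row.node,
        failure_mode=row.deviation,
        cause=row.cause,
        effect=row.consequence,
        existing_controls=list(row.safeguards),  # copy so the two rows stay independent
    )


if __name__ == "__main__":
    hazop = HazopRow("Separator V-100", "high pressure", "outlet valve fails closed",
                     "vessel overpressure", ["PSV", "high-pressure shutdown"])
    print(hazop_to_fmea(hazop))
```

Seeding the RCM worksheet this way keeps causes and consequences consistent across both studies, which is the consistency benefit the paper argues for.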

Figure 2: Hazard and Operability Study Procedure

Safety Integrity Level (SIL)


Overview
The Safety Integrity Level (SIL) study is a formal process that uses the results of a PHA to
determine the safety system instrumentation required to prevent and mitigate hazardous events in
downstream, chemical, and upstream onshore processes. The SIL study relies on the hazardous
scenario causes, consequences, and safeguards developed during the PHA (HAZOP) as a starting
point for determining whether adequate risk reduction has been provided for each
cause-consequence scenario using Independent Protection Layers (IPLs), and for recommending
additional measures to further mitigate the risk as necessary.
The SIL study is a structured process to validate the IPL safeguards and determine the
performance required of Safety Instrumented Functions (SIFs).

References
The following influential documents and standards provide information and support the
implementation of SIL studies in the Oil & Gas industry:
• IEC 61508: Functional Safety – Safety Related Systems, 1998 / 2000
• IEC 61511: Functional Safety: Safety Instrumented Systems for the Process Industry Sector: Parts 1, 2 & 3, June 2003
• ANSI/ISA 84.00.01-2004 (IEC 61511 MOD) – Functional Safety: Safety Instrumented Systems for the Process Industry Sector – Parts 1, 2 & 3 with Grandfather Clause
Procedure
The steps to conduct a SIL study may differ from one user’s procedure to another; however, the
basic steps based on the LOPA methodology (Layer of Protection Analysis) are listed below:
• Complete all HAZOP steps (see HAZOP section above)
• Copy the cause, consequence, safeguards, and severity ranking for each cause-consequence
scenario from the PHA worksheets (HAZOP)
• Determine the total number of IPL (Independent Protection Layer) credits required
• Identify all IPLs
• Determine the IPL credit gap
• Create SIL recommendations
• Repeat until all hazards are evaluated
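The credit-gap step in the list above reduces to simple arithmetic. The sketch below assumes, as many LOPA procedures do, that one IPL credit stands for roughly one order of magnitude of risk reduction and that a remaining gap of n credits maps to a SIL n function; both conventions and the function names are assumptions, not something stated in the paper.

```python
from typing import List


def ipl_credit_gap(required_credits: int, ipl_credits: List[int]) -> int:
    """Credits still missing after all qualifying IPLs have been counted."""
    gap = required_credits - sum(ipl_credits)
    return max(gap, 0)  # a surplus of credits is not a negative gap


def sil_recommendation(gap: int) -> str:
    """Translate a credit gap into a recommendation (assumed: gap n -> SIL n)."""
    if gap == 0:
        return "no additional SIF required"
    if gap <= 3:
        return f"add or upgrade a SIF to SIL {gap}"
    return "gap too large for a single SIF; reconsider the design"


if __name__ == "__main__":
    # Scenario needing 4 credits, with a relief valve (2 credits) and an alarm (1 credit).
    gap = ipl_credit_gap(4, [2, 1])
    print(gap, "->", sil_recommendation(gap))  # 1 -> add or upgrade a SIF to SIL 1
```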
Independent Protection Layer
Safeguards are credited as IPLs if they meet the following criteria (per guidance from the
Center for Chemical Process Safety):
• Specificity (designed specifically to prevent or mitigate the consequences of one event)
• Independence (independent of other protection layers for the hazard)
• Dependability (will do what it is designed to do)
• Auditability (designed to accommodate required testing and maintenance for regular
validation)
To determine whether or not a safeguard qualifies as an IPL, consider the “3 D’s” and the “Big I”
as follows:
1. An IPL must have the following “3 D’s”:
a. Detect: sense a condition or problem.
b. Decide: determine whether an action or intervention is necessary or not.
c. Deflect: if action is required, the action must be capable of preventing the
consequence in a timely manner.
2. An IPL must also have the “Big I”:
a. Independent of the initiating cause
b. Independent of other IPLs
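The “3 D’s” and “Big I” test lends itself to a simple checklist. The following sketch is illustrative only: the `Safeguard` fields are hypothetical yes/no answers that a study team would supply, not data that can be computed automatically.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Safeguard:
    name: str
    detects: bool                    # Detect: senses the condition or problem
    decides: bool                    # Decide: determines whether action is needed
    deflects_in_time: bool           # Deflect: action prevents the consequence in time
    independent_of_cause: bool       # Big I: independent of the initiating cause
    independent_of_other_ipls: bool  # Big I: independent of other IPLs


def failed_criteria(s: Safeguard) -> List[str]:
    """Return the names of the criteria the safeguard does not meet."""
    checks = {
        "Detect": s.detects,
        "Decide": s.decides,
        "Deflect": s.deflects_in_time,
        "Independent of initiating cause": s.independent_of_cause,
        "Independent of other IPLs": s.independent_of_other_ipls,
    }
    return [name for name, ok in checks.items() if not ok]


def qualifies_as_ipl(s: Safeguard) -> bool:
    """A safeguard is credited as an IPL only if every criterion is met."""
    return not failed_criteria(s)


if __name__ == "__main__":
    # An alarm that shares its transmitter with the control loop that initiates the event.
    alarm = Safeguard("high-level alarm", True, True, True, False, True)
    print(qualifies_as_ipl(alarm), failed_criteria(alarm))
    # False ['Independent of initiating cause']
```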

SIL Verification
Safety Integrity Level (SIL) verification is required to ensure that the target SIL for each SIF
is achieved by the proposed conceptual design and according to other requirements established in
the SIS functional requirements.
The procedure, scope, frequency and documentation of functional testing are critical aspects of
SIL verification. The target SIL and achieved SIL will be used to determine or adjust the scope
and frequency of functional testing. The RCM team should use the SIL functional testing as an
input for the task selection exercise.
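The link between test frequency and achieved SIL can be illustrated with the widely used simplified relation PFDavg ≈ λ_DU × TI / 2 for a single channel, together with the low-demand SIL bands of IEC 61508. The sketch ignores common-cause failures, diagnostics, repair time and proof-test coverage, and the failure rate in the example is a made-up figure, so it is not a substitute for a full SIL verification calculation.

```python
# Low-demand SIL bands from IEC 61508: (SIL, lower PFDavg bound, upper PFDavg bound).
SIL_BANDS = [(4, 1e-5, 1e-4), (3, 1e-4, 1e-3), (2, 1e-3, 1e-2), (1, 1e-2, 1e-1)]


def pfd_avg(lambda_du_per_hour: float, test_interval_hours: float) -> float:
    """Simplified average probability of failure on demand for a single channel."""
    return lambda_du_per_hour * test_interval_hours / 2.0


def achieved_sil(pfd: float) -> int:
    """Return the SIL band a PFDavg falls in (0 means no SIL is achieved)."""
    if pfd < 1e-5:
        return 4  # better than the SIL 4 band; SIL 4 is the highest level
    for sil, low, high in SIL_BANDS:
        if low <= pfd < high:
            return sil
    return 0


def max_test_interval_hours(lambda_du_per_hour: float, target_sil: int) -> float:
    """Test interval at which PFDavg reaches the upper bound of the target SIL band."""
    pfd_limit = 10.0 ** (-target_sil)
    return 2.0 * pfd_limit / lambda_du_per_hour


if __name__ == "__main__":
    lam = 2e-6  # assumed dangerous undetected failure rate, per hour
    print(achieved_sil(pfd_avg(lam, 8760)))    # yearly test     -> 2
    print(achieved_sil(pfd_avg(lam, 17520)))   # two-yearly test -> 1
    print(max_test_interval_hours(lam, 2))     # about 10000 hours; test more often than this
```

In this example, doubling the test interval drops the function from SIL 2 to SIL 1, which is why the functional test frequency is a direct input to RCM task selection.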
Special Case: API14C covered facilities
API14C is a recommended practice for analysis, design, installation, and testing of basic surface
safety systems for offshore production platforms.
The main objectives of API14C are:
• To prevent the release of hydrocarbons from the process and minimize adverse effects
• To provide requirements for the operating modes of the safety system
• To standardize design and testing
API14C requires the planning and execution of prescriptive tasks for functional testing of safety
devices. The integrity of offshore facilities is achieved by prescriptive, high-frequency testing
that does not take the magnitude of risk into account.
Relationship between SIL and RCM
The SIL study can either complement or be done in conjunction with RCM, as both provide asset
strategies for operations and maintenance. The process followed to identify failure-finding and
functional testing tasks may be understood as required by both RCM and SIL studies. Figure 3
shows where SIL and RCM can interface to meet their own objectives for asset strategy
development.
RCM cannot override API14C prescriptive tasks, because they are mandatory in the USA, and many
other countries and transnational companies have adopted them to set minimum requirements on
the scope and frequency of functional testing. The RCM facilitator should work with the team
members to find out whether API14C is followed for the installation that is the subject of the RCM.
Figure 3 shows how the SIL study and RCM can be related to each other.
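The rule that RCM cannot relax a prescriptive test requirement can be sketched as taking the shorter of the two intervals. The function name and the intervals in the example are hypothetical and are not taken from API14C; the facility's applicable requirements must be checked for actual values.

```python
from typing import Optional


def effective_test_interval(rcm_days: float,
                            prescriptive_days: Optional[float] = None) -> float:
    """Interval to schedule: RCM may tighten a prescriptive interval, never relax it."""
    if prescriptive_days is None:
        # No prescriptive requirement applies, so the RCM/SIL result stands.
        return rcm_days
    return min(rcm_days, prescriptive_days)


if __name__ == "__main__":
    print(effective_test_interval(180.0, 30.0))  # 30.0  (prescriptive interval governs)
    print(effective_test_interval(14.0, 30.0))   # 14.0  (RCM asks for more frequent tests)
    print(effective_test_interval(180.0))        # 180.0 (no prescriptive requirement)
```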

Figure 3: Safety Integrity Level Analysis Procedure

Lessons learned
The integration of safety-related studies and reliability has allowed the project team to
document the following lessons learned, which can now be shared with other project teams,
business units and peers in the O&G industry.
• A plan is needed for both reliability and safety-related tasks early in each project execution
phase (if not completed late in the previous phase) to ensure that funds and resources are
allocated to fully support a successful implementation
• Reliability and safety teams should ensure that scope, schedule and expectations are
not only shared but adopted by each team in order to ensure a seamless integration and
delivery of safer and more reliable assets
• If well coordinated, the reliability and safety-related workshops will
provide inputs to each other, optimize the resource allocation, and ensure
consistency across the studies and accuracy of results (risk assessment and asset
strategies)
• Third parties and contractors need to be educated on how to deliver an integrated
approach for reliability and technical safety. Currently, the silo-based organization is
not only found among users but is also replicated in most reliability and service
providers. It is common to find contractors that only provide specialized services on the
application of a single methodology but do not go beyond that to offer a more holistic and
integrated reliability program implementation
• Leveraging existing industry standards and best practices allows the
implementation of programs that cover the entire value chain of reliability engineering

(e.g. ISO20815, API17N, IEC60300). Current experience in the O&G industry
provides a framework for, and adds purpose to, the execution of reliability studies, as they are
now more oriented to safety and business continuity.
Summary
There are some clear and compelling interactions between RCM and PHA studies regarding the
identification of critical failure modes and causes, and the definition of tasks to either
mitigate the risk or reduce the probability of its occurrence to a tolerable level. Failing to
adopt an integrated approach to performing safety-related studies (PHA) and reliability analyses
will prevent the team from achieving optimum resource allocation, consistency of results across
the studies (e.g. risk assessment, failure modes) and the most effective tasks to mitigate the
associated risks.
Figure 4 shows the lessons learned from the application of safety-focused RCM.
The benefits of closing the gap between reliability and safety teams and processes have been
well recognized.

Figure 4: Lessons learned from the application of HES focused RCM


Keywords:
1. Reliability
2. Safety
3. Process Hazard Analysis
4. Maintenance
5. SIL Safety Integrity Level
6. Reliability Centered Maintenance

