Proficy HMI - SCADA Server Redundancy

Proficy HMI/SCADA – CIMPLICITY
Server Redundancy Guide

Server Redundancy Outline
What is Server Redundancy
Basic Configurations
Server Redundancy Device Communications
Event Manager
How Broadcast works to Viewers
How the Datalogging Functions
The DataMerge Utility
Things to Avoid
What is Server Redundancy?
Server Redundancy is the ability to have two servers act together to run a
single project. They work as a Master/Slave pair, with one node always acting as
the hot standby to the other. When a failure is detected on the current
active master, the slave asserts itself as the new master, and all client
applications (e.g. Viewers) transition their requests to the new master (with some
limitations).
This all happens without user interaction.
Server Redundancy Simple Layout
[Diagram: two CIMPLICITY Servers, a Primary and a Secondary, on one Network, each connected to the PLC.]
Server Redundancy is a “Hot Standby” of a currently active CIMPLICITY
Server. It comprises two servers (a Primary and a Secondary system). In its
simplest form there are two Servers with single connections to the PLCs.
Server Redundancy Moderate Layout
[Diagram: a Viewer and the Primary and Secondary servers on one Network, with the servers connected to the PLC.]
Moderate configurations contain at least one Viewer on the network
that the Redundant Servers communicate with.
Server Redundancy Complex Layout
[Diagram: a Viewer, a PLC, and the Primary and Secondary servers, all connected to both Network A and Network B.]
A complex layout is one that has multiple Viewers, multiple PLCs, and
uses Cable Redundancy for Server Redundancy as well as PLC Cable
Redundancy.
Server Redundancy Configuration Requirements
What is required for Server Redundancy?
• The Primary must be installed/licensed as Server Development.
• The Secondary must have at least a Server Runtime license with
the same point count as the license on the Primary.
• Both nodes require the “Server Redundancy” option license.
• The systems should be similar in hardware specifications and
limitations.
• Both Servers require their own database (MSDE or MS SQL).
• The Servers must be on the same network without a router between
them. This is for Viewer-to-Server communications.
• Both nodes must be the exact same Version and Service Pack.
• On versions prior to Proficy HMI/SCADA – CIMPLICITY 7.0, Basic
Control Engine (BCE) must be enabled as a project option.
When you enable Server Redundancy in a project, what happens?
The Workbench imports and adds a class called “GefRedundancy” and
creates an instance of this class as an object called “Redundancy”. This object
is visible from the Workbench and has its fields filled in with the Primary and
Secondary computer names. In addition, the class adds some scripts to the
Event Editor (which is why the BCE is a required option).
NOTE:
Prior to CIMPLICITY 5.0 this class/object did not exist. When upgrading a project
from a version prior to CIMPLICITY 5.0 to version 5.0 or greater, it must be
opened in the Workbench, the Server Redundancy option must be removed, and
then the Server Redundancy option must be added again.
This adds the new GefRedundancy class and Redundancy object.
Server Redundancy Configuration
How is it configured in the Workbench?
• On the Secondary in the Redundant pair, create a Windows network
share. This is the location where the project will reside on the
Secondary.
• On the Primary, enable Server Redundancy in the Workbench and
point it to the network share on the Secondary using a mapped
network drive.
NOTE:
As of Proficy HMI/SCADA – CIMPLICITY 7.0, you are allowed to use a UNC
path (e.g. \\computer\share) instead of a mapped network drive (e.g. Z:\).
Server Redundancy Configuration Update
[Diagram: Copy/Config Update from the Primary to the Secondary]
Primary (Local)        Secondary (Network Share)
Project                Project
  alarm_help             alarm_help
  arc                    arc
  data                   data
  lock                   lock
  log                    log
  master                 -
  screens                screens
  scripts                scripts
  Project.gef            Project.gef
When you do a configuration update in the Workbench on the Primary, the
following occurs:
• The folders for the project (with the exception of the /master folder)
are copied to the Secondary Network share.
• The files in the /data folder are modified so that the system knows
that this is a Secondary in a redundant pair (i.e. no modifications
allowed).
• The configuration update is done under the security context of the
Windows User who is using the Workbench on the Primary (relates
to Network Permissions).
Server Redundancy Functionality
Server Redundancy Core Processes
[Diagram: Logical Links between the core processes on the Primary and the Secondary]
Primary                         Secondary
Master Alarm Manager            Slave Alarm Manager
Master User Registration        Slave User Registration
Master Point Manager            Slave Point Manager
Master Derived Point Process    Slave Derived Point Process
In order to achieve Server Redundancy, CIMPLICITY has to make sure that all of
the core processes on the Primary and the Secondary are synchronized.
This means that they must have the same data and must be in the same state
with regard to that data. This is achieved by setting up Logical Links (i.e. direct
connections) between the core processes on the two nodes.
For each of these core processes there is a master process and a slave process.
Server Redundancy - The RtrPing Process
How do the Servers detect that a failure has occurred?
The Primary and Secondary constantly check that the other node is still there.
They do this by using a process called “rtrping”, whose sole responsibility is
to actively ping, and respond to pings from, the other node in the redundant
pair. If a ping does not come back within the appropriate timeframe, an
automatic failover is initiated.
What port do the Servers ping on?
They ping each other on TCP/IP port 4000 (by default). The port can be changed
with the REDUND_PROBE_PORT global parameter. The help files have further
information on this.
[Diagram: the Primary and the Secondary each show W32RTR and RTRPing on TCP/IP Port 4000, with Ping arrows in both directions between the two RTRPing processes.]
The RtrPing processes ping each other periodically to confirm that the other node
is still there and is still active.
Server Redundancy Failover Time
How long does it take between a failure and it being detected and
failover being initiated?
The RtrPing utility pings the other node every 5 seconds (by default). This
interval is configurable with the REDUND_PROBE_INTERVAL global parameter.
RtrPing waits for the interval to be exceeded and then goes through the
retries. The number of retries is 1 more than the REDUND_PROBE_COUNT global
parameter (which has a default value of 3).
So the formula for the failover detection time is:
= REDUND_PROBE_INTERVAL * (REDUND_PROBE_COUNT + 1)
By default this is:
= 5 seconds * (3 retries + 1) = 20 seconds.
For larger applications this value must be tuned.
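The formula above can be worked through as a small calculation. This is only an illustrative sketch in Python (not CIMPLICITY code); the function name is hypothetical, and the two arguments simply stand for the REDUND_PROBE_INTERVAL and REDUND_PROBE_COUNT global parameters with their stated defaults.

```python
def failover_detection_seconds(probe_interval: float = 5.0, probe_count: int = 3) -> float:
    """Detection time = REDUND_PROBE_INTERVAL * (REDUND_PROBE_COUNT + 1)."""
    return probe_interval * (probe_count + 1)


# Defaults from the text: 5 seconds * (3 + 1) = 20 seconds.
print(failover_detection_seconds())

# Doubling the interval doubles the detection time.
print(failover_detection_seconds(probe_interval=10.0))
```

Raising either parameter makes the pair more tolerant of slow ping replies, at the cost of a longer wait before a real failure is acted on.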
Server Redundancy Device Communications
How are Device Communications Updates dealt with?
When using Device Communications drivers in a Redundant Server
configuration, only the updates from the master Device Communications driver
are actually used.
Remember that the Point Management processes communicate with each
other via a Logical Link. To keep duplicate updates from coming in from the Device
Communications processes (which are talking to the same PLC on both nodes),
only the updates that come in from the master node are allowed to propagate to
the PTM process, and from there across to the slave PTM process.
What happens to the Device Communications updates on the Slave?
The Device Communications driver on the slave still actively polls the PLC,
but its updates are simply ignored by the Slave Point Management process.
NOTE:
In general this is true, but there are exceptions to this for Unsolicited
communications and OPC Client.
Server Redundancy - Device Communications Overview
[Diagram: the Master and Slave Device Communications processes both Poll the PLC, and each passes an Update to the Point Manager on its own node (Primary and Secondary).]
A close-up view of Device Communications behavior in Server Redundancy.
Server Redundancy - Device Communications Detail
[Diagram: the Master and Slave Device Communications processes Poll the PLC and pass Updates to the Point Managers on the Primary and Secondary; the numbers 1 to 4 mark the steps listed below.]
1. Both Device Communications drivers poll the PLC.
2. The Master Device Communications driver processes the update and
passes it to the Master Point Management (PTM) Process via the
Device Communications Queue (DCQ).
3. The Master PTM Process receives the update and sends it out on the
wire to the Slave PTM process.
4. The Slave Device Communications Process passes its update to the
Slave PTM process, but it is ignored.
5. The Slave PTM process receives the update from the Master PTM
Process and uses this update to update its client processes.
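The five steps above can be sketched as a toy model. This is an illustration in Python, not the CIMPLICITY implementation: the class name, method names, and the point name TANK_LEVEL are all hypothetical, and the Logical Link is reduced to a direct method call.

```python
from dataclasses import dataclass, field


@dataclass
class PointManager:
    """Toy stand-in for a PTM process (hypothetical, for illustration only)."""
    is_master: bool
    partner: "PointManager | None" = None
    values: dict = field(default_factory=dict)

    def on_devcom_update(self, point_id, value) -> bool:
        """Update arriving from this node's own Device Communications driver."""
        if not self.is_master:
            return False  # step 4: the slave ignores its local driver's update
        self.values[point_id] = value  # step 2: the master accepts the update
        if self.partner is not None:
            self.partner.on_partner_update(point_id, value)  # step 3
        return True

    def on_partner_update(self, point_id, value) -> None:
        """Update received from the master PTM (step 5)."""
        self.values[point_id] = value


master = PointManager(is_master=True)
slave = PointManager(is_master=False)
master.partner = slave

slave.on_devcom_update("TANK_LEVEL", 99)   # ignored on the slave
master.on_devcom_update("TANK_LEVEL", 42)  # accepted and forwarded
print(slave.values)
```

The point of the model is that the slave's values only ever change through the master, so both nodes hold the same data even though both drivers poll the PLC.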
Server Redundancy - Event Manager Behavior
When using the Event Manager in a Server Redundant configuration, it is
imperative to understand how it behaves in this context and to program around
its limitations.
Keep the following in mind when designing a Redundant Server application
that uses the Event Manager:
• Event Manager runs on both the master and the slave at the same
time.
• Setpoint requests are always ignored on the Slave EMRP/PTM
process.
• Scripts on the Slave execute just like they would on the Master
node.
• A check should be put into the scripts to validate whether or not the
script is executing on the master. If not, then ideally it should
terminate, or do something innocuous. This is entirely at the
discretion of the user’s programming.
NOTE:
There is a function called “CimIsMaster()” that can be used in scripting to
detect if it is executing on the current master.
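The master check described above is a simple guard at the top of a script. The sketch below shows the pattern in Python purely for illustration; it is not CIMPLICITY script code. The `is_master` argument is a hypothetical stand-in for whatever the real script uses to ask the question (CimIsMaster() in the note above), and `run_on_master_only` is an invented helper name.

```python
from typing import Callable


def run_on_master_only(is_master: Callable[[], bool], action: Callable[[], None]) -> bool:
    """Run `action` only when `is_master()` reports that this node is the master.

    Returns True if the action ran, False if the script exited early.
    """
    if not is_master():
        return False  # on the slave: terminate without side effects
    action()
    return True


log = []
run_on_master_only(lambda: False, lambda: log.append("wrote report"))  # slave: skipped
run_on_master_only(lambda: True, lambda: log.append("wrote report"))   # master: runs
print(log)
```

Putting the check first keeps side effects such as file writes or messages from happening twice, once on each node.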
Server Redundancy - Event Manager Detail
[Diagram: on the Primary and the Secondary, the Device Communications process polls the PLC and sits above that node's Point Manager and Event Manager (Master on the Primary, Slave on the Secondary); the numbers 1 to 4 mark the steps listed below.]
1. Both Device Communications drivers poll the PLC. The Master Device
Communications driver processes the update and passes it to the Master
Point Management (PTM) Process via the Device Communications Queue
(DCQ).
2. The Master PTM Process receives the update and sends it out on the wire
to the Slave PTM process.
3. The Master EMRP process runs whatever is configured in it. Setpoints
are written to the Master PTM process.
4. The Slave EMRP process runs and executes just like the master.
However, all setpoint requests are ignored.
Server Redundancy - Viewer Communication
[Diagram: the Master (Node: Primary, Project: CIMPROJ, Version: 6.1.5383) sends a UDP Broadcast to the Viewer. The Slave is Node: Secondary, Project: CIMPROJ, Version: 6.1.5383. The Viewer's Router (W32RTR) Cache has the columns Project, Computer, and Version, with no entries yet.]
Step 1
A Redundant Server pair starts up. Immediately the Master starts to
broadcast its information onto the network via a UDP Broadcast (port
32000).
[Diagram: the Viewer's cache now holds CIMPROJ / Primary / 6.1.5383, and Viewer PTM Communications go to the Master (Node: Primary). The Slave is Node: Secondary.]
Step 2
The Viewer updates its cache with the relevant data and directs its PTM
communications to the active master.
[Diagram: the roles have swapped. The Slave is now Node: Primary and the Master is Node: Secondary. The Viewer's cache still holds CIMPROJ / Primary / 6.1.5383.]
Step 3
The Secondary transitions to be the master due to a failover.
[Diagram: the Master (Node: Secondary) sends a UDP Broadcast to the Viewer, whose cache shows CIMPROJ / Secondary / 6.1.5383. The Slave is Node: Primary.]
Step 4
The Secondary starts sending its UDP broadcast, with the updated
computer name.
[Diagram: the Viewer's cache holds CIMPROJ / Secondary / 6.1.5383, and Viewer PTM Communications now go to the Master (Node: Secondary). The Slave is Node: Primary.]
Step 5
The Viewer detects the change and redirects its PTM communications to
the new active master.
Server Redundancy - Viewer Communication Limitations
Redundant Servers are still able to do Viewer-to-Server communication.
However, there are a few limitations.
• The only broadcast mechanism is UDP Broadcasts (standard
Project Broadcast).
• Multicast is not supported (will not function).
• Viewer-to-Server by Nodename can function, but has severe
restrictions. These restrictions include the inability to automatically
fail over to the active master when the Redundant pair fails over.
It also requires additional configuration.
NOTE:
Viewer-to-Server by Nodename is NOT recommended and should only be
considered for special situations.
How does the Project Broadcast mechanism work?
When a project first starts on the master node, it immediately sends out a Project
Broadcast (UDP port 32000) on the network.
The broadcast contains the following:
• Project Name
• Computer Name
• CIMPLICITY Version
This broadcast goes out once every 90 seconds by default, unless a Viewer
requests it sooner. When a Viewer receives the broadcast, it connects to both
of the servers in the redundant pair via a TCP/IP connection and directs its PTM
requests to the current active master. When a failover occurs, the new master
immediately starts to broadcast and the slave halts broadcasting. The Viewer’s
local PTM process then directs its activities to the new active master.
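The Viewer-side bookkeeping described above can be sketched as a small cache update. This is an illustration in Python only: the guide does not describe the wire format of the broadcast, so `ProjectBroadcast` is a hypothetical already-parsed message carrying the three listed fields, and `apply_broadcast` is an invented helper rather than part of CIMPLICITY.

```python
from typing import NamedTuple


class ProjectBroadcast(NamedTuple):
    """Hypothetical parsed form of a Project Broadcast (fields from the list above)."""
    project: str
    computer: str
    version: str


def apply_broadcast(cache: dict, msg: ProjectBroadcast) -> bool:
    """Record the broadcasting node as the project's active master.

    Returns True when the active master for the project has changed,
    which is when the Viewer would redirect its PTM requests.
    """
    previous = cache.get(msg.project)
    cache[msg.project] = msg
    return previous is not None and previous.computer != msg.computer


cache = {}
apply_broadcast(cache, ProjectBroadcast("CIMPROJ", "Primary", "6.1.5383"))
changed = apply_broadcast(cache, ProjectBroadcast("CIMPROJ", "Secondary", "6.1.5383"))
print(changed, cache["CIMPROJ"].computer)
```

Because only the current master broadcasts, the most recent broadcast for a project is enough to tell the Viewer where to send its requests.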
Server Redundancy - Database Logging
Server Redundancy - Database Logger
The Data Logging capability of a Server Redundant pair is generally not very well
understood. This is mainly because the Data Logger prompts you for all
of the DSNs to be used on both the Primary and the Secondary.
Rules of Server Redundant Data Logging:
1. Logging is done in parallel on the Primary and the Secondary
(regardless of which is the current active master).
2. The Primary only logs to the “Master” ODBC DSNs. The Secondary only
logs to the “Slave” ODBC DSNs. In this context the names are a misnomer:
they follow the Primary and Secondary nodes, not the current master and
slave roles.
3. The “master” and “slave” ODBC data sources must be identical on
the Primary and Secondary (i.e. the “master” DSN must point to the same
database regardless of which node the DSN is configured on).
4. The DSNs on the Primary and Secondary must have the exact same
names.
The Primary only logs to the “Master” ODBC DSNs.
The Secondary only logs to the “Slave” ODBC DSNs.
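Rules 3 and 4 lend themselves to a quick consistency check. The sketch below is an illustration in Python, not a CIMPLICITY or ODBC tool: it assumes you have already collected, by whatever means, a mapping of DSN name to database target for each node, and the DSN names and targets shown are made up.

```python
def dsn_config_problems(primary: dict, secondary: dict) -> list:
    """Compare DSN name -> database target mappings collected from the two nodes."""
    problems = []
    # Rule 4: the same DSN names must exist on both nodes.
    for name in sorted(set(primary) ^ set(secondary)):
        node = "Secondary" if name in primary else "Primary"
        problems.append(f"DSN {name!r} is missing on the {node}")
    # Rule 3: a given DSN must point to the same database on both nodes.
    for name in sorted(set(primary) & set(secondary)):
        if primary[name] != secondary[name]:
            problems.append(f"DSN {name!r} points to different databases")
    return problems


# Hypothetical DSN names and targets, for illustration only.
primary = {"MASTER_LOG": "serverA/cimlog", "SLAVE_LOG": "serverB/cimlog"}
secondary = {"MASTER_LOG": "serverB/cimlog"}
for problem in dsn_config_problems(primary, secondary):
    print(problem)
```

An empty result means the two nodes agree on both the DSN names and where each one points.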
Server Redundancy - Database Logging
[Diagram: Logical Links join the Alarm Manager, User Registration, and Point Manager processes on the Primary (Master) and the Secondary (Slave). Below them, each node has a DL process paired with Alarm Logging and a PTDL process paired with Point Logging: Master DL and Master PTDL on the Primary, Slave DL and Slave PTDL on the Secondary.]
Server Redundancy Data Logging is done in PARALLEL and is done regardless
of who is master or slave.
Server Redundancy - Database Logger
Why configure the DSNs on both nodes if only one of them is ever going to
actually be used on each node?
There are two reasons:
1. This allows the project configuration on the Primary node to be
independent of the Secondary. If you didn’t configure them identically, then
once you configured the Primary’s DSN you would have to go to the
Secondary and configure its DSN in the project. Keeping the configuration in
one place avoids this duplication of effort.
NOTE:
Remember that you aren’t actually allowed to make project configuration
changes on the Secondary.
2. The Primary will actually talk to the Secondary’s database (and vice
versa), but only through a utility called “Datamerge”, and not through the
core logging processes.
Server Redundancy – DataMerge Utility
What does the Datamerge Utility do?
The Datamerge utility is used to synchronize the databases between the Primary
and the Secondary servers. When one of the Servers in the redundant pair is
stopped or goes down, and then restarts, it will have a gap in its SQL database
for the time period it was down. The Datamerge utility reviews the time period
that the Server was down and attempts to migrate the missing data from the
local database to the other Server’s database.
NOTE:
The only way the utility can know where both databases are is if the ODBC
DSN’s are configured identically on both of them.
[Diagram: the Primary (Master, ONLINE) and the Secondary (Slave, ONLINE) on the Network.]
Step 1
Both systems are online and in a master/slave configuration.
[Diagram: the Primary (Master) is taken OFFLINE; the Secondary (Slave, ONLINE) asserts itself as the new master and creates the file ptnr_<timestamp>.log.]
Step 2
a) Primary is taken offline.
b) Secondary asserts itself as master, creates the ptnr_<timestamp>.log
file, and puts the date/time into the file. The file is plain text.
[Diagram: the Primary comes back ONLINE as Slave; the Secondary (Master, ONLINE) detects the Primary and appends to the file ptnr_<timestamp>.log.]
Step 3
a) Primary comes back online.
b) Secondary detects the Primary is back and adds the date/time to the
ptnr_<timestamp>.log file.
What are the ptnr_*.log files used for?
These files are used by the Datamerge utility. They contain the timeframe for
which the data must be merged between the two databases. The ptnr_*.log files
are generated on both the Primary and Secondary. The local node’s database
holds the data for that timeframe, and the data must be inserted into the
other node’s database.
[Diagram: Datamerge reads the file ptnr_011520071200.log. The Primary Database and Secondary Database are shown above the Primary (Master) and Secondary (Slave).]
Step 1
The Datamerge utility launches and reads the timeframe from all of the
ptnr_*.log files.
[Diagram: Datamerge queries data from one database and inserts it into the other (Query Data, Insert Data), using ptnr_011520071200.log.]
Step 2
The Datamerge utility queries the data from the local Database and inserts
the results into the remote Database for the time specified in the ptnr_*.log
file(s).
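The two Datamerge steps can be sketched end to end. This is a minimal illustration in Python and is not how the Datamerge utility is implemented. Several things are assumptions made for the example: the ptnr file is taken to hold two date/time lines (outage start and end) in the format shown, the table name `data_log` and its columns are invented, and in-memory SQLite databases stand in for the two nodes' SQL databases.

```python
import sqlite3
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format; the real file format is not given


def read_window(ptnr_text: str):
    """Return (start, end) from a two-line ptnr-style log (assumed layout)."""
    lines = [ln.strip() for ln in ptnr_text.splitlines() if ln.strip()]
    start, end = (datetime.strptime(ln, FMT) for ln in lines[:2])
    return start, end


def make_db(rows):
    """Create an in-memory stand-in for one node's logging database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE data_log (ts TEXT, point_id TEXT, value REAL)")
    db.executemany("INSERT INTO data_log VALUES (?, ?, ?)", rows)
    db.commit()
    return db


def merge_window(local, remote, start, end) -> int:
    """Query local rows inside [start, end] and insert them into the remote database."""
    rows = local.execute(
        "SELECT ts, point_id, value FROM data_log WHERE ts BETWEEN ? AND ?",
        (start.strftime(FMT), end.strftime(FMT)),
    ).fetchall()
    remote.executemany("INSERT INTO data_log VALUES (?, ?, ?)", rows)
    remote.commit()
    return len(rows)


# The local node kept logging during the outage; the remote node has a gap.
local = make_db([
    ("2007-01-15 11:59:00", "P1", 1.0),
    ("2007-01-15 12:05:00", "P1", 2.0),
    ("2007-01-15 12:20:00", "P1", 3.0),
])
remote = make_db([("2007-01-15 11:59:00", "P1", 1.0)])
start, end = read_window("2007-01-15 12:00:00\n2007-01-15 12:10:00\n")
print(merge_window(local, remote, start, end))
```

Only the row logged inside the recorded window is copied, which is why the timeframe in the ptnr file matters: it bounds the query so that data both nodes already hold is not inserted again.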
Server Redundancy Pitfalls
Do not exceed 40% steady state CPU Utilization
If you exceed 40% CPU usage, the system runs the risk of not replying to
the pings between the Primary and the Secondary within an acceptable
timeframe. This can cause a “spontaneous failover”.
Do not break the network connection between the Master and Slave
This causes Dual Master mode to occur on the systems. It is difficult to recover
from, as both nodes think that they are the master in the redundant pair.
Never start both the Primary and Secondary at the exact same time
This is an issue, as the pings may not be properly processed until the project is
completely started. This can cause the two servers to fight for control (fail over
back and forth) because each believes the other isn’t running. Start the
master, let it finish, and then start the slave.