
Proficy HMI - SCADA Server Redundancy


 

 
 
 
 
 
 
 
 
 

Proficy HMI/SCADA – Cimplicity

Server Redundancy Guide


Server Redundancy Outline

What is Server Redundancy
Basic Configurations
Server Redundancy Device Communications
Event Manager
How Broadcast works to Viewers
How the Datalogging Functions
The DataMerge Utility
Things to Avoid

Server Redundancy Outline

Server Redundancy is the ability to have two servers act together to run a
single project. They work as a Master/Slave pair, with one node always acting as
the hot standby to the other. When a failure is detected on the active master,
the slave asserts itself as the new master, and all client applications (e.g.
Viewers) redirect their requests to the new master (with some limitations).

This all happens without user interaction.

Server Redundancy Simple Layout

[Diagram: Primary and Secondary CIMPLICITY Servers on a shared network, each connected to the PLC]

Server Redundancy is a “Hot Standby” of a currently active CIMPLICITY
Server. It consists of two servers (a Primary and a Secondary system).
In its simplest form there are two Servers with single connections to
the PLCs.

Server Redundancy Moderate Layout

[Diagram: a Viewer on the network with the Primary and Secondary CIMPLICITY Servers, which connect to the PLC]

A moderate configuration contains at least one Viewer on the network
that the Redundant Servers communicate with.

Server Redundancy Complex Layout

[Diagram: a Viewer, a PLC, and the Primary and Secondary CIMPLICITY Servers on two networks, Network A and Network B]

A complex layout is one that has multiple Viewers and multiple PLCs, and
uses cable redundancy both between the Servers and to the PLCs.

Server Redundancy Configuration Requirements

What is required for Server Redundancy?

• The Primary must be installed/licensed as Server Development.
• The Secondary must have at least a Server Runtime license with
the same point count as the license on the Primary.
• Both nodes require the “Server Redundancy” option license.
• The systems should be similar in hardware specifications and
limitations.
• Both Servers require their own database (MSDE or MS SQL).
• The Servers must be on the same network without a router between
them. This is for Viewer to Server communications.
• Both nodes must be the exact same Version and Service Pack.
• On versions prior to Proficy HMI/SCADA – CIMPLICITY 7.0, Basic
Control Engine (BCE) must be enabled as a project option.

When you enable Server Redundancy in a project what happens?

The Workbench imports a class called “GefRedundancy” and creates an
instance of this class as an object called “Redundancy”. This object
is visible in the Workbench and has its fields filled in with the Primary and
Secondary computer names. In addition, the class adds some scripts to the
Event Editor (which is why the BCE is a required option).

NOTE:
Prior to CIMPLICITY 5.0 this class/object did not exist. When upgrading a project
from a version prior to CIMPLICITY 5.0 to version 5.0 or greater, the project must
be opened in the Workbench, the Server Redundancy option must be removed, and
then the Server Redundancy option must be added again.
This adds the new GefRedundancy class and Redundancy object.

Server Redundancy Configuration

How is it configured in the Workbench?

• On the Secondary in the Redundant pair, you create a Windows
network share. This is the location where the project will reside on
the Secondary.
• On the Primary, in the Workbench, you enable Server Redundancy
and point it to the network share on the Secondary using a mapped
network drive.

NOTE:
As of Proficy HMI/SCADA – CIMPLICITY 7.0, you are allowed to use a UNC
path (e.g. \\computer\share) instead of a mapped network drive (e.g. Z:\).

Server Redundancy Configuration Update

Primary (Local)      Secondary (Network Share)
Project              Project
  alarm_help           alarm_help
  arc                  arc
  data                 data
  lock                 lock
  log                  log
  master               (not present)
  screens              screens
  scripts              scripts
  Project.gef          Project.gef

Copy/Config Update: Primary to Secondary

When you do a configuration update in the Primary’s Workbench, the following
occurs:

• The folders for the project (with the exception of the /master folder)
are copied to the Secondary network share.
• The files in the /data folder are modified so that the system knows
that this is a Secondary in a redundant pair (i.e. no modifications
allowed).
• The configuration update is done under the security context of the
Windows user who is using the Workbench on the Primary (this matters
for network permissions).
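
The copy rule can be sketched in Python. This is only an illustration of "copy every project folder except /master", not how CIMPLICITY implements the update. The function name `config_update` and both paths are hypothetical, and the modification of the /data files is not modelled.

```python
import shutil
from pathlib import Path


def config_update(primary_project: Path, secondary_share: Path) -> None:
    """Copy the project to the Secondary share, skipping the master folder.

    Hypothetical helper that only mirrors the copy rule described above.
    """
    shutil.copytree(
        primary_project,
        secondary_share,
        # Skip only the top-level "master" folder, nothing deeper.
        ignore=lambda directory, names: (
            ["master"] if Path(directory) == primary_project else []
        ),
        dirs_exist_ok=True,  # allow repeated updates (Python 3.8+)
    )
```

Because the copy runs as the calling user, it fails with a permission error if that user cannot write to the share, which matches the point about the security context.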

Server Redundancy Functionality

Server Redundancy Core Processes

[Diagram: core processes on the Primary (master) and Secondary (slave), paired across Logical Links: Alarm Manager, User Registration, Point Manager, and Derived Point Process]

To achieve Server Redundancy, CIMPLICITY must keep all of the core processes
on the Primary and the Secondary synchronized.

This means that they must have the same data and must be in the same state
with regard to that data. This is achieved by setting up Logical Links (i.e. direct
connections) between core processes on the nodes.

For each of these core processes there is a master process and slave process.

Server Redundancy - The RtrPing Process

How do the Servers detect that a failure has occurred?

The Primary and Secondary constantly check that the other node is still there.
They do this with a process called “rtrping”, whose sole responsibility is to
ping, and respond to pings from, the other node in the redundant pair. If a ping
does not come back within the expected timeframe, an automatic failover is
initiated.

What port do the Servers ping on?

They ping each other on TCP/IP port 4000 (by default). The port can be changed
with the REDUND_PROBE_PORT global parameter. There is further information on
this in the help files.

[Diagram: on each node, W32RTR runs alongside RTRPing; the two RTRPing processes ping each other on TCP/IP port 4000]

The RtrPing processes ping each other periodically to confirm that the other node
is still there and is still active.

Server Redundancy Failover Time

How long does it take between a failure occurring and failover being initiated?

The RtrPing utility pings the other node every 5 seconds (by default). This time
is configurable with the REDUND_PROBE_INTERVAL global parameter. RtrPing
waits for that time to be exceeded and then goes through the retries. The
number of retries is 1 more than the REDUND_PROBE_COUNT global
parameter (which has a default value of 3).

So the time to detect a failure and initiate failover is:

= REDUND_PROBE_INTERVAL * (REDUND_PROBE_COUNT + 1)

By default this is:

= 5 seconds * (3 retries + 1) = 20 seconds.

For larger applications this value must be tuned.
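
As a worked example of the formula, the following Python sketch computes the detection time and models the retry rule. It is an illustration only: `detection_time`, `partner_failed`, and the `probe` callable are hypothetical names, and the real RtrPing logic may differ in detail.

```python
from typing import Callable


def detection_time(probe_interval_s: float = 5.0, probe_count: int = 3) -> float:
    """Seconds from failure to failover: interval * (count + 1)."""
    return probe_interval_s * (probe_count + 1)


def partner_failed(probe: Callable[[], bool], probe_count: int = 3) -> bool:
    """Return True only if every one of the (count + 1) probes goes unanswered.

    `probe` is a hypothetical callable that returns True when the partner
    answered within REDUND_PROBE_INTERVAL.
    """
    attempts = probe_count + 1
    # any() stops at the first answered probe, so a single reply cancels failover.
    return not any(probe() for _ in range(attempts))
```

With the defaults, `detection_time()` gives 20 seconds. Lowering either parameter shortens detection but leaves less margin for a busy server to answer in time.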

Server Redundancy Device Communications

How are Device Communications Updates dealt with?

When using Device Communications drivers in a Redundant Server
configuration, only the updates from the master Device Communications driver
are actually used.

Remember that the Point Management processes communicate with each
other via a Logical Link. To keep duplicate updates from coming in from the Device
Communications processes (which are talking to the same PLC on both nodes),
only the updates that come in on the master node are allowed to propagate to
the PTM process, and from there across to the slave PTM process.

What happens to the Device Communications updates on the Slave?

The Device Communications driver on the slave still actively polls the PLC,
but its updates are simply ignored by the Slave Point Management process.

NOTE:
In general this is true, but there are exceptions to this for Unsolicited
communications and OPC Client.

Server Redundancy - Device Communications Overview

[Diagram: the Master and Slave Device Communications drivers each poll the PLC and send updates to their local Point Manager; the Master (Primary) and Slave (Secondary) Point Managers are joined by a Logical Link]

A close-up view of Device Communications behavior in Server Redundancy.

Server Redundancy - Device Communications Detail

[Diagram: both Device Communications drivers poll the PLC and pass updates to their local Point Manager; the Master and Slave Point Managers are joined by a Logical Link. Numbered callouts correspond to the steps below.]

1. Both Device Communications drivers poll the PLC.

2. The Master Device Communications driver processes the update and
passes it to the Master Point Management (PTM) process via the
Device Communications Queue (DCQ).

3. The Master PTM process receives the update and sends it out on the
wire to the Slave PTM process.

4. The Slave Device Communications process passes its update to the
Slave PTM process, but it is ignored.

5. The Slave PTM process receives the update from the Master PTM
process and uses this update to update its client processes.
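
The steps above can be sketched as a toy model in Python. This is not the real PTM process: the `PointManager` class, its method names, and the point ID are hypothetical, and the Logical Link is reduced to a direct method call.

```python
from dataclasses import dataclass, field


@dataclass
class PointManager:
    """Toy model of a PTM process on one node."""
    is_master: bool
    partner: "PointManager | None" = None
    points: dict = field(default_factory=dict)

    def on_device_update(self, point_id: str, value: float) -> None:
        """Update arriving from the local Device Communications driver."""
        if not self.is_master:
            return  # step 4: the slave ignores its own driver's updates
        self.points[point_id] = value
        if self.partner is not None:
            self.partner.on_link_update(point_id, value)  # step 3

    def on_link_update(self, point_id: str, value: float) -> None:
        """Update arriving from the master PTM over the Logical Link (step 5)."""
        self.points[point_id] = value


master = PointManager(is_master=True)
slave = PointManager(is_master=False)
master.partner = slave
slave.on_device_update("TANK_LEVEL", 99.0)   # ignored by the slave
master.on_device_update("TANK_LEVEL", 42.0)  # stored and forwarded
```

After these two calls both nodes hold the master's value, which is the point of the rule: one source of truth even though both drivers poll the same PLC.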

Server Redundancy - Event Manager Behavior

When using the Event Manager in a Server Redundant configuration, it is
imperative to understand its behavior in this context and to program around
its limitations.

Keep the following in mind when designing a Redundant Server application
that uses the Event Manager:

• Event Manager runs on both the master and the slave at the same
time.
• Setpoint requests are always ignored by the Slave EMRP/PTM
process.
• Scripts on the Slave execute just like they would on the Master
node.
• A check should be put into the scripts to validate whether or not the
script is executing on the master. If not, then ideally it should
terminate, or do something innocuous. This is entirely at the
discretion of the user’s programming.

NOTE:
There is a function called “CimIsMaster()” that can be used in scripting to
detect if it is executing on the current master.
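
The guard pattern recommended above can be sketched in Python. In a real project the check would be written in the project's own scripting language using CimIsMaster(). Its exact signature and return type are not given in this guide, so the `is_master` callable below is a hypothetical stand-in, as is the helper name `run_on_master_only`.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_on_master_only(
    is_master: Callable[[], bool], action: Callable[[], T]
) -> Optional[T]:
    """Run `action` only when this node is the current master.

    `is_master` stands in for a master check such as CimIsMaster().
    On the slave the function returns None without side effects.
    """
    if not is_master():
        return None  # terminate quietly on the slave
    return action()
```

Placing the check at the very top of a script keeps any side effect that is not a setpoint (file writes, e-mails, database inserts) from happening twice.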

Server Redundancy - Event Manager Detail

[Diagram: both Device Communications drivers poll the PLC; the Master and Slave Point Managers are joined by a Logical Link; a Master Event Manager runs on the Primary and a Slave Event Manager on the Secondary. Numbered callouts correspond to the steps below.]

1. Both Device Communications drivers poll the PLC. The Master Device
Communications driver processes the update and passes it to the Master
Point Management (PTM) Process via the Device Communications Queue
(DCQ).

2. The Master PTM Process receives the update and sends it out on the wire
to the Slave PTM process.

3. The Master EMRP process runs whatever is configured in it. Setpoints
are written to the Master PTM process.

4. The Slave EMRP process runs (just like the master) and executes.
However, all setpoint requests are ignored.

Server Redundancy - Viewer Communication

[Diagram: the Master (Node: Primary, Project: CIMPROJ, Version: 6.1.5383) sends a UDP Broadcast to the Viewer. The Slave is Node: Secondary, with the same project and version. The Viewer's Router (W32RTR) Cache (columns: Project, Computer, Version) is still empty.]

Step 1
A Redundant Server pair starts up. Immediately the Master starts to
broadcast its information onto the network via a UDP Broadcast (port
32000).

[Diagram: the Router (W32RTR) Cache now lists Project CIMPROJ, Computer Primary, Version 6.1.5383. Viewer PTM Communications go to the Master (Node: Primary). The Slave is Node: Secondary.]

Step 2
The Viewer updates its cache with the relevant data and directs its PTM
communications to the active master.

[Diagram: after the failover, Node: Primary is the Slave and Node: Secondary is the Master. The Router (W32RTR) Cache still lists Project CIMPROJ, Computer Primary, Version 6.1.5383.]

Step 3
The Secondary transitions to master because of a failover.

[Diagram: the new Master (Node: Secondary) sends a UDP Broadcast to the Viewer. The Router (W32RTR) Cache now lists Project CIMPROJ, Computer Secondary, Version 6.1.5383. Node: Primary is the Slave.]

Step 4
The Secondary starts sending its UDP broadcast, with the
updated computer name.

[Diagram: the Router (W32RTR) Cache lists Project CIMPROJ, Computer Secondary, Version 6.1.5383. Viewer PTM Communications now go to the Master (Node: Secondary). Node: Primary is the Slave.]

Step 5
The Viewer detects the change and redirects its PTM communications to
the new active master.

Server Redundancy - Viewer Communication Limitations

Redundant Servers are still able to do Viewer to Server communication.
However, there are a few limitations.

• The only broadcast mechanism is UDP Broadcast (standard
Project Broadcast).
• Multicast is not supported (it will not function).
• Viewer-to-Server by Nodename can function, but has severe
restrictions. These include the inability to automatically
fail over to the active master when the Redundant pair fails over.
It also requires additional configuration.

NOTE:
Viewer-to-Server by Nodename is NOT recommended and should only be considered
for special situations.

How does the Project Broadcast mechanism work?

When a project first starts on the master node, it immediately sends out a Project
Broadcast (UDP port 32000) on the network.

The broadcast contains the following:

• Project Name
• Computer Name
• CIMPLICITY Version

This broadcast goes out once every 90 seconds by default, unless a Viewer
requests it sooner. When a Viewer receives the broadcast, it connects to both
of the servers in the redundant pair via TCP/IP and directs its PTM
requests to the current active master. When a failover occurs, the new master
immediately starts to broadcast and the slave halts broadcasting. The Viewer’s
local PTM process then directs its activities to the new active master.
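
The Viewer-side bookkeeping can be sketched in Python. This is a simplified model, not the W32RTR implementation: `ProjectBroadcast` and `RouterCache` are hypothetical names, and the UDP transport and the wire format of the broadcast are left out because this guide does not describe them.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ProjectBroadcast:
    """The three fields the Project Broadcast is said to carry."""
    project: str
    computer: str
    version: str


class RouterCache:
    """Toy model of the Viewer's router cache, keyed by project name."""

    def __init__(self) -> None:
        self._entries: Dict[str, ProjectBroadcast] = {}

    def on_broadcast(self, msg: ProjectBroadcast) -> bool:
        """Store the broadcast; return True if the active master changed."""
        previous = self._entries.get(msg.project)
        self._entries[msg.project] = msg
        return previous is None or previous.computer != msg.computer

    def active_master(self, project: str) -> Optional[str]:
        """Computer name that PTM requests should currently be sent to."""
        entry = self._entries.get(project)
        return entry.computer if entry else None
```

A `True` result from `on_broadcast` is the moment at which a Viewer would redirect its PTM requests. Periodic broadcasts from an unchanged master return `False` and cause no redirect.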

Server Redundancy - Database Logging

Server Redundancy - Database Logger

The data logging capability of a Server Redundant pair is generally not well
understood. This is mainly because the Data Logger prompts you for all
of the DSNs to be used on both the Primary and the Secondary.

Rules of Server Redundant Data Logging:

1. Logging is done in parallel on the Primary and the Secondary
(regardless of which is the current active master).
2. The Primary only logs to the “Master” ODBC DSNs. The Secondary only
logs to the “Slave” ODBC DSNs. In this context the names are a misnomer:
they follow the Primary/Secondary node, not the current master/slave role.
3. The ODBC data sources “master” and “slave” must be identical on
the Primary and Secondary (i.e. the “master” DSN must point to the same
database regardless of which node the DSN is configured on).
4. The DSNs on the Primary and Secondary must have exactly the same
names.

The Primary only logs to the “Master” ODBC DSNs.
The Secondary only logs to the “Slave” ODBC DSNs.

[Diagram: the Alarm Manager, User Registration, and Point Manager pairs are joined by Logical Links. On the Primary, Master DL performs Master Alarm Logging and Master PTDL performs Master Point Logging; on the Secondary, Slave DL performs Slave Alarm Logging and Slave PTDL performs Slave Point Logging.]

Server Redundancy Data Logging is done in PARALLEL, regardless of who is
master or slave.

Server Redundancy - Database Logger

Why configure the DSNs on both nodes if only one of them is ever going to
actually be used on each node?

There are two reasons:

1. This allows the project configuration on the Primary node to be
independent of the Secondary. If you didn’t configure them identically, then
once you configured the Primary DSN you would have to go to the
Secondary and configure its DSN in the project. Keeping the configuration in
one place avoids this duplication of effort.

NOTE:
Remember that you aren’t actually allowed to make project configuration
changes on the Secondary.

2. The Primary will actually talk to the Secondary’s database (and vice
versa), but only through a utility called “Datamerge”, and not through the core
logging processes.

Server Redundancy – DataMerge Utility

What does the Datamerge Utility do?

The Datamerge utility is used to synchronize the databases between the Primary
and the Secondary servers. When one of the Servers in the redundant pair is
stopped or goes down, and then restarts, it will have a gap in its SQL database
for the time period it was down. The Datamerge utility reviews the time period
that the Server was down and attempts to copy the missing data from the
local database into the other Server’s database.

NOTE:
The only way the utility can know where both databases are is if the ODBC
DSNs are configured identically on both nodes.

[Diagram: Primary (Master, ONLINE) and Secondary (Slave, ONLINE) on the network]

Step 1

Both systems are online and in a master/slave configuration.

[Diagram: the Primary (master) is taken OFFLINE; the Secondary, still ONLINE, asserts itself as the new master and creates the file ptnr_<timestamp>.log]

Step 2

a) Primary is taken offline.

b) Secondary asserts itself as master, creates the ptnr_<timestamp>.log
file, and puts the date/time into the file. The file is plain text.

[Diagram: the Primary comes back ONLINE as slave; the Secondary (master, ONLINE) detects it and appends to the file ptnr_<timestamp>.log]

Step 3

a) Primary comes back online.

b) Secondary detects that the Primary is back and adds the date/time to the
ptnr_<timestamp>.log file.

What are the ptnr_*.log files used for?

These files are used by the Datamerge utility. They contain the timeframe for
which the data must be merged between the two databases. The ptnr_*.log files
are generated on both the Primary and Secondary. The local node’s database
holds the data for that timeframe, and that data must be inserted
into the other node’s database.

Server Redundancy – DataMerge Utility

[Diagram: the Datamerge utility reads the file ptnr_011520071200.log; the Primary Database and Secondary Database are shown above the Primary and Secondary nodes]

Step 1
The Datamerge utility launches and reads the timeframe from all of the
ptnr_*.log files.

[Diagram: the Datamerge utility queries data from one database and inserts it into the other, for the timeframe in ptnr_011520071200.log]

Step 2
The Datamerge utility queries the data from the local Database and inserts
the results into the remote Database for the time specified in the ptnr_*.log
file(s).
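
The query-and-insert step can be sketched in Python, with SQLite standing in for the two ODBC databases. This is a sketch under stated assumptions, not the Datamerge implementation: the `merge_gap` function and the `point_log` table and its columns are hypothetical. The timeframe is passed in directly, because this guide does not specify the layout of the ptnr_*.log files.

```python
import sqlite3


def merge_gap(
    local: sqlite3.Connection, remote: sqlite3.Connection, start: str, end: str
) -> int:
    """Copy rows logged locally between start and end into the remote database.

    Assumes a hypothetical table point_log(ts TEXT, point_id TEXT, value REAL)
    with ISO 8601 timestamps, so string comparison orders rows by time.
    Returns the number of rows copied.
    """
    rows = local.execute(
        "SELECT ts, point_id, value FROM point_log WHERE ts BETWEEN ? AND ?",
        (start, end),
    ).fetchall()
    with remote:  # commit on success, roll back on error
        remote.executemany(
            "INSERT INTO point_log (ts, point_id, value) VALUES (?, ?, ?)", rows
        )
    return len(rows)
```

Running the insert inside a single transaction means an interrupted merge leaves the remote database unchanged rather than half filled. The sketch does not guard against inserting the same timeframe twice.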

Server Redundancy Pitfalls


Do not exceed 40% steady state CPU Utilization

If you exceed 40% CPU usage, the system runs the risk of not replying to
the pings between the Primary and the Secondary within an acceptable
timeframe. This can cause a “spontaneous failover”.

Do not break the network connection between the Master and Slave

This causes Dual Master mode to occur on the systems. It is difficult to recover
from, because both nodes think that they are the master in the redundant pair.

Never start both the Primary and Secondary at the exact same time

This is an issue because the pings may not be properly processed until the project
is completely started. This can cause the two servers to fight for control (fail over
back and forth) because they both believe the other isn’t running. Start the
master, let it finish, and then start the slave.
