Seamless communication with fault tolerance is one of the key requirements for Ethernet-based, mission-critical, real-time systems such as substation automation systems (SAS), automation networks and other industrial Ethernet networks. Ethernet as standardized by the Institute of Electrical and Electronics Engineers (IEEE) in IEEE 802.3 [1] does not by itself support fault-tolerant networks. A basic requirement of every Ethernet network is the avoidance of loops. Every loop causes data frames to circulate forever, thereby flooding the network. For this reason, no alternative paths to any device may exist; only a single active path to any end device is allowed at any time. If a redundant path were added to the network, it would create switching loops and broadcast storms that consume all available bandwidth and interrupt communication throughout the network. Because standard Ethernet does not provide fault tolerance [
2], the availability of Ethernet networks can be increased by using redundancy protocols. Various redundancy protocols for Ethernet have been developed and standardized, such as the rapid spanning tree protocol (RSTP) [3], media redundancy protocol (MRP) [4], rapid ring recovery (RRR) [5], time-sensitive networking (TSN) [6,7], parallel redundancy protocol (PRP) [8], high-availability seamless redundancy (HSR) [
8] and others. RSTP and MRP provide redundancy within the network. RSTP can be applied in arbitrary mesh topologies. It implements the distributed computation of a spanning tree based on path costs and priorities. This tree is the active topology, which is established by blocking switch ports. In the case of failure, the tree is reconfigured. MRP is restricted to ring topologies. A dedicated node, the ring manager, blocks one of its ring ports to establish a line as the active topology. In the case of failure, this line breaks into two isolated lines that are reconnected by unblocking the previously blocked port. Both RSTP and MRP suffer from switchover delay: RSTP has a recovery time ranging from several hundred milliseconds to 2 s [
9], whereas the switchover delay of MRP is on the order of a few milliseconds [10]. RRR is an approach for fast failure detection and recovery in Ethernet ring topologies that can re-converge within a few hundred microseconds of a failure [
5]. TSN is a set of standards developed by the Time-Sensitive Networking Task Group of IEEE 802.1 [6]. TSN is the second generation of the Audio Video Bridging (AVB) standards [11] and is being developed to address the requirements of industrial automation and control networks and of automotive in-vehicle networks. TSN will enable IEEE 802 Ethernet to be used in industrial applications with stringent end-to-end latency and fault-tolerance requirements, replacing vendor-specific real-time solutions in many application areas [7]. TSN is still under development. PRP and HSR, which provide redundancy in the end nodes, offer seamless redundancy with zero recovery time and are therefore suitable for seamless communication. Both are based on the principle of sending duplicated frames over separate physical paths [
8,12]. However, unlike PRP, which requires two independent redundant networks, HSR can be applied to a single network while retaining zero recovery time.
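The MRP behaviour described in this section (a ring manager blocks one ring port, and unblocks it after a failure) can be illustrated with a small connectivity check. The following is only a minimal sketch under stated assumptions: nodes are numbered 0 to n-1 around the ring, the blocked port is modelled as a blocked link, and the names `ring_links`, `active_topology` and `reachable` are hypothetical helpers introduced here, not part of any MRP implementation. Timing and the MRP control frames are not modelled.

```python
from collections import deque


def ring_links(n):
    """All links of an n-node ring (n >= 3), each as a frozenset of two node ids."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}


def active_topology(n, blocked, failed=None, manager_reacted=False):
    """Links that currently forward traffic (illustrative model only).

    blocked: link the ring manager blocks in the fault-free state.
    failed: link that has failed, or None.
    manager_reacted: True once the manager has unblocked its port.
    """
    links = ring_links(n)
    if failed is not None:
        links.discard(failed)
    # The blocked link stays out of the topology unless a failure
    # has occurred and the manager has already reacted to it.
    if not (failed is not None and manager_reacted):
        links.discard(blocked)
    return links


def reachable(links, start=0):
    """Nodes reachable from `start` over the given links (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for link in links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen


N = 6
BLOCKED = frozenset((5, 0))   # assumed position of the manager's blocked port
FAILED = frozenset((2, 3))    # assumed failing link

fault_free = reachable(active_topology(N, BLOCKED))
during = reachable(active_topology(N, BLOCKED, FAILED))
after = reachable(active_topology(N, BLOCKED, FAILED, manager_reacted=True))
print(sorted(fault_free), sorted(during), sorted(after))
```

In this model the ring is split into two isolated lines between the failure and the manager's reaction, which corresponds to the switchover delay discussed above; PRP and HSR avoid that interval by sending duplicated frames instead of reconfiguring the topology.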
HSR was standardized by the International Electrotechnical Commission (IEC) as IEC 62439-3 Clause 5 [8] and is one of the redundancy protocols selected for substation automation in the IEC 61850 standard [13]. HSR is a redundancy protocol for Ethernet that provides zero recovery time in the case of a single component failure. It is based on the duplication of every frame sent in a ring topology: each copy of the frame is injected in a different direction of the ring. In the fault-free state of the network, the destination node receives two identical frames, passes the first to its upper layers and discards the duplicate. If one component fails, such as a link or a node, only one copy of the frame is lost, and the application on the destination node continues undisturbed with the remaining copy. Therefore, even in the case of a node or link failure, communication in the network is not interrupted. This feature makes HSR very useful for time-critical and mission-critical applications such as substation automation systems and automation networks. The HSR principles are described and discussed in [8,12,13,14].
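The duplicate-and-discard principle described above can be sketched for a single ring as follows. This is a simplified illustration, not the IEC 62439-3 procedure: it assumes that duplicates are recognized by a (source, sequence number) pair, keeps an unbounded set of seen pairs where a real node would use a bounded table, and models link failures only. The names `copies_arriving`, `DuplicateFilter` and `deliver` are hypothetical.

```python
def copies_arriving(n, src, dst, failed_links=frozenset()):
    """Number of copies (0, 1 or 2) that reach dst in an n-node ring.

    One copy travels in each direction; a copy is lost if it meets a
    failed link before reaching dst. Assumes src != dst.
    """
    copies = 0
    for step in (+1, -1):              # the two ring directions
        node = src
        while node != dst:
            nxt = (node + step) % n
            if frozenset((node, nxt)) in failed_links:
                break                  # this copy is lost on the failed link
            node = nxt
        else:
            copies += 1                # loop ended because dst was reached
    return copies


class DuplicateFilter:
    """Accepts the first copy of each (source, sequence number) pair."""

    def __init__(self):
        self.seen = set()              # unbounded here; an assumption of this sketch

    def accept(self, src, seq):
        key = (src, seq)
        if key in self.seen:
            return False               # duplicate: discard
        self.seen.add(key)
        return True                    # first copy: pass to upper layers


def deliver(n, src, dst, seq, dup_filter, failed_links=frozenset()):
    """Number of frames passed to the upper layers of dst for one sent frame."""
    delivered = 0
    for _ in range(copies_arriving(n, src, dst, failed_links)):
        if dup_filter.accept(src, seq):
            delivered += 1
    return delivered


flt = DuplicateFilter()
one_failure = {frozenset((0, 1))}
two_failures = {frozenset((0, 1)), frozenset((3, 4))}
print(deliver(5, 0, 2, seq=1, dup_filter=flt))                             # fault-free
print(deliver(5, 0, 2, seq=2, dup_filter=flt, failed_links=one_failure))   # one link down
print(deliver(5, 0, 2, seq=3, dup_filter=flt, failed_links=two_failures))  # both paths cut
```

With no failure or a single link failure, exactly one frame reaches the upper layers without any reconfiguration step, which is the zero-recovery-time property; only when both directions are cut is the frame lost.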
However, HSR is mainly used in ring topologies, including single-ring and connected-ring networks. Standard HSR also generates excessive unnecessary redundant traffic in HSR networks. This drawback degrades network performance and may cause congestion and delay. Several approaches have been proposed to reduce the unnecessary traffic in HSR networks. Nsaif and Rhee [15] proposed an approach called quick removing (QR) to prevent unnecessary traffic from circulating in the rings of HSR networks. Shin and Joe [16] proposed a traffic control (TC) method that removes duplicated and circulated frames from a ring by selecting one of the nodes within the ring as the traffic control node. Abdulsalam and Rhee [17] proposed a method called port locking (PL) to filter unicast traffic for the DANH rings of HSR networks. Hong and Joe [18] introduced a packet transmission scheme with different periods, based on a single-ring topology, to reduce the HSR network traffic load. Tien and Rhee [19] proposed a method of filtering HSR traffic (FHT) that filters unicast traffic not only for DANH rings but also for QuadBox rings. FHT also removes duplicated and circulated traffic in the rings. Ngo et al. [20] implemented the QR algorithm for HSR components to improve the availability of HSR. Tien, Nsaif and Rhee [21] proposed an optimal dual paths (ODP) approach that establishes optimal dual paths for each connection pair in an HSR network, based on the network's topology. The dual paths are then used to forward unicast frames between each connection pair of nodes instead of duplicating and forwarding frames to all parts of the network.
Unlike the approaches mentioned above, which propose algorithms to be implemented in existing HSR components to improve traffic performance in HSR networks, in this paper we develop a new HSR component, called the HSR switching node (SwitchBox), that allows HSR to be applied to any network topology and significantly reduces unnecessary redundant network traffic compared to standard HSR. With the new SwitchBox nodes, HSR can provide seamless redundancy in networks of any topology while improving network traffic performance.
The rest of this paper is organized as follows: in Section 2, we briefly introduce the standard HSR protocol and existing HSR nodes. Next, in Section 3, we describe the proposed SwitchBox component for HSR. In Section 4, the traffic performance of HSR with SwitchBox is analyzed, evaluated and compared to that of the standard HSR protocol. Section 5 describes several simulations and their results to evaluate and validate the performance analysis of HSR with SwitchBox. Finally, we provide our conclusions and suggestions for future work in Section 6.