
    H3C S12500 High Availability and High Reliability Technology White Paper

Keywords: HA, high reliability

Abstract: High availability (HA) is a mandatory feature of a carrier-class device and the foundation of high reliability. When a fault occurs, HA features restore normal system operation rapidly and correctly, reducing the mean time to repair (MTTR). Network HA means maximizing system uptime at a limited cost, thereby minimizing the losses caused by fault-induced service interruption.

    Acronyms:

    Acronym Full Spelling

    BFD Bidirectional Forwarding Detection

    DLDP Device Link Detection Protocol

    FIB Forwarding Information Base

    FRR Fast ReRoute

    GR Graceful Restart

    HA High Availability

    LACP Link Aggregation Control Protocol

    MPLS Multiprotocol Label Switching

    MTBF Mean Time Between Failures

    MTTR Mean Time To Repair

NSF Non-Stop Forwarding

    OAA Open Application Architecture

    OAM Operations Administration and Maintenance

    OAP Open Application Platform

    RPR Resilient Packet Ring

    RRPP Rapid Ring Protection Protocol

    VPN Virtual Private Network

    VRRP Virtual Router Redundancy Protocol


    Table of Contents

Overview
  Background
  Benefits
Implementation of Reliability Technology of System Architecture
  Introduction to Four Boards and Four Planes in One Device
  Reliability of Control Plane
    Advantages of Dual MPUs
    Running Mechanism of Dual MPUs
    Introduction to Hot Backup
  Reliability of Forwarding Plane
  Reliability of Detection Plane
  Reliability of Support Plane
    Reliability Technology of Power Supply
    Reliability Technology of Fan
    Hot Swapping Technology
HA of Services
  Introduction to NSF Technology
  High Reliability Technology of Link
    Link Aggregation Technology
    RRPP Technology
    Smart Link Technology
    DLDP Technology
  High Reliability Technology of Network
    VRRP
    Equivalent Route
    BFD
    IP FRR
    MPLS TE FRR
HA of Software Maintenance (Software Hot Patch on Line)


    Overview

Background

As a mandatory feature of a carrier-class device, the HA feature helps rapidly and properly restore system operation in the case of a fault, thus reducing MTTR and increasing the mean time between failures (MTBF).

As data traffic volumes and QoS requirements grow, HA has become one of the most important features of a high-performance network. Network HA means maximizing system uptime at a limited cost, thereby minimizing the losses caused by fault-induced service interruption. A network featuring HA should reduce hardware and software faults and back up important resources. Once the system detects that a fault may occur, it rapidly transfers the affected tasks to the backup resources so that services continue.

Benefits

Built upon a non-blocking Clos switching architecture with high performance and scalability, H3C's new-generation S12500 series routing switches are free from single points of failure in architecture, hardware, software, and protocol, meeting the carrier-class reliability requirement of 99.999%.

    The S12500 series switches support diversified HA and high reliability features:

    System architecture reliability

• Four boards and four planes in one device: The four boards are the main processing unit (MPU), the switching fabric module, the line processing unit (LPU), and the open application architecture (OAA) board. The four planes are the support plane, detection plane, control plane, and forwarding plane. The control plane is physically separate from the forwarding plane.

• Reliability of the control plane: The device supports dual MPUs for redundancy and rapid active/standby switchover.

• Reliability of the forwarding plane: The advanced Clos non-blocking switch fabric architecture supports multiple independent switching network boards and N+M redundant backup of switching capacity.

• Reliability of the detection plane: The device supports redundant backup of dual detection planes that are separate from each other. The detection plane rapidly detects various network protocols and switches over services.

• Reliability of the support plane: The device supports redundant backup of power supplies, fans, storage devices, and clocks, as well as detection and control of the power supply and fan systems, reporting an alarm if any failure is detected. Furthermore, all boards support hot swapping.

    HA of services

• Non-stop forwarding technology: NSF, GR

• Link high-reliability technology: DLDP, Smart Link, RRPP, link aggregation

• Network high-reliability technology: VRRP, ECMP, dynamic routing fast convergence, BFD for VRRP/BGP/IS-IS/OSPF/RSVP, TE FRR/IP FRR


    HA of software maintenance

• Supporting hot patch

• Supporting online upgrade

Implementation of Reliability Technology of System Architecture

Introduction to Four Boards and Four Planes in One Device

Figure 1 System architecture

As shown in Figure 1, the S12500 is built upon a carrier-class high-reliability architecture with four boards and four planes.

• The control plane consists of the CPU system of the MPU, the CPU systems of the LPUs, and the management channels on the backplane. As the control core of the switch, the control plane carries out protocol operation, routing table maintenance, device management, and operations, administration and maintenance.

• Consisting of switching units, forwarding engines, and data channels on the backplane, the forwarding plane implements service processing and data forwarding, including Ethernet L2 forwarding, ACL/QoS, IP forwarding, MPLS VPN, and multicast.

• Consisting of the CPUs of the MPUs, a dedicated OAM detection engine, and OAM channels, the detection plane implements fast detection of various network protocols and fast service switchover, for example, RRPP and BFD for BGP/IS-IS/OSPF/RSVP/VPLS PW/VRRP. It can achieve 50 ms service switchover.

• Consisting of the control system and the control channel of each module, the support plane implements such functions as detection and control of the power supply and fan systems, and alarm reporting.

The four planes are independent of one another and do not affect each other. System reliability reaches 99.999%.

Reliability of Control Plane

The S12500 series switches support active and standby MPUs, an important embodiment of high-reliability design. As the core of the control plane, the active MPU communicates with external systems and runs the normal functions of each module, while the standby MPU backs up the active MPU without handling external communication. When the active MPU fails, the switch automatically performs an active/standby switchover, and the standby MPU takes over the active MPU's tasks to keep services running normally.

    Advantages of Dual MPUs

Compared with a single MPU, dual MPUs converge much faster. With dual MPUs, the standby MPU has already loaded the image file and initialized its configuration. During an active/standby switchover, the LPUs do not need to re-register, and L2 and L3 interfaces do not flap. In addition, the standby MPU has already backed up the forwarding entries, so forwarding continues immediately and service interruption is avoided.

    Running Mechanism of Dual MPUs

With dual MPUs, the status of each MPU is determined by hardware at startup. Generally, the device selects the MPU in the lower-numbered slot as the active MPU, while the hardware delays the startup of the MPU in the higher-numbered slot so that it boots later.

Upon initial startup, both MPUs are in standby state and start their software independently; in general, the MPU in the lower-numbered slot becomes the active MPU. If the active MPU fails, the MPU in the higher-numbered slot becomes active. Even if the MPU in the higher-numbered slot was active before a device restart, the MPU in the lower-numbered slot becomes the active MPU again after the restart.

    Introduction to Hot Backup

    Hot backup process

Hot backup of the active and standby MPUs is divided into three stages: batch backup, real-time backup, and smooth data migration.

• After the standby MPU starts, its data differs greatly from the active MPU's, so the active MPU synchronizes the current data of all modules to the standby MPU in one pass. This process is called batch backup.


• After batch backup ends, the system backs up data in real time: every change to the backup data on the active MPU is immediately mirrored to the standby MPU.

• After an active/standby switchover, the standby MPU becomes the active MPU and notifies each module to collect data from the LPUs and synchronize with them. This process is called smooth data migration.

While receiving real-time backup data, the standby MPU is promoted to active MPU upon detecting a switchover notification. The notification is triggered by a hardware interrupt, so the hardware switchover completes within milliseconds. After the hardware switchover, the state machine of the new active MPU enters the smooth state to reconcile data.
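The staged backup described above can be summarized in a short sketch. The following Python model is purely illustrative of the three stages; the class, method names, and data layout are hypothetical, not H3C's implementation:

```python
# Illustrative sketch of the three hot-backup stages: batch backup,
# real-time backup, and smooth data migration. Hypothetical model.

class StandbyMPU:
    def __init__(self):
        self.backup = {}           # mirrored control-plane data
        self.state = "standby"

    def batch_backup(self, active_data):
        # Stage 1: copy the active MPU's current data wholesale,
        # because the standby starts with nothing.
        self.backup = dict(active_data)

    def realtime_backup(self, key, value):
        # Stage 2: every subsequent change on the active MPU is
        # mirrored immediately.
        self.backup[key] = value

    def switchover(self, lpu_tables):
        # Stage 3: on switchover, become active and smooth the
        # mirrored data against what the LPUs actually hold.
        self.state = "active"
        for key, value in lpu_tables.items():
            self.backup.setdefault(key, value)
        return self.backup

standby = StandbyMPU()
standby.batch_backup({"fib:10.0.0.0/8": "port1"})
standby.realtime_backup("fib:192.168.0.0/16", "port2")
print(standby.switchover({"fib:172.16.0.0/12": "port3"}))
```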

    Triggers of active/standby switchover

    Triggers of active/standby switchover are classified into the following:

• The active/standby switchover command is executed at the command line to force a switchover.

• A function of the active MPU fails.

• The active MPU is hard reset or removed manually.

• Software on the active MPU restarts abnormally. For example, a module occupies the CPU for an excessively long time and the hardware watchdog initiates a restart, or improper data or command access leads to a restart.

Whichever trigger causes the switchover, the standby MPU is notified by a hardware interrupt, and the status switchover completes within milliseconds.

Reliability of Forwarding Plane

Unlike previous switches, the S12500 series does not integrate the forwarding plane on the MPU; it resides on independent switching fabric modules instead. The control plane is therefore physically separated from the forwarding plane, which reduces the impact of the control plane on forwarding and keeps data switching unaffected by MPU switchover.

Built upon an innovative Clos network architecture, the S12500 series switches support non-blocking switching. In the S12500 series, the LPUs form the first and third stages of the Clos network, and the switching network cards on the switching fabric modules form the middle stage, each containing one or more data planes.


    Figure 2 Clos interactive forwarding architecture

The most distinctive feature of the S12500's forwarding plane is N+M redundant backup of switching fabric modules: the installed switching capacity can exceed what current forwarding requires. When a switching fabric module fails, the link hardware on the backplane detects and avoids the failure, so data forwarding between LPUs is redirected to the other switching network boards instead of the faulty link. This minimizes the impact of switching fabric hardware failures on services.

Reliability of Detection Plane

The S12500 series switches provide an independent detection plane with a dedicated OAM engine for rapid BFD and RRPP detection that does not contend with other services, guaranteeing reliable fast detection. More importantly, the OAM engine of the S12500 series is located on the MPU, which greatly enhances the performance of rapid active/standby switchover.

    Additionally, the S12500 series switches support online service detection. When the forwarding function of the S12500 fails, the detection plane reports an alarm, which alerts users to isolate the faulty part in advance.

    Reliability of Support Plane

    Reliability Technology of Power Supply

Because the power supply underpins device operation, redundant power supplies are necessary, and a high-end device requires even higher reliability. To guarantee stable power input, the industry commonly employs 1+1 redundant backup. Using a multi-module power supply, the S12500 supports X+N redundant backup, and users can set the desired redundancy with a command. Furthermore, the MBUS controls the order in which boards are powered on, avoiding the impact on the system of many components powering on at the same instant.


The S12500 series switches support independent power supply management and monitoring of the power status of boards, including voltage, current, and POLA module temperature monitoring. Users can run a query command at any time to check the system load and the details of each power module.

Assuming that N power modules are actually installed, the effects of configuration changes are listed in Table 1:

Table 1 Configuration changes and corresponding effects

Configuration change | Effects

No configuration | When there is no alarm, the system can still supply power normally when one power module fails.

Set the number of redundant modules to 0 | The actual power supply capability is unchanged, but more line cards can be inserted without causing an alarm.

Change from zero configuration to n (n>1) redundant modules | The redundancy capability is enhanced. When the system load exceeds the power supply capability of N-n power modules, an alarm is generated. If there is no alarm, the system can still supply power normally when n power modules fail.

Insert one power module | The actual power supply capability is enhanced without changing the redundancy capability. More line cards can be inserted without causing an alarm.

Remove one power module while the system load is within the redundancy capability | The actual power supply capability is degraded without changing the redundancy capability.

Remove one power module while the system load is beyond the redundancy capability | Both the actual power supply capability and the redundancy capability are degraded, causing an alarm.

One power module fails | An alarm is generated and the actual power supply capability is degraded.

Insert one LPU so that the system load exceeds the redundancy capability | An alarm is generated.

Disable power supply management | No alarm is generated. When the system load exceeds the power supply capability, the whole chassis may be powered off.
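The alarm behavior in Table 1 reduces to a comparison between the system load and the capacity of the non-redundant modules. A minimal sketch, assuming a hypothetical per-module wattage (the real figure depends on the module model):

```python
# Sketch of X+N power-redundancy alarm logic as described in Table 1.
# The per-module wattage is an assumed figure for illustration.

MODULE_WATTS = 1600          # assumed capacity of one power module

def power_alarm(installed, redundant, load_watts):
    """Alarm if the load exceeds what (installed - redundant) modules
    can deliver, i.e. if a redundant module would be needed just to
    carry the normal load."""
    usable = (installed - redundant) * MODULE_WATTS
    return load_watts > usable

# N = 4 modules installed, n = 1 configured as redundant:
print(power_alarm(4, 1, 4500))   # False: 3 modules cover the load
print(power_alarm(4, 1, 5000))   # True: load eats into the redundancy
```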

    Reliability Technology of Fan

As an important means of heat dissipation, the fans are directly linked to stable device operation. If a fan fails and heat cannot be dissipated in time, the temperature inside the device rises and may burn chips and boards, so fan redundancy is also essential. The S12500 series switches provide two fan trays in 1+1 redundant backup, so a fan tray can be replaced online without affecting normal operation of the device.


The S12500 series switches support ambient temperature detection and fan speed adjustment based on device temperature, saving energy while guaranteeing timely heat dissipation for the boards. When the temperature of a board is excessively high, the support plane reports an alarm and powers the board off in time.

Moreover, when fan hardware fails, the S12500 reports an alarm notifying users to replace the faulty hardware.

    Hot Swapping Technology

Hot swapping refers to inserting or removing a component or board while the device is running, without affecting the other components or the services carried on the other boards. The hot swapping function includes:

• Inserting a board into or removing it from the chassis without affecting the boards in use.

• Replacing a board online: when users remove a board and insert another (or re-insert the removed board), the new board inherits the original configuration without affecting the operation of other boards.

    All components in the S12500 series switches support hot swapping, including MPUs, power modules, fans, and various LPUs. With the hot swapping function, users can maintain and update components, deploy more services, and provide more functions, without interrupting services.

    HA of Services

Introduction to NSF Technology

Figure 3 NSF schematic diagram (during switchover, the standby MPU keeps the protocol sessions while the FIB entries across the switching plane continue forwarding)


As an important HA technology on the service plane, NSF ensures non-stop data forwarding when the control plane of the switch fails, for example, during a fault-triggered restart or routing oscillation, thus shielding the network's traffic streams from impact. To support NSF, a device must meet two requirements:

• The device adopts a distributed architecture, with data forwarding separated from control, and supports dual MPUs. When an active/standby switchover takes place, the standby MPU must have preserved the IP/MPLS forwarding entries (forwarding plane).

• The status of some protocols (control plane) can be saved.

For OSPF, IS-IS, BGP, LDP, and other complicated protocols, completely backing up the control-plane state is costly or even impossible. Instead, by partially backing up protocol state (or not backing it up at all) and relying on the help of adjacent devices, session connections on the control plane are not reset during an active/standby switchover, so forwarding is not interrupted.

    Figure 4 GR schematic diagram

The technology that avoids resetting the control plane is called graceful restart (GR) of routing protocols; it means forwarding is not interrupted while routing protocols restart.

The core of the GR mechanism is that when the routing protocol of a device restarts, the device informs its neighbors to keep the neighbor relationship and the routes to the device stable for a certain period. After the routing protocol finishes restarting, the neighbors help the device synchronize routing information and restore it to the pre-restart state in the shortest time. During the entire protocol restart, network routes and forwarding remain highly stable, the packet forwarding path does not change in any way, and the whole system forwards IP packets continuously.

The S12500 series switches support GR for OSPF/BGP/IS-IS/LDP/RSVP. During an MPU active/standby switchover, the peer device maintains the protocol neighbor relationship with the local device, thus avoiding network oscillation and guaranteeing network stability.
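The helper-side behavior of GR can be sketched as follows. This is an illustrative model only; the grace-period value and all names are assumptions, not protocol-accurate details:

```python
# Illustrative GR helper behavior: when a neighbor signals a restart,
# keep its routes and adjacency for a grace period instead of
# flushing them. Hypothetical model, not a protocol implementation.

import time

class GRHelper:
    GRACE_PERIOD = 120.0                 # seconds, assumed value

    def __init__(self):
        self.routes = {"10.1.0.0/16": "peer"}
        self.restart_deadline = None

    def on_restart_signal(self):
        # Peer announced a graceful restart: freeze its routes.
        self.restart_deadline = time.time() + self.GRACE_PERIOD

    def on_timer(self):
        # Flush the peer's routes only if it never came back.
        if self.restart_deadline and time.time() > self.restart_deadline:
            self.routes.clear()

    def on_resync_complete(self):
        # Peer re-established the session and re-sent its routes.
        self.restart_deadline = None

helper = GRHelper()
helper.on_restart_signal()       # peer's MPU switchover begins
helper.on_timer()                # within the grace period: routes kept
print(helper.routes)             # forwarding continues undisturbed
```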


    High Reliability Technology of Link

    Link Aggregation Technology

Link aggregation is also called trunking or bonding. Its essence is that multiple physical links between two devices are combined into one logical data channel, called an aggregated link, as shown in Figure 5, where two physical links between the switches form one aggregated link. Logically the aggregated link is a single entity; its internal composition and transmission details are transparent to upper-level services.

    Figure 5 Schematic diagram of link aggregation

The physical links within the aggregation jointly carry out data transmission and reception, and back each other up. As long as the aggregation has one normal member, the whole transmission link does not fail. As shown in Figure 6, if Link 1 fails, its traffic is rapidly transferred to Link 2 and the data streams between the two switches are not interrupted.

    Figure 6 Mutual backup of link aggregation members

    The S12500 series switches support manual aggregation, dynamic aggregation, intra-board aggregation, and inter-board aggregation.
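The mutual-backup behavior of Figure 6 can be sketched as flow hashing over the member links. The hash fields and member names below are illustrative assumptions; real hardware offers several configurable hash modes:

```python
# Sketch of flow hashing over an aggregated link, with failover when
# a member link fails. Hash fields and names are assumptions.

import zlib

members = ["link1", "link2", "link3"]

def pick_member(src_ip, dst_ip, active):
    # Hash the flow onto one active member so that packets of the
    # same flow stay in order on one physical link.
    key = f"{src_ip}-{dst_ip}".encode()
    return active[zlib.crc32(key) % len(active)]

flow = ("10.0.0.1", "10.0.0.2")
chosen = pick_member(*flow, members)
print("flow carried on", chosen)

members.remove(chosen)                 # that member link fails
print("flow moved to", pick_member(*flow, members))
```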

    RRPP Technology

RRPP is a link layer protocol dedicated to Ethernet rings. It prevents the broadcast storms that data loops would cause on an intact ring, and when a link on the ring is cut off, it rapidly restores the communication paths between the nodes on the ring network. Most MANs and enterprise networks use ring topologies for high reliability; however, the failure of any node on the ring may affect services.

Currently, STP and RRPP are the common technologies for solving loop problems on L2 networks. STP is relatively mature, but its convergence takes on the order of seconds. As a link layer protocol dedicated to Ethernet rings, RRPP converges faster than STP, and its convergence time is independent of the number of nodes on the ring, so RRPP suits networks with large diameters.

    The S12500 supports RRPP multiple instances and establishment of multiple RRPP networks, thus meeting the flexibility requirements of networking.


Figure 7 RRPP networking (Domain 1, Ring 1: one master node and three transit nodes among Devices A through D)

The polling mechanism is the means by which the master node on an RRPP ring actively checks the health of the ring network.

The master node regularly sends Hello packets from its primary port, and these packets travel around the ring through each transit node in turn. If the ring is healthy, the secondary port of the master node receives the Hello packets before the timer expires, and the master node keeps the secondary port blocked. If the ring is cut somewhere, the secondary port of the master node does not receive the Hello packets before the timer expires; the master node then unblocks the data VLANs on its secondary port and sends a Common-Flush-FDB packet to notify all transit nodes to update their MAC entries and ARP/ND entries.

When a transit node, an edge node, or an auxiliary edge node discovers that one of its ports in the RRPP domain is down, it immediately sends a Link-Down packet to the master node. After receiving the Link-Down packet, the master node unblocks the data VLANs on its secondary port and sends a Common-Flush-FDB packet to notify all transit nodes, edge nodes, and auxiliary edge nodes to update their MAC entries and ARP/ND entries. After each node updates its entries, data streams are switched to the normal links.

    In addition, RRPP can be configured on an aggregation group and link reliability is guaranteed by aggregation and RRPP.
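The polling mechanism can be sketched as a small state machine on the master node. The timer value and names below are illustrative assumptions, not RRPP defaults:

```python
# Sketch of the RRPP master-node polling mechanism: the master keeps
# its secondary port blocked while Hellos keep returning, and
# unblocks it (plus flushes FDBs) when the Hello timer expires.

class RRPPMaster:
    FAIL_TIMEOUT = 3.0               # assumed Hello fail timer, seconds

    def __init__(self):
        self.secondary_blocked = True
        self.last_hello = 0.0

    def on_hello_received(self, now):
        # A Hello made it all the way around: the ring is healthy.
        self.last_hello = now
        if not self.secondary_blocked:
            # Ring healed: re-block the secondary port to break the
            # loop, and flush so nodes relearn the restored path.
            self.secondary_blocked = True
            self.flush_fdb()

    def on_timer(self, now):
        if now - self.last_hello > self.FAIL_TIMEOUT:
            # Ring broken: open the backup path through the secondary
            # port and tell the other nodes to relearn addresses.
            self.secondary_blocked = False
            self.flush_fdb()

    def flush_fdb(self):
        print("Common-Flush-FDB sent: nodes update MAC and ARP/ND entries")

master = RRPPMaster()
master.on_hello_received(0.0)        # ring healthy, secondary stays blocked
master.on_timer(5.0)                 # no Hello for 5 s -> unblock secondary
print("secondary blocked:", master.secondary_blocked)
```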

    Smart Link Technology

A Smart Link group, also called a flexible link group, contains exactly two ports: one active and one standby. Normally only the active port forwards traffic while the other is blocked in standby state. When the link of the active port fails (for example, the port goes down or OAM detects a unidirectional link), the Smart Link group automatically blocks that port and switches the standby port to active. In addition, Smart Link can be configured on an aggregation group, so link reliability is guaranteed by both aggregation and Smart Link.

Smart Link meets the requirement for rapid link convergence, provides active/standby link redundancy, and migrates traffic between the active and standby links rapidly. In a network with two uplinks, when the active link fails, the device automatically switches traffic to the standby link, thus backing up the links for redundancy. The main characteristics are as follows:

• Dedicated to dual-uplink networking

• Rapid convergence (sub-second)

• Simple configuration, which facilitates user operation

When a Smart Link switchover occurs, the MAC address forwarding entries and ARP/ND entries on the other devices in the network may be stale. To deliver packets properly, a mechanism for updating these entries is needed. Currently, two update mechanisms are available:

• Automatic update of MAC address forwarding entries and ARP/ND entries by traffic. This mode applies to interconnection with devices (including those of other vendors) that do not support Smart Link, and it must be triggered by upstream traffic.

• The Smart Link device sends Flush packets over the new link. This mode requires the upstream devices to recognize Smart Link Flush packets and update their MAC address forwarding entries and ARP/ND entries accordingly.

When the original active link recovers from a failure, its port stays in standby state rather than switching back, thus keeping traffic stable. The port becomes active again only after the next link switchover.

    Smart Link supports multiple instances. In different Smart Link instances, one port can assume different roles. For example, in instance 1, a port is an active port, while in instance 2, the port is a standby port. In this case, traffic load of different instances can be balanced between ports.
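The single-group switchover behavior described above can be sketched as follows; the port names follow the figures, and the model itself is hypothetical:

```python
# Sketch of a two-port Smart Link group: one active, one standby,
# switch on failure, and (per the text) no preemptive switch-back
# when the old active link recovers. Hypothetical model.

class SmartLinkGroup:
    def __init__(self, port_a, port_b):
        self.active, self.standby = port_a, port_b

    def on_link_down(self, port):
        if port == self.active:
            # Standby takes over; send Flush so upstream devices
            # relearn their MAC and ARP/ND entries over the new link.
            self.active, self.standby = self.standby, self.active
            print(f"Flush sent on {self.active}")

    def on_link_up(self, port):
        # A recovered link stays standby until the next failure,
        # which keeps traffic stable.
        pass

group = SmartLinkGroup("GE2/0/1", "GE2/0/2")
group.on_link_down("GE2/0/1")
print("active:", group.active)    # GE2/0/2
group.on_link_up("GE2/0/1")
print("active:", group.active)    # still GE2/0/2: no switch-back
```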

    DLDP Technology

A special phenomenon, the unidirectional link, occurs in real networks: the local end can receive the packets sent by the peer end over the link layer, but the peer end cannot receive the packets sent by the local end. Unidirectional links cause a series of problems, such as loops in a spanning tree topology.

Take fiber as an example. Unidirectional links fall into two cases: the fibers are cross-connected, or one fiber is disconnected or broken. As shown in Figure 8, crossed fibers are fibers connected in reverse. In Figure 9, the hollow lines indicate that one fiber is not connected or is broken.


Figure 8 Cross-connected fibers (ports GE2/0/1 and GE2/0/2 on Device A and Device B)

Figure 9 One disconnected fiber or one broken fiber (ports GE2/0/1 and GE2/0/2 on Device A and Device B)

DLDP monitors the link status of fiber or copper twisted-pair links. If a unidirectional link exists, DLDP automatically disables the related port, or notifies users to disable it manually, depending on the user configuration, to prevent network problems.

DLDP is a link layer protocol that works with the physical layer protocols to monitor device link status. The auto-negotiation mechanism at the physical layer detects physical signals and faults; DLDP identifies peer devices, detects unidirectional links, and disables unreachable ports. Together, DLDP and physical layer auto-negotiation detect and shut down physical and logical unidirectional connections. Even when the links at both the local and remote ends work properly at the physical layer, DLDP checks whether the links are correctly connected and whether the two ends can exchange packets properly at the link layer; auto-negotiation alone cannot perform this check.

    DLDP has the following two working modes:


• Common mode: Once the aging timer of a neighbor expires, the neighbor entry is deleted and one Advertisement packet carrying an RSY tag is sent.

• Enhanced mode: Once the aging timer of a neighbor expires, the enhanced timer is started and one Probe packet is sent every second to actively probe the neighbor, for eight consecutive Probe packets. If no Echo packet is received from the neighbor when the Echo wait timer expires, the port is disabled.

The enhanced mode aims to detect network black holes and prevent the situation where one end is up while the other is down. When a port is set to a forced rate and forced full duplex, some devices may encounter the case shown in Figure 10: port B has gone down, but a common link layer protocol cannot detect this, so port A remains up. In enhanced mode, port A initiates probing after the neighbor's aging timer expires. If port A receives no Echo packet from port B before the Echo wait timer expires, port A is disabled. Because the physical status of port B is down, the DLDP status of port B is inactive.

    Figure 10 Enhanced DLDP mode

• In DLDP common mode, the system can identify only one type of unidirectional link: cross-connected fibers.

• In DLDP enhanced mode, the system can identify both types of unidirectional links: cross-connected fibers, and one disconnected or broken fiber. To detect the latter type, a port must be set to a forced rate and forced full duplex; otherwise DLDP is ineffective even if enabled. When the latter type occurs, the port that still receives an optical signal is disabled, while the port that receives no optical signal becomes inactive.
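The enhanced-mode probe sequence can be sketched as follows. The eight-probe count and one-second interval follow the text; everything else is an illustrative assumption:

```python
# Sketch of DLDP enhanced mode: after a neighbor ages out, send eight
# Probe packets at one-second intervals and disable the port if no
# Echo arrives before the Echo wait timer expires. Hypothetical model.

def enhanced_probe(neighbor_answers):
    """neighbor_answers: callable returning True if an Echo came
    back before the Echo wait timer expired."""
    for attempt in range(8):          # eight consecutive Probes
        print(f"Probe {attempt + 1} sent")
    if neighbor_answers():
        return "port stays up"
    # No Echo: unidirectional link suspected -> shut the port down.
    return "port disabled (DLDP down)"

print(enhanced_probe(lambda: False))  # dead peer -> port disabled
```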

High Reliability Technology of Network

Network reliability technologies include the following:

• VRRP

• Equivalent route

• BFD

• FRR, including IP FRR and MPLS TE FRR

    VRRP

As shown in Figure 11, all hosts in the same network segment are configured with a default route whose next hop is the gateway. Packets sent by a host to other network segments follow the default route to the gateway, which forwards them, enabling communication between the host and the external network. When the gateway fails, all hosts that use it as the default gateway lose communication with the external network.


    Figure 11 Common LAN networking

The default route simplifies host configuration, but it places high demands on the stability of the default gateway. Adding egress gateways is a common way to improve system reliability, but then selecting a route among the multiple egresses becomes an open issue.

VRRP is a fault-tolerance protocol that solves the preceding problems well by separating logical devices from physical devices. On a LAN with multicast or broadcast capability, such as Ethernet, VRRP still provides a highly reliable default link when a device fails, avoiding the network interruption that a single link fault would otherwise cause, and requiring no changes to dynamic routing protocols, route discovery protocols, or other configuration.

VRRP organizes a group of routers on the LAN into a VRRP group consisting of one master router and multiple backup routers, which together function as a single virtual router. Figure 12 shows typical VRRP networking.

Figure 12 VRRP typical networking scheme (Hosts A, B, and C reach the network through the virtual router formed by Switches A, B, and C)

The S12500 series switches support VRRP with up to 256 VRRP groups, and BFD for VRRP high-speed detection achieves 30 ms fault detection and 50 ms service switchover.
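VRRP mastership within a group can be sketched as a priority election among the surviving routers. The priorities and addresses below are made-up example values:

```python
# Sketch of VRRP master election within one VRRP group: the highest
# priority wins, and a backup takes over when the master disappears.
# Names, priorities, and the virtual IP are example values.

routers = {"SwitchA": 120, "SwitchB": 100, "SwitchC": 90}
virtual_ip = "10.1.1.254"              # the hosts' default gateway

def elect_master(alive):
    # The virtual IP is always answered by the current master, so
    # hosts never need to change their default-gateway setting.
    return max(alive, key=alive.get)

print(elect_master(routers))           # SwitchA is master

del routers["SwitchA"]                 # master fails
print(elect_master(routers))           # SwitchB takes over 10.1.1.254
```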


    Equivalent Route

    Figure 13 Equivalent route implementation mechanism

The S12500 series switches support equal-cost multi-path routing (ECMP). Each route supports eight equal-cost paths for load balancing IP or MPLS traffic, with hash-based load balancing of traffic flows. ECMP minimizes packet reordering, and when a path fails, traffic is rapidly switched to the remaining active links, guaranteeing service reliability.
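Per-flow hashing is what lets ECMP balance load while minimizing reordering, since all packets of a flow map to one path. A minimal sketch with assumed hash fields:

```python
# Sketch of per-flow hashing over up to eight equal-cost next hops.
# The hash-field choice and path names are assumptions.

import zlib

next_hops = [f"path{i}" for i in range(8)]   # 8 equivalent paths

def ecmp_select(src, dst, proto, paths):
    # All packets of one flow hash to the same path, so the flow's
    # packets are never interleaved across links.
    key = f"{src}|{dst}|{proto}".encode()
    return paths[zlib.crc32(key) % len(paths)]

print(ecmp_select("10.0.0.1", "20.0.0.9", "tcp", next_hops))

# If one path fails, only the flows hashed to it are redistributed:
surviving = [p for p in next_hops if p != "path3"]
print(ecmp_select("10.0.0.1", "20.0.0.9", "tcp", surviving))
```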

    BFD

BFD is a network-wide, unified detection mechanism for rapidly detecting and monitoring the connectivity of network links or IP forwarding paths. To improve network performance, adjacent protocols must be able to detect a communication fault quickly so that a backup channel can be established promptly to restore communication.

• Defined by the IETF, BFD rapidly detects node and link faults. By default, the handshake time is 10 ms. BFD provides lightweight, short-duration detection, can monitor any medium and any protocol layer in real time, and offers a wide range of detection times and overheads.

• BFD can detect faults on any type of channel between systems, including direct physical links, tunnels, MPLS LSPs, multi-hop routing channels, and indirect channels.

• BFD detection results can be applied to IGP fast convergence and FRR.

• The BFD protocol has been accepted and recognized by the industry and is widely deployed.


    Figure 14 BFD implementation mechanism

The S12500 series switches fully support BFD for VRRP/BGP/IS-IS/OSPF/RSVP/VPLS PW/static routing. On top of the dual planes (control plane and forwarding plane) of a traditional core switch, the S12500 adds a unique detection plane that monitors network faults, enabling 30 ms fault detection and 50 ms service switchover so that services are not interrupted. The detection plane is independent of the control and forwarding planes, and none of them affects the others, providing users with carrier-class equipment and network reliability. Testing shows that the BFD switchover time of the S12500 is under 50 ms.
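The 30 ms detection figure is consistent with the standard BFD detection-time rule (RFC 5880): a session declares a failure after detect-mult consecutive packets are missed. A small sketch; the interval values are examples, not S12500 defaults:

```python
# BFD detection time per RFC 5880:
#   detection_time = detect_mult * negotiated_rx_interval
# where each side uses the larger of its required RX interval and
# the peer's desired TX interval.

def bfd_detection_time_ms(detect_mult, local_rx_ms, remote_tx_ms):
    return detect_mult * max(local_rx_ms, remote_tx_ms)

# 10 ms intervals with a multiplier of 3 -> 30 ms fault detection,
# matching the 30 ms figure quoted for the S12500 detection plane.
print(bfd_detection_time_ms(3, 10, 10))   # 30
```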

    IP FRR

Traffic interrupted by a link or node fault on the network is restored only after routes re-converge on the new topology. In the interval between interruption and restoration, packets that would reach their destination through the faulty part are lost or loop. The route convergence process consists of the following parts:

1) Fault detection time

2) Re-propagation time of the routing information (including the time to generate and propagate LSAs/LSPs)

3) Route calculation time (including the time for route calculation over the LSDB after the change)

4) Route delivery time (including inter-board synchronization of FIB entries and the time for delivery to the driver)

A number of new technologies now accelerate routing protocol convergence: BFD shortens fault detection time, Fast Flood reduces the time to re-propagate routing information, and ISPF and PRC cut route calculation time. As a result, route convergence is much faster; with 10,000 routes, the traffic interruption caused by a network fault can be kept within one second.

However, voice, video, and other new network services impose more stringent requirements on traffic interruption time. Many carriers want to limit fault-induced traffic interruption to 50 ms or less, a requirement that traditional routing protocol fast-convergence technologies cannot satisfy.

The new approach under research for meeting such a requirement is to calculate a backup route in advance. When a router detects a fault, it does not immediately disseminate routing information or recalculate routes; instead, it replaces the failed route with the backup route to repair the fault locally. While the whole network completes re-convergence, the pre-determined backup route carries the forwarding. Traffic interruption time, now reduced to the sum of the time to detect the adjacent fault and the time to replace the failed route with the backup route, is thus greatly shortened. This technology of using a locally preset repair path to protect a failed link or router is called IP FRR.

Figure 15 IP FRR (Switch B reaches Switch E through Switch D; when that link fails, Switch C serves as the backup path)

The basic principle of IP FRR is shown in Figure 15. Normally, the routing table of Switch B specifies that packets destined for Switch E are forwarded through Switch D. A backup path is also installed in the routing table of Switch B: packets destined for Switch E can be forwarded through Switch C. On detecting a link fault between Switch B and Switch D, Switch B forwards the packets destined for Switch E to the backup next hop, Switch C.
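The local-repair behavior of Figure 15 can be sketched as a FIB whose entries carry a precomputed backup next hop; the data layout is hypothetical:

```python
# Sketch of IP FRR: each prefix carries a precomputed backup next
# hop, so the repair is a local swap rather than waiting for routing
# to re-converge. Topology names follow Figure 15.

fib = {
    # destination: (primary next hop, precomputed backup next hop)
    "SwitchE": ("SwitchD", "SwitchC"),
}

def forward(dest, failed_next_hops):
    primary, backup = fib[dest]
    # If the primary next hop is unreachable, switch locally to the
    # backup; no LSA flooding or SPF run is needed first.
    return backup if primary in failed_next_hops else primary

print(forward("SwitchE", set()))           # SwitchD (normal path)
print(forward("SwitchE", {"SwitchD"}))     # SwitchC (local repair)
```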

    MPLS TE FRR

MPLS TE FRR is a mechanism for link protection and node protection. When an LSP link or node fails, traffic is switched at the node upstream of the failure onto a tunnel that bypasses the protected link or node, so data transmission is not interrupted. Meanwhile, the head node can initiate re-establishment of the main path without affecting data transmission.

    The basic principle of MPLS TE FRR is to use a pre-established LSP to protect one or more LSPs. The pre-established LSP is called FRR LSP and the protected LSP is called main LSP. The ultimate goal of MPLS TE FRR is to use the bypass tunnel to bypass the faulty link or node, thus protecting the main path.

MPLS TE FRR is implemented on the basis of RSVP-TE. There are two modes for implementing FRR:

• Detour (one-to-one backup): a protection path is created for each protected LSP. The protection path is called a Detour LSP.

• Bypass (facility backup): one protection path protects multiple LSPs. The protection path is called a Bypass LSP.

Because the Detour mode creates a protection path per LSP, it incurs relatively high overhead; in practice, the Bypass mode is more widely used.
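The scaling difference between the two modes can be sketched in a few lines; all LSP names are hypothetical:

```python
# Sketch of the two FRR modes: Detour builds one protection LSP per
# protected LSP, while Bypass shares one protection LSP among many,
# which is why Bypass scales better.

protected_lsps = ["lsp1", "lsp2", "lsp3"]

# Detour (one-to-one backup): one Detour LSP per protected LSP.
detour = {lsp: f"detour-{lsp}" for lsp in protected_lsps}
print(len(set(detour.values())))          # 3 protection LSPs

# Bypass (facility backup): one Bypass LSP protects them all.
bypass = {lsp: "bypass-linkBD" for lsp in protected_lsps}
print(len(set(bypass.values())))          # 1 protection LSP
```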


    Figure 16 MPLS TE FRR

The S12500 series switches support MPLS TE FRR and ensure 50 ms carrier-class switchover performance through the BFD for RSVP fast detection mechanism.

HA of Software Maintenance (Software Hot Patch on Line)

Figure 17 Online patch

The S12500 series switches can fix software bugs or add small-scale new features online without restarting the device. They provide commands for controlling the state transitions of patch units, so users can easily load, activate, deactivate, run, and delete a patch unit.
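The patch-unit lifecycle implied by these commands can be sketched as a small state machine. The state names mirror the command names in the text; the transition model itself is an assumption:

```python
# Sketch of a hot-patch unit lifecycle: load, activate (trial),
# run (confirm permanent), deactivate (back out), delete.

TRANSITIONS = {
    ("idle", "load"): "deactivated",
    ("deactivated", "activate"): "active",     # trial state
    ("active", "run"): "running",              # confirmed permanent
    ("active", "deactivate"): "deactivated",   # back out the trial
    ("deactivated", "delete"): "idle",
}

def step(state, command):
    # Invalid commands leave the patch unit in its current state.
    return TRANSITIONS.get((state, command), state)

state = "idle"
for cmd in ["load", "activate", "run"]:
    state = step(state, cmd)
    print(cmd, "->", state)
```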

Copyright © 2009 Hangzhou H3C Technologies Co., Ltd. All Rights Reserved.

No company or individual may excerpt or duplicate the content of this document, in part or in full, or propagate it in any form without the written consent of H3C.

This document is subject to change without prior notice.
