Optimally allocating redundant routers is a crucial task in high-availability (HA) networks. In this paper, a 5-tuple availability function A(N, M, λ, μ, δ) is proposed to determine the minimum number of standby routers required to meet the desired availability (ρ) of an HA router, where N and M are the numbers of active and standby routers, respectively, and λ, μ, and δ are a single router's failure rate, repair rate, and failure detection and recovery rate, respectively. We derive the availability function, and analytical results show that the failure detection and recovery rate (δ) is a key parameter for reducing the minimum number of standby routers that an HA router requires. We therefore also propose a High Availability Management (HAM) middleware, designed on the basis of an open architecture specification called OpenAIS, which reduces the takeover delay (1/δ) by means of stateful backup. To illustrate the proposed HA router, we have implemented an HA Open Shortest Path First (HA-OSPF) router consisting of two active routers and one standby router. Experimental results show that the takeover delay of the proposed HA-OSPF router is 6%, 37.3%, and 98.6% lower than those of three industry-standard approaches: the Cisco-ASR 1000 series router, the Juniper MX series router, and the Virtual Router Redundancy Protocol (VRRP) router, respectively. In addition, in contrast to the industry routers, the proposed HA router, being based on an open architecture specification, is more cost-effective, and its redundancy model can be adjusted more flexibly.
- continuous-time Markov chain
- failure detection and recovery rate
- high availability
- redundancy model