FHRP - who is your daddy and what does he do?
Sun, Jun 5 2011 03:31
| cisco, fhrp, juniper
Before we go into First Hop Redundancy Protocols (FHRPs), let's just for a moment consider the challenge. In any internetwork you’ll have domains (AS) and segments (IP networks). Each domain and segment will be connected to the others, ‘hopefully’ with multiple paths of such extreme cunning and wit that it takes the loss of one, two or even three paths to bring your internetwork topology to its knees. Internetworking is a good thing and we’ve got loads and loads of protocols to help us get it right in terms of redundancy. IGPs like RIP, EIGRP, OSPF and IS-IS all do a great job of providing our routers with preferential paths, load balancing and so on.
For the local segment, however, your hosts (PCs etc.) will usually only have one predefined (static) way in and out, most often just the default gateway. So what happens when your LAN gateway goes down? Well, frankly, you and all the hosts in your network are trapped. Unlike your routed network (running with multiple point-to-point links), your LAN segment will be supporting hundreds if not thousands of devices. Losing the gateway on a LAN segment is bad news if you are a user, but for the administrator the day is apocalyptic.
So what can we do about providing some resilience to our local LAN users?
One of the more forgotten FHRP options is IRDP, the ICMP Router Discovery Protocol. RFC 1256 describes how routers send ICMP Router Advertisement messages (think of them as 'Hello' packets) into their local networks. End hosts 'listen' for these advertisements to dynamically learn the gateway address and install it into their routing tables. Hosts can also send out a multicast Router Solicitation to find a new router should their ICMP-acquired gateway go away. It’s pretty neat to be honest, but it is very rare to see in the wild these days. That's not to say you should forget it, however: it’s on the Cisco CCIE blueprint for sure.
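To make the host side of this concrete, here is a minimal Python sketch of how an end host might track advertised routers. It is a simulation only, not a real IRDP implementation: the `RouterList` class, the preference values and the 1800-second lifetime are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Advert:
    router: str
    preference: int    # higher is preferred
    expires_at: float  # time the advert was heard + its lifetime


class RouterList:
    """Hypothetical host-side table of routers learned from advertisements."""

    def __init__(self):
        self._adverts = {}

    def learn(self, router, preference, lifetime, now):
        # A fresh advertisement replaces any older one from the same router.
        self._adverts[router] = Advert(router, preference, now + lifetime)

    def default_gateway(self, now):
        # Only routers whose lifetime has not run out are usable.
        live = [a for a in self._adverts.values() if a.expires_at > now]
        if not live:
            return None
        return max(live, key=lambda a: a.preference).router


table = RouterList()
table.learn("10.1.100.1", preference=10, lifetime=1800, now=0)
table.learn("10.1.100.2", preference=5, lifetime=1800, now=0)
first = table.default_gateway(now=100)

# 10.1.100.1 dies and stops advertising; 10.1.100.2 keeps refreshing.
table.learn("10.1.100.2", preference=5, lifetime=1800, now=1500)
second = table.default_gateway(now=2000)

print(first, second)  # 10.1.100.1 10.1.100.2
```

Notice that in this sketch the host only abandons the dead router once its advertised lifetime has run out, so failover is only as fast as that timer (or a fresh solicitation) allows.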
So, more realistic options? Well, what about loading up a dynamic routing protocol on the workstation itself? Common open protocols such as RIP and OSPF are widely supported on PC operating systems such as Windows and Linux. If we fire up RIP on a PC and get it to talk to the routers in our network then, if one of those routers goes away, the PC should be able to learn a new route. This feels pretty good, right? Well maybe, but it's going to be tough setting this up on a large scale. Also, with my security hat on, I can't say I'm over the moon about allowing my hosts to affect my routing tables...what if someone redistributed one of my core networks so it looked like we should send traffic to their workstation instead of the finance database network...bad! We’ve got options around filters and authentication to add some security but again, complexity...feels OK but maybe not...let's move on.
Let's get dirty again now (or desperate) and bring in dynamic host configuration protocols like BOOTP/DHCP. If we were smart, maybe we’d bring down the lease period for this sort of thing. If we get a gateway failure we can reconfigure the scope that hands out the IP address, netmask and local gateway: change the gateway IP address to another unit capable of taking the traffic which was going to the now broken gateway and we’re golden. If waiting for a lease to expire is too long (defaults vary, but are often a day or more) then some members of the team could always ask all staff to reboot their machines or do a quick DHCP refresh. If this was an organisation with 1000s of customers off the air I'm pretty sure this would be your last choice...or last task in employment. Either way I hope you’re getting the point here. This sort of ‘fix’ is unworkable.
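To put rough numbers on why this is unworkable, here is a back-of-the-envelope Python sketch. It assumes a client only picks up the new gateway when it next renews its lease, and that it renews halfway through the lease (the default renewal timer, T1, in RFC 2131). The function name and the lease lengths are made up for illustration.

```python
def worst_case_stale_hours(lease_hours, renew_fraction=0.5):
    """Longest a client keeps the old gateway before its next renewal.

    Assumes the client learns the new gateway when it renews, and that
    it renews at renew_fraction of the lease time (T1 in DHCP terms).
    """
    return lease_hours * renew_fraction


for lease in (72, 8, 1):
    stale = worst_case_stale_hours(lease)
    print(f"{lease}h lease: up to {stale:g}h on the dead gateway")
```

Under these assumptions even a one-hour lease can leave a client pointing at the dead gateway for up to half an hour, and short leases mean far more DHCP traffic all day, every day.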
One of the better redundancy choices for the local LAN is actually one of the simplest and oldest. Let's digress a little here. Remember ARP? We use the Address Resolution Protocol in our local LAN to get the remote host's MAC address when we only know the IP address. Now enter Proxy ARP. Proxy ARP is enabled by default on Cisco routers and, as the name implies, proxies ARP requests made by a host for hosts beyond the physical local network. Imagine a network 10.0.0.0/8 where the devices on our segment use the addresses 10.1.100.1 to 10.1.100.254, and our local segment router is 10.1.100.1. When the host 10.1.100.10 wishes to talk to host 10.1.1.10 it simply ARPs out for it (as far as it is concerned, it is on the same network after all). So the router 10.1.100.1 takes the ARP request and 'pretends' to be the remote host by sending the host 10.1.100.10 its own MAC address as the destination MAC for 10.1.1.10. Pretty neat.
The issue with Proxy ARP, though, is that if router 10.1.100.1 goes down and we have another router (in the same broadcast domain) then we have to wait for the ARP table of the host to go stale before it will ARP out again for that remote host. The entry will typically age out within a few minutes, after which the host ARPs again and the surviving router can answer. That time period, again, is probably too long to be considered a solution, even though it will eventually work (and it’s way faster than DHCP). To speed it up we can of course manually clear the ARP table on the host (again we’re using manpower here, which isn’t exactly reliable or efficient) or send a gratuitous ARP out on the local LAN for those remote addresses (unworkable really).
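The two decisions involved in Proxy ARP can be sketched in a few lines of Python using the standard `ipaddress` module. This is a simplified model, not how any router actually implements it: the /24 mask on the router interface and the 10.1.1.0/24 route are assumptions added for the example, and the function names are hypothetical.

```python
import ipaddress


def host_arps_directly(host_interface, destination):
    """True if the host thinks the destination is on its own network."""
    network = ipaddress.ip_interface(host_interface).network
    return ipaddress.ip_address(destination) in network


def router_sends_proxy_reply(router_interface, destination, known_routes):
    """True if the router should answer the ARP request on the target's behalf."""
    target = ipaddress.ip_address(destination)
    if target in ipaddress.ip_interface(router_interface).network:
        return False  # target really is on this segment and answers for itself
    return any(target in ipaddress.ip_network(route) for route in known_routes)


host = "10.1.100.10/8"    # the host believes all of 10.0.0.0/8 is local
router = "10.1.100.1/24"  # assumed mask on the router's LAN interface
routes = ["10.1.1.0/24"]  # assumed route the router holds to the remote segment

print(host_arps_directly(host, "10.1.1.10"))                # True
print(router_sends_proxy_reply(router, "10.1.1.10", routes))  # True
```

The host never knows a router is involved at all, which is exactly why it keeps using the dead router's MAC address until its ARP entry ages out.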
Right, we’ve been through the pain, here is the treatment.
Enter Cisco with HSRP. Defined in RFC 2281, HSRP allows two or more devices in the same network segment to share an IP address and MAC address. One member of the 'HSRP group' will be the active router, advertising ownership of the virtual addresses and receiving traffic, while another waits as the standby. When the active router fails, another router in the group with connectivity to the same destinations takes over the traffic. HSRP is automatic, it’s pain-free, it’s wonderful. Cisco have gone on to develop HSRP further to watch not only the availability of local interfaces but also remote connectivity, such as lost routes in the table, high packet loss and much more. By making sure that the active gateway device has genuine, reliable connectivity to its upstream we’re making intelligent decisions right at the source.
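The failover and tracking behaviour can be modelled roughly as below. This is a toy Python sketch rather than the real HSRP state machine: it assumes preemption is enabled (so the best available router always ends up active), breaks priority ties on the highest IP address, and uses made-up router names and addresses. The defaults of 100 for priority and 10 for the tracking decrement mirror the usual Cisco defaults.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Router:
    name: str
    ip: str
    priority: int = 100     # HSRP default priority
    alive: bool = True
    uplink_up: bool = True  # state of a tracked upstream interface
    decrement: int = 10     # priority lost while the tracked uplink is down

    def effective_priority(self):
        if self.uplink_up:
            return self.priority
        return self.priority - self.decrement


def active_router(group):
    """Pick the active router, assuming preemption is enabled."""
    candidates = [r for r in group if r.alive]
    if not candidates:
        return None
    # Highest priority wins; the highest IP address breaks a tie.
    return max(
        candidates,
        key=lambda r: (r.effective_priority(), ipaddress.ip_address(r.ip)),
    )


r1 = Router("R1", "10.1.100.2", priority=110, decrement=20)
r2 = Router("R2", "10.1.100.3")
group = [r1, r2]

print(active_router(group).name)  # R1 (110 beats 100)

r1.uplink_up = False              # R1 is alive but has lost its upstream
print(active_router(group).name)  # R2 (R1 drops to 90)

r1.uplink_up, r1.alive = True, False  # R1 fails outright
print(active_router(group).name)  # R2
```

The middle case is the interesting one: R1 is still up and answering on the LAN, but because it has lost its upstream it hands the virtual addresses over to R2.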
Now, as we’ve seen before, a proprietary protocol developed and engineered by Cisco is often picked up by the industry, reworked and brought out as a standard. In this case HSRP was reworked into VRRP. Network devices from vendors other than Cisco run VRRP to provide this same service. The two protocols are almost identical, and indeed on a Cisco device which supports both HSRP and VRRP the two are remarkably hard to tell apart. Configuration of VRRP uses the ‘vrrp’ keyword whereas HSRP uses the ‘standby’ keyword. Apart from a few other tweaks there is little to differentiate the two. VRRP is HSRP for the masses. It works.
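One of those small tweaks is the shared MAC address itself. Both protocols derive it from the group number, just with a different well-known prefix. The Python sketch below covers HSRP version 1 and VRRP for IPv4 only; the function names are hypothetical.

```python
def hsrp_v1_virtual_mac(group):
    """Virtual MAC for an HSRP version 1 group (0000.0c07.acXX)."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP v1 group must be 0-255")
    return f"0000.0c07.ac{group:02x}"


def vrrp_virtual_mac(vrid):
    """Virtual MAC for an IPv4 VRRP virtual router (00-00-5E-00-01-XX)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRRP VRID must be 1-255")
    return f"00-00-5e-00-01-{vrid:02x}"


print(hsrp_v1_virtual_mac(10))  # 0000.0c07.ac0a
print(vrrp_virtual_mac(10))     # 00-00-5e-00-01-0a
```

Because the MAC address depends only on the group number and not on which router is currently active, hosts keep the same ARP entry across a failover. That is the trick that Proxy ARP was missing.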
The final FHRP on our agenda today is GLBP. This protocol provides a virtual IP address in the same way as HSRP and VRRP; where GLBP differs is in the way it works the plan. GLBP is, like HSRP, a collection of two or more devices which offer redundancy against the failure of each other. Unlike HSRP and VRRP, however, GLBP also offers load balancing between these members. HSRP and VRRP each work on the master/slave or active/backup approach, where only one member of the group can be actively passing traffic at any one time. GLBP works differently by electing an AVG (Active Virtual Gateway) member, which in turn decides how to distribute the load across its pool members. The members in the GLBP ‘cluster’ that forward the traffic are called AVFs or Active Virtual Forwarders. In exactly the same way as HSRP and VRRP elect the ‘master’ of their respective membership, the highest priority GLBP member becomes the AVG. If the AVG fails then that role is passed to the member with the next highest priority and so on (if all members have the same priority then the one with the highest IP address is elected).
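Here is a rough Python sketch of both halves of that: electing the AVG, and the AVG sharing out the load. It assumes a simple round-robin policy in which the AVG answers each ARP request for the virtual IP with the virtual MAC of the next forwarder in turn. The member addresses, the MAC addresses and the class and function names are all illustrative.

```python
import ipaddress
import itertools


def elect_avg(members):
    """members is a list of (ip, priority) tuples; returns the winner's ip."""
    # Highest priority wins; the highest IP address breaks a tie.
    return max(members, key=lambda m: (m[1], ipaddress.ip_address(m[0])))[0]


class ActiveVirtualGateway:
    """Hands out forwarder MAC addresses in round-robin order."""

    def __init__(self, forwarder_macs):
        self._next_mac = itertools.cycle(forwarder_macs)

    def arp_reply(self):
        # Each ARP request for the virtual IP gets the next forwarder's MAC.
        return next(self._next_mac)


members = [("10.1.100.2", 100), ("10.1.100.3", 100), ("10.1.100.4", 100)]
print(elect_avg(members))  # 10.1.100.4 (equal priority, highest IP)

avg = ActiveVirtualGateway(["0007.b400.0101", "0007.b400.0102", "0007.b400.0103"])
replies = [avg.arp_reply() for _ in range(4)]
print(replies)
# ['0007.b400.0101', '0007.b400.0102', '0007.b400.0103', '0007.b400.0101']
```

Every host still configures the one virtual IP address as its default gateway, but different hosts end up sending their frames to different routers.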
This ends our brief look into the dark but essential world of the FHRP.
Thank you for reading.