Most importantly, you need a solid understanding of your actual redundancy requirement, so that your organization's expectation of network availability is set appropriately. I recommend thinking of it as a target along a continuum, expressed as something like a percentage of uptime (e.g., "We have four nines or five nines of availability"), rather than as a binary yes/no statement attached to some design point (e.g., "X network component is or is not redundant"). A percentage is something the rest of the organization can interpret, whereas you're probably the only person there who knows the implications of your design.
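To make the "nines" concrete, here is a minimal sketch in Python that converts an availability percentage into the downtime it allows per year. The function name and the 365-day year are my own choices for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def annual_downtime_minutes(availability_percent: float) -> float:
    """Minutes of outage per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    minutes = annual_downtime_minutes(pct)
    print(f"{label} ({pct}%): about {minutes:.1f} minutes of downtime per year")
```

Four nines works out to roughly 53 minutes of downtime a year and five nines to roughly 5 minutes, which is a far more useful thing to negotiate over than whether a given box is "redundant."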
You could design a very simple, non-redundant network that might have a failure every once in a while but would be quick and easy to troubleshoot and return to service. Or you could design a very complex redundancy scheme that would hardly ever experience an outage, even when a couple of devices failed. But if it did go down, it might take a lot of time to figure out the problem, resulting in a longer outage.
So again, the question is, would your organization rather tolerate a few brief outages per year, or one big, ugly one?
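The trade-off can be put in rough numbers. The sketch below, in Python, compares two hypothetical designs by multiplying how often each fails by how long it takes to restore. The failure rates and repair times are invented purely for illustration and are not measurements of any real network.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


@dataclass
class Design:
    name: str
    failures_per_year: float  # hypothetical failure rate
    hours_to_restore: float   # hypothetical time to troubleshoot and repair

    def downtime_hours_per_year(self) -> float:
        # Expected downtime = how often it breaks * how long each fix takes
        return self.failures_per_year * self.hours_to_restore

    def availability_percent(self) -> float:
        return 100 * (1 - self.downtime_hours_per_year() / HOURS_PER_YEAR)


# Assumed numbers: four half-hour outages a year versus
# one 12-hour outage every four years.
simple = Design("simple, non-redundant", failures_per_year=4, hours_to_restore=0.5)
complex_ = Design("complex, redundant", failures_per_year=0.25, hours_to_restore=12)

for design in (simple, complex_):
    print(f"{design.name}: {design.downtime_hours_per_year():.1f} h/year down, "
          f"{design.availability_percent():.3f}% available")
```

With these assumed figures the simple design actually loses fewer hours per year (2 versus 3), even though it fails sixteen times as often. Different inputs will tip the balance the other way, which is exactly why the requirement has to come first.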
To get a little more insight into redundancy, realize that you don't necessarily have to design all network components as "redundant" or "not redundant." A mix is usually best, though not a random mix, of course.
Start by asking the vendors what the mean time between failures (MTBF) is for each component. You will most likely find that redundant power supplies and hard drives are popular in servers because those are the server components that fail most often. With routers and switches, power supplies are also relatively unreliable, but supervisor modules almost never fail. Line cards are somewhere in between. Thus, a single router with redundant power supplies may be a lot more bang for the buck in terms of redundancy than two whole routers. Still, some organizations may not want to risk even the extremely unlikely component failures.
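A quick way to see why is to combine component availabilities: components that must all work multiply together, while redundant components fail only if every copy fails. The Python sketch below does this with made-up availability figures. Real values would come from your vendors' MTBF data, and the model assumes failures are independent and failover is instantaneous, which is optimistic.

```python
from math import prod


def series(*availabilities: float) -> float:
    """All components must work: availabilities multiply."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Redundant components: down only if every one of them is down."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical per-component availabilities, for illustration only.
PSU = 0.999          # power supply: least reliable
LINE_CARD = 0.9999   # somewhere in between
SUPERVISOR = 0.99999 # almost never fails

single_router = series(PSU, SUPERVISOR, LINE_CARD)
dual_psu_router = series(parallel(PSU, PSU), SUPERVISOR, LINE_CARD)
two_routers = parallel(single_router, single_router)

for name, a in [("one router, one PSU", single_router),
                ("one router, two PSUs", dual_psu_router),
                ("two complete routers", two_routers)]:
    print(f"{name}: {a * 100:.5f}% available")
```

Under these assumed numbers, adding a second power supply removes roughly 90 percent of the single router's expected unavailability, for a small fraction of the cost of a second chassis.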
Back to redundant circuits and routers. First, if your redundancy requirement includes diversity, you must have redundant routers, because your circuits will have to terminate in two physical locations and one router cannot be in both locations. If diversity isn't a requirement, then the next question to answer is whether you expect to use the second circuit in an active/active or active/passive mode. If you get a discount from the WAN provider for backup bandwidth (discounts are common on multi-access networks such as MPLS, Frame Relay and ATM, but not if you're buying leased lines), you will have to go active/passive, so the question answers itself. But if you don't get a discount, you probably want to use the bandwidth you're paying for.
If, for instance, you have two T1s, you could have twice the bandwidth by using them at the same time, although it's a little more complex. If the circuits terminate in the same router, they can be bonded (with IMA or Multilink PPP) so that a pair of 1.544 Mb/s circuits appears as a single circuit of roughly 3 Mb/s. If the circuits are in separate routers, you can still send traffic across each circuit, but because traffic is typically balanced per flow, any one TCP flow will be unable to exceed 1.544 Mb/s. Thus, with either one or two routers, you can use all 3 Mb/s of bandwidth you've purchased, but performance is a little better for individual users when the circuits terminate on one router.
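The difference shows up only when one flow wants more than a single circuit can carry. Here is a simplified Python sketch of the two cases. It ignores protocol overhead, and it places each flow on the least-loaded circuit, which is an assumption made for clarity; real routers usually pick a path by hashing the flow.

```python
T1_MBPS = 1.544


def bonded_throughput(flow_demands, circuits=2, capacity=T1_MBPS):
    """Bonded circuits act as one pipe: only the total is capped."""
    return min(sum(flow_demands), circuits * capacity)


def per_flow_throughput(flow_demands, circuits=2, capacity=T1_MBPS):
    """Separate circuits: each flow rides exactly one circuit."""
    loads = [0.0] * circuits
    # Place the largest flows first, each on the currently least-loaded circuit.
    for demand in sorted(flow_demands, reverse=True):
        loads[loads.index(min(loads))] += demand
    # Each circuit delivers at most its own capacity.
    return sum(min(load, capacity) for load in loads)


one_big_transfer = [2.0, 0.5]   # Mb/s wanted by each flow (hypothetical)
many_small_flows = [0.3] * 10

for flows in (one_big_transfer, many_small_flows):
    print(f"demand {sum(flows):.2f} Mb/s -> bonded {bonded_throughput(flows):.3f}, "
          f"separate {per_flow_throughput(flows):.3f}")
```

With many small flows the two designs deliver the same total. With one large transfer, the separate-router design caps that flow at a single T1 while the other circuit sits partly idle.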
On the other hand, a failure in this scenario would cause bandwidth to drop suddenly from 3 Mb/s to 1.5 Mb/s. Would your organization consider that an outage? If 1.5 Mb/s isn't acceptable performance, then even though you have two circuits, you're not really redundant anymore. Be wary of this: you may need only 1.5 Mb/s today, but a year from now, as traffic grows, this could turn into an exposure without your ever having made a change in the network.
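You can estimate when that exposure will arrive. The Python sketch below assumes traffic grows at a steady compound rate, which is a simplification, and the demand and growth figures in the example are hypothetical.

```python
import math


def years_until_exposed(peak_demand_mbps: float,
                        annual_growth_rate: float,
                        surviving_capacity_mbps: float) -> float:
    """Years until peak demand outgrows what is left after one circuit fails."""
    if peak_demand_mbps >= surviving_capacity_mbps:
        return 0.0          # already exposed today
    if annual_growth_rate <= 0:
        return math.inf     # demand never catches up
    # Solve demand * (1 + growth) ** years = capacity for years.
    return (math.log(surviving_capacity_mbps / peak_demand_mbps)
            / math.log(1 + annual_growth_rate))


# Hypothetical site: 1.0 Mb/s peak today, growing 30% a year,
# with one 1.544 Mb/s T1 left after a failure.
years = years_until_exposed(1.0, 0.30, 1.544)
print(f"Single-circuit capacity is outgrown in about {years:.1f} years")
```

Rerunning a check like this whenever you review utilization turns a silent loss of redundancy into something you can see coming.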
Next, to the earlier point, you can still purchase a second router and configure it as a cold spare, ready to be placed into service at a moment's notice. This may not be as sophisticated, but it would more than likely meet a well-crafted service level agreement (SLA) for availability.
So, you have many options, but let your requirements guide your solution.
About the author: Tom Lancaster, CCIE# 8829 CNX# 1105, is a consultant with 15 years of experience in the networking industry. He is co-author of several books on networking, most recently CCSP: Secure PIX and Secure VPN Study Guide, published by Sybex.
This was first published in February 2009.