Book Chapter

WAN optimization techniques

Throttling bandwidth helps manage link utilization

Treating symptoms rather than causes is a common but all too often unsuccessful solution for many problems, including bandwidth consumption. Results may appear positive at first, but these effects are generally short-lived, unreliable or unpredictable, and do not scale well. Bandwidth throttling reflects this approach by seeking to treat the symptoms (slow transfers or excessive response times) rather than the causes (low bandwidth availability, cyclical processing peaks, unauthorized applications, inappropriate resource links, and so forth). Nevertheless, this technique has yet to fall into disuse or total disrepute.

Bandwidth throttling ensures that bandwidth-intensive devices such as routers or gateways limit the quantities of data transmitted and received over some specific period of time. Bandwidth throttling limits network congestion and reduces server instability resulting from network saturation. For ISPs, bandwidth throttling restricts user speeds across certain applications or during peak usage periods. But without understanding the causes of traffic spikes that require links to be throttled, it's always just a matter of time before the load comes back again, ready to be throttled again.

Data-bearing servers operate on a simple principle of supply and demand: clients make requests, and servers respond to them. But Internet-facing servers that service client requests are especially prone to overload during peak operating hours and under heavy, intense network loads. Such peak load periods create data congestion or bottlenecking across the connection that can cause server instability and eventual system failure, resulting in downtime. Bandwidth throttling is used as a preventive method to control the server's response level to any surges in client requests throughout peak hours of the day.

In February of 2008, members of the Federal Communications Commission announced they might consider establishing regulations to discourage Internet providers from selectively throttling bandwidth from sites and services that would otherwise consume large amounts. In late 2007, Comcast actively interfered with some of its high-speed Internet subscribers using file-sharing clients and protocols by throttling such connections during peak hours (and only for uploads). This sparked a controversy that continues to this day.

Organizations can (and should) use bandwidth throttling or firewall filters to limit or block traffic that explicitly violates Acceptable Use Policy. But otherwise, bandwidth-throttling is best applied in the form of Class of Service or Quality of Service (CoS/QoS) markers applied to various types or specific instances of network traffic. CoS and QoS represent classification schemes for network traffic that give priority to time-sensitive and mission-critical traffic rather than by limiting a specific type of traffic explicitly. Many experts recommend that unauthorized or unwanted protocols be throttled to extremely low levels of bandwidth (under 10 Kbps) rather than blocked completely, so as to give network administrators an opportunity to ferret out and deal with users or programs involved. Thus, for example, by limiting bandwidth available to peer-to-peer protocols such as BitTorrent (used for video and other personal media downloads) or FastTrack (the Kazaa protocol) to only 5K or 10K bits per second, administrators may have time to identify the workstations or servers acting as endpoints for related peer-to-peer activities, and identify the individuals involved in their use. They can then counsel or discipline users as per prevailing acceptable use policies (AUP).

Fixing WAN-unfriendly LAN protocols

As we discussed initially in Chapter 1, many existing protocols that support vital internal and external business functions operate with limited scope and scalability. These are almost always LAN-based legacy protocols that can impose a serious drag upon wide-scale WAN. For example, the Common Internet Files System, aka CIFS, grows exponentially unwieldy in transit across WAN linkages. Likewise, real-time streaming and voice protocols often introduce needs for traffic shaping. In this practice, controls are applied to network traffic so as to optimize performance, meet performance guarantees, or increase usable bandwidth, usually by delaying packets that may be categorized as relatively delay insensitive or that meet certain classification criteria that mark such traffic as low priority. Traffic shaping is often employed to provide users with satisfactory voice and video services, while still enabling "networking as usual" to proceed for other protocols.

In the same vein, encryption and security protocol acceleration tools become resource-intensive but utterly necessary burdens, especially when sensitive traffic must traverse Internet links. Even the most widely used protocol on the Internet -- namely, HTTP -- may be described as both chatty (involving frequent communications) and bursty (involving numerous periods during which tens to hundreds of resource requests may be in flight on the network at any given moment). The protocol trace shown in Figure 3.3 indicates that the display of a single Web page, involves back-and-forth exchange of information about a great many elements over a short period of time (12 showing on the sample trace, with more out of sight below).

Figure 3.3: A single Web page fetch can spawn tens to hundreds of HTTP "Get" requests and associated data-bearing replies. (Click image to view larger.)

Fixing these "broken" aspects of the network environment becomes a traffic engineering proposition that takes into account not just the applications themselves but application programming in general. Knowing how an application operates, its protocol formats and parameters, and observable run-time behaviors is crucial to understanding how it fits with other applications, services, and protocols on the network. It's not just a patchwork proposition that involves mending individual parts, but instead requires accommodating best practices for efficient WAN communications: send and receive infrequently, in bulk, and in the form of complete transactions whenever possible.

The TCP format was originally designed and engineered to operate reliably over unreliable transmission media irrespective of transmission rates, inherent delays, protocol corruption, data duplication, and segment reordering. Because of this, TCP is indeed a robust and reliable mechanism for delivery of applications, services, and protocols. But this design strength also exposes inherent weakness in TCP delivery when deployed across modern, higher-speed media that completely exceed the conditions under which TCP was originally intended to be used.

Re-engineering these and other "broken" network protocols occurs in WAN optimization solutions, usually through some form of proxy. Such a proxy permits unfettered protocol behavior so that protocols can behave normally and unhindered on the LAN. But the same proxy also translates and typically repackages LAN-oriented transmissions to reduce or eliminate "chattiness" across WAN links, while also batching up individual transmissions to limit the number and maximize the payloads for such WAN transmissions as do occur. This approach maximizes use of available bandwidth when transferring request and reply traffic across WAN links.

IP blindly sends packets without checking on their arrival; TCP maintains ongoing end-to-end connections throughout setup and tear-down phases, and even requires periodic acknowledgements for receipt of data. Unacknowledged data triggers an exponential back-off algorithm that times out and retries transmissions until they're received and acknowledged, or times out to signal connection failure. Sliding TCP window sizes -- these denote the number of packets that can be sent before receipt of an acknowledgement is required -- directly influences performance where larger values equal greater throughput (but also, much longer potential delays). TCP employs a well-defined "slow start" algorithm that initiates communications with a small window size, then scales TCP window sizes to optimal proportions as connections are established and maintained while they remain active. Each of these and other such procedures of the TCP/IP stack introduce network delay addressed in WAN optimization solutions through connection optimization techniques and aggressive windowing methods.


For an outstanding discussion on TCP window size, the slow start algorithm, and other TCP congestion management techniques, please consult Charles Kozierok's excellent book The TCP/IP Guide. This book is available in its entirely online; the section on TCP Reliability and Flow Control Features and Protocol Modifications includes detailed discussion of TCP window management, window size adjustment, congestion handling, and congestion avoidance mechanisms.

Use of compression to reduce bandwidth

As a result of their ability to reduce traffic volume, various performance-improving compression schemes have been an integral part of Internetwork communications since way back when X.25 and Bulletin Board Systems (BBSs) used the ZMODEM file transfer protocol introduced in 1986. Over twenty years later, compression regularly appears in more modern network-oriented protocols like Secure Shell (SSH) and byte-compressing data stream caching in contemporary WAN optimization products.

In telecommunication terms, bandwidth compression means: a reduction of the bandwidth needed to transmit a given amount of data in a given time; or a reduction in the time needed to transmit a given amount of data within a given amount of available bandwidth. This implies a reduction in normal bandwidth (or time) for information-bearing signals without reducing the information content, thanks to proper use of data compression techniques. These are well and good in the WAN environment, but can be ineffective without access to accelerated compression hardware to achieve maximum compression and decompression with minimum time delays (though software is fast nowadays, hardware is usually several orders of magnitude faster -- an essential characteristic, given modern WAN link speeds). WAN optimization devices generally include symbol and object dictionaries to reduce data volumes even more than compression can provide alone, and are discussed later in this chapter.

Recurring redundancy, which describes the ways in which patterns and elements tend to repeat in regular traffic streams between pairs of senders and receivers, remains the crucial reason why compression schemes still thrive. From the right analytical perspective, much of the data transiting networks includes unnecessary repetition, which wastes bits and the time and bandwidth necessary for their conveyance. Compressing data by all possible means helps restore balance to the networking order and is just one of several counters against unwanted network delay.

Various types of compression techniques are eminently suitable for network media, but all of them strive to reduce bandwidth consumed during WAN traversal. Header and payload compression techniques utilize pattern-matching algorithms to identify short, frequently recurring byte patterns on the network that are replaced by shorter segments of code to reduce final transmitted sizes. Simplified algorithms identify repeat byte patterns within individual packets where sophisticated forms of compression may analyze patterns across multiple packets and traffic flows.

Any gains that compression strategies provide must vary according to the mix and makeup in WAN traffic. Compressed archives of data (such as ZIP or tar files) cannot be reduced much further using network compression schemes, but applying compression across various flows of traffic can still enhance effective WAN bandwidth. Voice protocols significantly benefit from UDP header compression in conjunction with other techniques such as packet coalescing, described in the following section.

Redundant overlays bring files closer to users

Redundancy isn't always bad for network performance or throughput. There are many applications and instances where multiplicity can enhance performance. In fact, WAN environments benefit greatly from using redundant overlays of file services that bring files and data closer to their end-users.

As the DNS system has shown us, a network can operate effectively in consolidated and decentralized ways at the same time. A single, uniform body of information can be consolidated from many sources and distributed throughout a completely decentralized system for delivery anywhere in the networked world. Consolidating aggregate data in a single location only creates benefits for those users likely to be in close network proximity to the data center, but can pose accessibility issues for other users more than one network hop away, especially those who must use narrow or expensive WAN links to connect to that data.

Resorting to a central authority or source of information for a globally-dispersed company can have predictable negative issues for remote and roaming users, so it's better to replicate information in places where its users can access it quickly, no matter where they might be. DNS databases, with their masters and slaves, and authoritative and non-authoritative versions, along with careful use of caching of recent activity, create a model for widespread, dispersed use of distributed information that works well enough to keep the global Internet humming along. In similar fashion, redundant overlays seek to keep the files that users are most likely to access no more than one or two WAN hops away from their machines, no matter where they might be at any given moment in time.

Today, many companies toil with the challenge of server consolidation and proliferation. Numerous in-house servers service a variety of application services and network protocols, and despite their overwhelming ubiquity they aren't always in the right place to serve at the right time. Many companies opt instead to roll out several key servers in strategically-located installations throughout their geographical and territorial boundaries to better accommodate "away" teams and users. This approach permits a group of carefully synchronized servers that handle multiple basic business needs to deliver comparable accessibility, bandwidth, and security to anyone anywhere. Replication across multiple physical servers makes this approach possible, while virtualization so that individual services run in separate virtual code spaces makes the process more practical and maintenance and monitoring more workable.

Network assets and end-users are often dispersed across branch offices, customer sites, and ISPs that span multiple regions. A well-designed server consolidation strategy necessarily centralizes the infrastructure and reduces server count to save on costs and improve management. Unfortunately, this also effectively places the burden of ensuring seamless connectivity between remote locations directly onto WAN links, and fails to deliver the goods whenever such links fail. This means there must be some mechanism or mechanisms in place to pick up traffic that occurs as the number of remotely-connected users increases.

During opening hours, many businesses endure a surge of initial network traffic that consists largely of multiple users logging in simultaneously to one or more servers. Authentication and DNS directory services access is commonly problematic during this pre-game warm-up routine where everyone shows up to log-in at the same time, so there needs to be some way to optimize and prioritize traffic so wait times at login prompts are minimal. A consolidated WAN optimization solution helps obtain the best use of network architecture because it confers an ability to make the most of the bandwidth a WAN link can carry, while also optimizing priority traffic so that when congestion occurs important data gets through anyway. Thus, the morning user, coffee in one hand, keyboard in the other, might have to wait a minute (literally) to download e-mail, but he or she will still get a quick response when they hit return after supplying an account name and password.



  WAN optimization basics
  WAN optimization techniques
  State-of-the-art acceleration techniques
  WAN technologies summarized

About the author:
Ed Tittel is a 24-year computing industry veteran who has worked as a software developer, systems engineer, trainer, and manager. He has also contributed to more than 100 computer trade books, including several college textbooks, and writes regularly for TechTarget, Tom's Hardware, and Perhaps best known for creating the Exam Cram series of IT certification prep books in 1997, Ed also received the Best Networking Professional Career Achievement Award from the NPA in 2004, and has been a finalist in the "Favorite Study Guide Author" category in the Annual Reader's Choice Awards every year since those awards were launched in 2002.

This was first published in December 2008

There are Comments. Add yours.

TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: