This chapter explores design issues related to overall network topology. The following sections discuss the traditional issues of bandwidth, delay, and reliability; as well as the often overlooked issues of operational simplicity and scalability, particularly as they pertain to routing. Specifically, the following issues are discussed:
Requirements and constraints of the network: This section examines the requirements of a network and the importance of scalability and extensibility. You will also read about constraints on the design effort, including labor, economic, social, time, and space issues, as well as the need to support legacy technologies.
Tools and techniques: You will explore some of the tools for building large networks. Modularization, layering, multiplexing, and caching are discussed in the context of the overall design. This section briefly examines the use of soft-state mechanisms, hysteresis, and dampening in routing protocols, and finally discusses network failure modes.
Issues of hierarchy: This section demonstrates that hierarchy and redundancy must be carefully balanced to craft a network that can grow to meet future needs without becoming an operational nightmare. Experience gained from the Internet is also discussed. (For additional information on this topic, see Chapter 1, "Evolution of Data Networks.") Finally, this section examines the principles of layering and regionalization of a large network into core, distribution, and access networks.
Backbone network design, distribution/regional network design, and access design: In these three sections, you will examine the details of designing core, distribution, and access networks. The role of each is discussed, and the pros and cons of various approaches are described.
Requirements and Constraints
Before delving into the typical topologies, it is wise to understand the overall network design process. As with any systems design effort, network design is an exercise in meeting new and old requirements while working within certain constraints. These constraints include money, labor, technology, space, and time. In addition, there may be social or political constraints, such as the mandated use of certain standards or vendors.
Economic constraints play a major role in any network design. Unless you are very fortunate, you often must compromise in the capacity of WAN links, the switching capabilities of routers, the type of interfaces used, and the level of redundancy achieved. Achieving the "best possible service at the lowest possible cost" was a design paradigm invented (tongue-in-cheek, to some extent) by one network manager to satisfy both management and network users. This paradigm fails to explain how this task is achieved, other than through a carefully considered compromise, but neither does it say anything that is incorrect.
Labor effort should be of paramount concern in any network design. In this case, the first area of concern is the amount of effort and level of skill necessary to connect a new customer to the network or to expand the capacity of the network infrastructure. As a general rule, the more often a task must be executed, the more the design should focus on making that task simple and efficient; in other words, the goal involves optimizing the common case. In addition to prudent network design, labor costs can also be reduced through investment in network management tools. It is noteworthy that for many networks, the capital cost is dwarfed by the ongoing charges for highly skilled support personnel.
Processor speed doubles every 18 months. Nevertheless, as you have already seen in Chapter 1, Internet traffic levels can increase at a far more rapid rate. Thus, computation is still a constraint of network design, particularly in the case of routers. Typical computational limitations that apply to network design are associated with processing of routing updates, accounting, security filtering and encryption, address translation, and even packet forwarding.
Space issues include the physically obvious, such as the cost of expensive air-conditioned points of presence (POPs) or co-location facilities. Space also includes subtler, but nonetheless important resources, such as the buffer capacity in a router or the bandwidth of a WAN link.
One time constraint that affects the success of a design is the time-to-market. It is useless to design an extremely sophisticated network if the customers have gone elsewhere by the time it is operational. Time constraints also include packet forwarding and propagation delays, which have a fundamental impact on bandwidth (in a TCP/IP environment) and response time.
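As an illustration of how propagation delay constrains throughput in a TCP/IP environment, the classic window/round-trip-time bound on a single TCP flow can be computed directly. The short Python sketch below uses an assumed 64-KB window and a few invented round-trip times; it is a rough illustration, not a measurement of any particular network.

def tcp_throughput_bound(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-flow TCP throughput, in bits per second."""
    return window_bytes * 8 / rtt_seconds

# Assumed values for illustration only: a 64-KB window and sample RTTs.
for rtt_ms in (10, 50, 100, 250):
    bound = tcp_throughput_bound(window_bytes=65535, rtt_seconds=rtt_ms / 1000)
    print(f"RTT {rtt_ms:4d} ms -> at most {bound / 1e6:6.1f} Mbit/s per flow")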
Social constraints include those that may not seem sensible in light of the major requirements of the network. These could include the mandated use of standards that are difficult to obtain, to use, or to understand. Thankfully, this has been less common since the demise of OSI. (At one time in the industry, a play on the OSI reference model included a "political" layer above the application layer: the so-called "eighth layer of networking.") Alternatively, you may be constrained to using a certain vendor's equipment because of a prearranged partnership agreement.
The need to support legacy applications is usually expressed as a requirement, but it generally manifests itself as a serious constraint. Building networks that are backward-compatible with legacy applications, such as those requiring protocols like SNA and DECnet, can be extremely demanding.
Scalability and extensibility are the hallmarks of a good network design. They will haunt or complement you long after the economic pain is forgotten. This is why network routing is so critical to the design process. Switching and physical-layer technologies may come and go, but the network control plane (of which routing is a major part) must survive many generations of underlying technology.
The control plane is much more difficult to upgrade incrementally than the technologies of the underlying layers, so careful planning pays dividends. In the networking world, those dividends can return in months rather than years.
Many writings on network design emphasize the importance of requirement analysis. Indeed, in terms of the initial delivery of a turnkey, or productized, network service, requirement analysis is very important. However, in our experience, nowhere in the industry does the initial-requirements definition document age more quickly than in large-scale networking, particularly where the Internet is involved.
Too many network designs never leave the ground because of an overly zealous requirements-definition phase. This is an unfortunate side effect of vendors providing "shrink-wrapped" networks to customers.
Valuable engineering cycles spent on extremely detailed requirements or network flow analysis would be better spent ensuring an extensible and scalable network infrastructure, and explaining contingency plans to the customer, if the actual requirements exceed those projected. Unfortunately, stringent requirement analysis seems to be a contractual necessity, so this situation is unlikely to change.
Tools and Techniques
Network design is both an art and a science. The science involves exploiting various methodologies to meet all the requirements within the given constraints. Each of these methods trades one constrained resource for another. The art involves choosing the best balance between constrained resources, resulting in a network that is future-proof: one that will grow to meet increased, or even radically new, requirements.
Modularization and Layering
Two of the most common design and implementation methodologies are modularization and layering. Both enable the network problem to be broken down into something more manageable, and both involve the definition of interfaces that enable one module or layer to be modified without affecting others. These benefits usually compensate for the inefficiency caused by information hidden between layers or modules. Nevertheless, when designing the interfaces between modules or layers, it is good practice to optimize the common case. For example, if there is a large flow of traffic between two distribution networks, perhaps this flow should be optimized by introducing a new dedicated link into the core network.
Layering typically implies a hierarchical relationship. This is a fundamental technique in network protocol design, as exemplified by the ubiquitous OSI reference model. Modularization typically implies a peer relationship, although a hierarchy certainly can exist between modules. In an upcoming section, "Hierarchy Issues," as well as in many of the remaining chapters in this book, the text continues to emphasize and develop the practice of hierarchy and modularization in network design.
Layering the network control plane above a redundant physical infrastructure is a vital part of resilient network design. Critical control information, such as network management or routing updates, should be exchanged using IP addresses of a virtual interface on the router rather than one associated with a physical interface. In Cisco routers, this can be achieved using loopback interfaces: virtual interfaces that are always active, independent of the state of any physical interfaces.
Another common approach used when a physical address must be used for routing is to permit two or more routers to own the same IP address, but not concurrently. A control protocol, such as Cisco's Hot Standby Router Protocol (HSRP), arbitrates the use of the IP address for routing purposes.
Network Design Elements
Multiplexing is a fundamental element of network design. Indeed, you could argue that a network is typically one huge multiplexing system. More specifically, however, multiplexing is a tool that provides economies of scale: multiple users share one large resource rather than a number of individual resources.
NOTE
Multiplexing is the aggregation of multiple independent traffic flows into one large traffic flow. A useful analogy is the freeway system, which multiplexes traffic from many smaller roads into one large flow. At any time, traffic on the freeway may exit onto smaller roads (and thus be de-multiplexed) when it approaches its final destination.
As an added benefit, if the multiplexing is statistical in nature, one user may consume the unused resources of someone else. During periods of congestion, however, this statistical sharing of the resource might need to be predictable to ensure that basic requirements are met. In IP networks, bandwidth is the resource, and routers provide the multiplexing.
Traditionally, multiplexing has been a best-effort process. However, increasingly deterministic behavior is required; you can read about such techniques in Chapter 14, "Quality of Service Features." For now, it suffices to say that multiplexing saves money and can provide performance improvements while guaranteeing a minimum level of service.
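The economics of statistical multiplexing can be illustrated with a small simulation. The Python sketch below assumes 50 bursty sources, each with a 1-Mbit/s peak rate and a 20 percent activity factor; all the figures are invented purely to show that the aggregate demand rarely approaches the sum of the peak rates.

import random

# Invented figures: 50 sources, 1 Mbit/s peak each, active 20% of the time.
SOURCES, PEAK_MBPS, ACTIVITY, SAMPLES = 50, 1.0, 0.2, 10_000
random.seed(1)

# For each sample interval, add up the demand of whichever sources are active.
aggregate = sorted(
    sum(PEAK_MBPS for _ in range(SOURCES) if random.random() < ACTIVITY)
    for _ in range(SAMPLES)
)

sum_of_peaks = SOURCES * PEAK_MBPS
p99 = aggregate[int(0.99 * SAMPLES)]
print(f"Sum of peak rates        : {sum_of_peaks:.0f} Mbit/s")
print(f"99th percentile aggregate: {p99:.0f} Mbit/s")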
Randomization is the process of applying random behavior to an otherwise predictable mechanism. This is an important approach to avoid the synchronization of network data or control traffic that can lead to cyclic congestion or instability. Although critical to the design of routing protocols, congestion control, and multiplexing algorithms, randomization is not currently a major factor in network topology design. However, this may change if load sharing of IP traffic through random path selection is ever shown to be a practical routing algorithm.
Soft state is the control of network functions through the use of control messages that are periodically refreshed. If the soft state is not refreshed, it is removed (or timed out). Soft state is also extremely important to routing functions. When routers crash, it becomes difficult to advise other routers that the associated routing information is invalidated. Nearly all routing information is kept as soft state; if it is not refreshed, or at the very least reconfirmed in some way, it is eventually removed.
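A minimal sketch of the soft-state idea follows, written in Python and assuming a simple in-memory table keyed by route prefix. The 30-second hold time is an arbitrary illustrative value, not a parameter of any particular routing protocol.

import time

HOLD_TIME = 30.0  # seconds without a refresh before an entry is purged (assumed value)

class SoftStateTable:
    """Routing entries survive only as long as they keep being refreshed."""

    def __init__(self):
        self._last_refresh: dict[str, float] = {}

    def refresh(self, prefix: str) -> None:
        """Called whenever an update or reconfirmation arrives for a prefix."""
        self._last_refresh[prefix] = time.monotonic()

    def expire_stale(self) -> list[str]:
        """Remove and return prefixes that have not been refreshed recently."""
        now = time.monotonic()
        stale = [p for p, t in self._last_refresh.items() if now - t > HOLD_TIME]
        for prefix in stale:
            del self._last_refresh[prefix]
        return stale

table = SoftStateTable()
table.refresh("192.0.2.0/24")  # periodic updates keep the route alive
print(table.expire_stale())    # [] for as long as refreshes continue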
Soft state can be obviated by the use of static or "hard-coded" routes that are never invalidated. Static routes should therefore be used with extreme caution.
Some level of hysteresis or dampening is useful whenever there is the possibility of unbounded oscillation. These techniques are often used for processing routing updates in Interior Gateway Protocols (IGPs). If a route is withdrawn, a router may "hold down" that route for several minutes, even if the route is subsequently re-advertised by the IGP. This prevents an unstable route from rapidly oscillating between the used and unused states because the route can change its state only once per hold-down period.
Similarly, the exterior routing protocol, Border Gateway Protocol (BGP), applies dampening to external routes. This prevents the CPU saturation that can occur when new routes must be repeatedly calculated for large numbers of routes.
Stabilizing routes in this manner also can improve network throughput because the congestion control mechanisms of TCP do not favor environments with oscillating or rapidly changing values of round-trip time or throughput on the network.
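The penalty-and-decay idea behind route dampening can be sketched in a few lines of Python. The penalty increment, half-life, and suppress/reuse thresholds below are invented for illustration and are not the defaults of any vendor's BGP implementation.

import math

PENALTY_PER_FLAP = 1000   # illustrative values only
SUPPRESS_LIMIT = 2000
REUSE_LIMIT = 750
HALF_LIFE = 15 * 60       # seconds

class DampenedRoute:
    def __init__(self):
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now: float) -> None:
        # Exponentially decay the accumulated penalty since the last event.
        elapsed = now - self.last_update
        self.penalty *= math.exp(-math.log(2) * elapsed / HALF_LIFE)
        self.last_update = now

    def flap(self, now: float) -> None:
        """Record a withdrawal/re-advertisement and update suppression state."""
        self._decay(now)
        self.penalty += PENALTY_PER_FLAP
        if self.penalty >= SUPPRESS_LIMIT:
            self.suppressed = True

    def usable(self, now: float) -> bool:
        self._decay(now)
        if self.suppressed and self.penalty < REUSE_LIMIT:
            self.suppressed = False
        return not self.suppressed

route = DampenedRoute()
for t in (0, 60, 120):                     # three flaps in quick succession
    route.flap(t)
print(route.usable(130))                   # False: the route is suppressed
print(route.usable(130 + 3 * HALF_LIFE))   # True once the penalty has decayed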
Localization and caching represent a variation on the earlier technique of optimizing the common case. Even in today's peer-to-peer networking model, many extremely popular data repositories (such as major Web farms) still exist. By caching commonly accessed Web data (in other words, making a localized copy of this data) it is possible to save long-distance network traffic and improve performance. Such caches can form a natural part of the network hierarchy.
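A caricature of the caching idea is sketched below in Python, assuming an in-memory, least-recently-used store of Web objects; the URLs, object contents, and capacity are hypothetical.

from collections import OrderedDict

class LRUCache:
    """Keep local copies of the most recently requested objects."""

    def __init__(self, capacity: int = 100):
        self.capacity = capacity
        self.store: OrderedDict[str, bytes] = OrderedDict()
        self.hits = self.misses = 0

    def fetch(self, url, origin_fetch):
        if url in self.store:
            self.hits += 1
            self.store.move_to_end(url)        # mark as recently used
            return self.store[url]
        self.misses += 1
        body = origin_fetch(url)               # long-distance retrieval
        self.store[url] = body
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)     # evict the least recently used
        return body

cache = LRUCache(capacity=2)
fake_origin = lambda url: b"<html>...</html>"  # stands in for a remote Web farm
for url in ("/index.html", "/logo.gif", "/index.html"):
    cache.fetch(url, fake_origin)
print(cache.hits, cache.misses)                # 1 hit, 2 misses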
Finally, any network topology should be carefully analyzed under failure of the design's various components; such failures are usually known as failure modes. The topology should be engineered for graceful degradation. In particular, the failure of links constitutes the most common failure mode, followed by the failure of critical routing nodes.
Topologies
There are essentially four topological building blocks: rings, buses, stars, and meshes. (See Figure 4-1.) A large, well-designed network normally will exploit the benefits of each building blockeither alone or combinedat various points within its architecture.
Figure 4-1 Mesh, Star, Ring, and Bus Topologies (from top)
Although initially attractive due to minimal dependence on complex electronics, the use of bus media, such as repeated Ethernet segments, is decreasing. For the most part, this is due to an increase in the reliability and flexibility of the technology that is implementing rings, stars, and meshes. In particular, bus LAN topologies are typically converted into stars using a LAN switch. This offers increased aggregate bandwidth and superior diagnostic capabilities.
Operational experience has also shown that a passive shared broadcast medium does not necessarily create a more reliable environment because a single misbehaving Ethernet card can render a bus LAN useless for communication purposes.
Hierarchy Issues
Just as the Internet hierarchy had to adapt to accommodate an increase in players, so has the enterprise network. Early data networks facilitated the basic requirement of communication between terminals and a central mainframe. The need for a simple hierarchy, from terminal to IBM cluster controllers to front-end processors, was readily apparent from this basic requirement. In today's world of peer-to-peer networking, however, the reasons for network hierarchy and its inner workings are subtler, yet just as important for successful network design.
Figure 4-2 presents the high-level network architecture developed in Chapter 1. Notice that the network backbone (also called the network core; the terms core and backbone are equivalent) consists of mesh-connected backbone routers that reside at distribution centers (DCs) within each service region.
Each DC, which may house a LAN topology that is resilient to single node failure, forms the hub of a star distribution network for that region. Finally, the access network consists of both provider and customer premise equipment, which is typically homed to one or more access POPs.
Figure 4-2 Modularization of a Large Network
In the example in Figure 4-2, only three regions/DCs and, at most, three POPs in a region are shown. However, the hierarchical model will scale well beyond this using current commercial routers.
A typical large IP network, whether an ISP or a large corporate intranet, will consist of routers performing a number of different roles. It is convenient to define three major roles corresponding to each layer in the hierarchy: backbone, distribution, and access.
As shown in Figure 4-3, which illustrates the arrangement in a typical high-resilience DC, these roles possess a specific hierarchical relationship. Backbone routers are at the top, distribution routers are in the middle, and access routers are at the bottom.
Figure 4-3 Distribution Center Architecture
Backbone Routers
The backbone routers core1.sfo and core2.sfo reside in the San Francisco DC and are responsible for connecting the regional network to the backbone. These routers forward packets to and from the region. They also advertise reachability for that region, either to the core routers of other regions (in other major cities), or to external peer networks (other ISPs).
Backbone routers are also peers in terms of the useful reachability information they possess. This does not imply that router core1.sfo has the same detailed topological information about Los Angeles as, say, router core1.lax, but it does indicate that core1.sfo understands that core1.lax and core2.lax, rather than core1.stl, are the gateways to all destinations in the Los Angeles region.
Backbone routers contain reachability intelligence for all destinations within the network. They can distinguish between this internal gateway information and the information describing how to reach the outside world through other peer networks or the Internet.
Distribution Routers
Distribution routers consolidate connections from access routers. They are often arranged in a configuration that is resilient to failure of a single core router. Distribution routers usually contain topological information about their own region, but they forward packets to a backbone router for inter-region routing.
NOTE
In smaller regions, distribution and backbone routers may be one and the same. In larger regions, distribution routers themselves may form a hierarchy.
High-performance customers on permanent WAN links may often connect directly to distribution routers, whereas dial-on-demand customers typically do not, because this would impose the need to run dial-authentication software images on distribution routers.
Access Routers
Access routers connect the customer or enterprise site to the distribution network. In the ISP case, the router at the remote end of an access link is typically the customer premises equipment, and may be owned and operated by the customer.
For large enterprise networks, in which the LANs and WANs are managed by different divisions or contractors, the access router typically is managed by either the WAN or the LAN operator; usually it is the latter if the LAN is very large.
You now may wonder: Why is it important to distinguish between the backbone, access, and distribution routers? The reason is that they are increasingly becoming very distinct hardware/software combinations. In access routers, for example, you already have seen the need for support of dial-on-demand and authentication, as well as route filtering and packet filtering and classification.
In distribution routers, the emphasis is on economical aggregation of traffic and the support of varied WAN media types and protocols. In backbone routers, the emphasis is on supporting extremely high speeds, and aggregation of a very limited set of media types and routing protocols. These differences are summarized in Table 4-1.
Table 4-1 Characteristics of Backbone, Distribution, and Access Routers
Router Type          Characteristics
Backbone router      Scalable: packet forwarding, WAN links, QoS, routing; expensive; redundant WAN links; national infrastructure
Distribution router  Scalable: WAN aggregation, LAN speeds; redundant LAN links; less expensive
Access router        Scalable: WAN aggregation; cheap; complex routing/QoS policy setting, access security, and monitoring capabilities
This discussion has focused attention on the WAN environment and has avoided issues of LAN design, other than the use of specific LAN technology within the distribution or access networks. In particular, at the individual user or network host level, access technologies include ATM, FDDI, Token Ring, or the ubiquitous Ethernet, rather than such technologies as Frame Relay, T1, SMDS, and SONET.
Scaling LANs through the use of hierarchy is itself the subject of much literature. To study this area further, interested readers should refer to the references listed at the end of this chapter.
The origins of the three-tiered, backbone-distribution-access hierarchy can be traced to the evolution of the Internet (refer to Chapter 1). However, hierarchical design is certainly nothing new and has been used in telephone networks and other systems for many years. In the case of IP data networking, there are several reasons for adding hierarchy.
Not only does hierarchy allow the various elements of routing, QoS, accounting, and packet switching to scale, but it also presents the opportunity for operational segmentation of the network, simpler troubleshooting, less complicated individual router configurations, and a logical basis for distance-based packet accounting.
These issues are examined in great depth in Part II of this book, "Core and Distribution Networks." For the moment, we will examine the topologies used within the backbone, distribution, and access layers of the network architecture.
Backbone Core Network Design
In early data networking, the topology for the network backbone was relatively simple: Operations were centralized, so a star topology made the most senseand, in some cases, this was the only topology the technology would support. This did cause the center of the star to become a single point of failure, but because no real traffic flows existed between spokes on the star, this was not a major cause for concern. With the move toward multiple client-server and peer-to-peer relationships, the choice of core network topology is not as clear.
The purpose of the backbone is to connect regional distribution networks and, in some instances, to provide connectivity to other peer networks. A national infrastructure usually forms a significant part of the operational cost of the network. Given its position at the top of the network hierarchy, two requirements of the backbone topology are clear: it must be reliable and it must scale.
Making the Backbone Reliable
Reliability can be achieved in two ways. First, you can create more reliable routers through the use of "carrier-class" characteristics, such as multiple CPUs, power supplies, and generators; and even redundant routers. Ultimately, however, any backbone will include WAN links that rely on a great deal of equipment and environmental stability for their operation, which represents a real risk of failure. If the carrier's up-time guarantees are not sufficient, you have no choice but to design a backbone that is resilient to link failure.
The second option is simply to connect all distribution networks with a full mesh. Although a full mesh minimizes hop count within the network, the approach has several drawbacks:
First, given N regional distribution networks, you must have N(N-1)/2 backbone links in the core; the short calculation following this list illustrates how quickly this grows. This creates expense in WAN circuitry, as well as in router and WAN switch hardware (channelized or ATM technology can reduce these issues).
Moreover, PVC sizing requires that the traffic levels between any two distribution networks be well understood, or that the network has the capability to circumvent congestion. Although traffic engineering calculations and circumventing congestion are common in the telephone network, common IP networks and their associated routing protocols do not provide this capability as readily. One good reason is that the resources required by any TCP/IP session are not known a priori, and IP networks are traditionally engineered as best-effort. Chapter 14 explores how to go beyond best-effort by providing differentiated service in IP networks.
A full PVC mesh can also obviate one of the benefits of multiplexing, or trunking, in a best-effort network. Round-trip time and TCP window size permitting, any user can burst traffic up to the full line rate of the trunk. Furthermore, the routing complexity in a full mesh can consume bandwidth, computational, and operational management resources.
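The quadratic growth noted in the first point is easy to quantify. The following Python lines simply evaluate N(N-1)/2 for a few arbitrary region counts.

def full_mesh_links(regions: int) -> int:
    """Point-to-point links needed to fully mesh N regional networks."""
    return regions * (regions - 1) // 2

for n in (3, 5, 10, 20, 50):
    print(f"{n:3d} regions -> {full_mesh_links(n):5d} backbone links")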
Most backbone topologies are, therefore, initially designed based on financial constraints, user population density or application requirements, and WAN service availability. This initial design can be subsequently refined quite effectively by statistical analysis of traffic levels after the backbone is operational and the availability of new WAN technologies is known. Data network requirements analysis is a relatively new art. See [McCabe, 1998] for thorough coverage of this area.
Building the Backbone Topology
Because you have a basic need for resilience in the backbone, a good starting point for the backbone topology is a ring connecting all distribution networks. This ring could represent the minimum cost of WAN circuits, tempered by an initial estimate of major traffic flows and possibly by very particular delay requirements (although this is rare, with notable exceptions being high-performance networks).
Next, existing links can be fattened, or direct connections between backbone routers can be added as required or as is cost-effective. This incremental approach should be considered when selecting WAN technologies, routing nodes, and interface types.
Backbone routing protocols, such as IBGP properly coupled with OSPF, IS-IS, or Enhanced IGRP, can rapidly circumvent failures by simple link-costing mechanisms. However, the bandwidth allocations within the core topology should consider failure modes. What happens when the ring is broken due to WAN or node failure? Is the re-routed path sufficient to carry the additional traffic load? Although TCP performs extremely well in congested environments compared with other protocols, it is still possible to render the network useless for most practical applications.
Analysis of historical traffic levels, captured by SNMP, for example, provides for a relatively accurate estimation of the consolidated load on the remaining links during various failure modes.
Traditionally, the use of a ring topology made it difficult to estimate the traffic levels between individual distribution networks. SNMP statistics, for example, provided only input and output byte counts for WAN interfaces, making it difficult to determine the appropriate sizing for new direct links between distribution networks.
Typically, this had to be accomplished using a cumbersome approach, such as "sniffers" on WAN links, or through accounting capabilities within routers that scaled rather poorly. However, IP accounting facilities, such as Netflow, now provide a scalable way for network managers to collect and analyze traffic flows, based on source and destination addresses, as well as many other flow parameters. This significantly eases traffic engineering and accounting activities. It is now possible to permanently collect and archive flow data for network design or billing purposes.
NOTE
Netflow is a high-performance switching algorithm that collects comprehensive IP accounting information and exports it to a collection agent.
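A hedged sketch of the kind of post-processing that flow export enables follows: it aggregates flow records by region pair to estimate how much traffic a proposed direct link between two distribution networks would carry. The record format, byte counts, and region names are invented for illustration and do not reflect the actual Netflow export format.

from collections import defaultdict

# Hypothetical flow records: (source_region, destination_region, bytes).
# In practice these would be derived from exported flow data plus a mapping
# of IP prefixes to regions.
flows = [
    ("sfo", "lax", 4_200_000),
    ("sfo", "stl", 900_000),
    ("lax", "sfo", 3_800_000),
    ("lax", "stl", 650_000),
]

traffic = defaultdict(int)
for src, dst, octets in flows:
    key = tuple(sorted((src, dst)))    # treat each region pair symmetrically
    traffic[key] += octets

for (a, b), octets in sorted(traffic.items(), key=lambda kv: -kv[1]):
    print(f"{a}-{b}: {octets / 1e6:5.1f} MB observed between the regions")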
Load sharing is possible on the backbone network. With Cisco routers, this can be either on a per-packet or a per-flow basis. The latter usually is recommended because it avoids possible packet re-ordering, is efficiently implemented, and avoids the potential for widely varying round-trip times, which interfere with the operation of TCP. This is not a problem for per-packet load sharing over parallel WAN circuits, but it can be a problem when each alternate path is one or more routed hops.
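The difference between per-packet and per-flow load sharing comes down to how the outgoing path is chosen. The Python sketch below hashes a flow's addresses and ports so that every packet of a flow takes the same path; it is a conceptual illustration only, not Cisco's actual per-flow switching algorithm, and the path names are hypothetical.

import zlib

PATHS = ["via core1.lax", "via core2.lax"]    # hypothetical parallel paths

def pick_path(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> str:
    """Hash the flow identifier so that a given flow always maps to one path."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return PATHS[zlib.crc32(key) % len(PATHS)]

# Every packet of this flow hashes to the same path, so packet ordering and
# round-trip time remain consistent within the flow.
print(pick_path("192.0.2.10", "198.51.100.7", 51234, 80))
print(pick_path("192.0.2.10", "198.51.100.7", 51234, 80))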
It is possible to connect regional networks directly, avoiding the backbone altogether and possibly providing more optimal routing. For example, in Figure 4-2, the DCs in SFO and LAX could be connected by a direct link. Traffic between the SFO and LAX regional networks could then travel over this link rather than over the backbone.
However, this exercise should be viewed as the effective consolidation of two regional distribution networks, and the overall routing architecture for the newly combined regions should be re-engineered to reflect this.
On an operational note, the backbone network and routers may be under a different operational management team than the regional networks. One historical example is the arrangement between the NSFNET backbone and the regional networks described in Chapter 1. Today, many smaller ISPs use the NSPs for WAN connectivity.
In this situation, the routing relationship between the backbone and the distribution networks is likely to be slightly different because an Exterior Gateway Protocol such as BGP will be used. In this book, the operators of the backbone and regional networks are generally considered to be the same, which makes it possible for the two to share a hierarchical IGP. In Chapter 16, "Design and Configuration Case Studies," you will examine a case study for scaling very large enterprise networks in which this is not the case.
Distribution/Regional Network Design
The role of the regional network is to route intra- and inter-regional traffic. The regional network generally comprises a DC as the hub and a number of access POPs as the spokes. Usually, two redundant routers in each regional network will connect to the backbone.
DCs may also provide services such as Web-caching, DNS, network management, and e-mail hosting. In some cases, the latter functionality may be extended into major POPs.
Placement of DCs is generally an economic choice based on geographical proximity to a number of access sites. This does not mean that an access POP cannot act as a mini-distribution center or as transit for another access POP, but this is the exception rather than the rule.
When an access POP site provides such transit, and when that transit is the responsibility of the service provider, it should be considered part of the distribution network functionality.
Although the DC may be the center of a star topology from a network or IP perspective, this does not limit the choice of data-link or WAN connectivity to point-to-point links. Frame Relay or other cloud technologies can be (and often are) used to provide the connectivity from the customers, or from other distribution and access sites, to the DC. Even within the DC, a provider may utilize Layer 2 aggregation equipment, such as a Frame Relay or ATM switch, or even an add/drop multiplexor.
A major DC typically consists of many routers, carrying either intra-regional or backbone-transit traffic. As more customers receive service from the DC, the stakes become higher. Therefore, the backbone and intra-distribution network infrastructure must become more reliable.
A common option at major DCs is to provide dual aggregation LANs, dual backbone routers, and dual backbone WAN connections, as shown in Figure 4-3. This approach also can provide an element of load sharing between backbone routers. Of course, a single aggregation LAN and single backbone router will also serve this purpose. It is important to weigh the cost-versus-reliability issues, and bear in mind that most simple MTBF calculations consider hardware, but often ignore both software bugs and human error.
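The cost-versus-reliability trade-off can be framed with a simple availability calculation. The Python sketch below compares a single backbone router with a redundant pair, assuming invented MTBF and MTTR figures and independent failures; as noted above, such calculations ignore software bugs and human error.

def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Assumed figures for illustration only.
single = availability(mtbf_hours=20_000, mttr_hours=4)
redundant_pair = 1 - (1 - single) ** 2      # service fails only if both fail

for label, a in (("single router", single), ("redundant pair", redundant_pair)):
    downtime_min = (1 - a) * 365 * 24 * 60
    print(f"{label:15s}: availability {a:.6f}, ~{downtime_min:.1f} min downtime/year")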
FDDI rings are a logical choice for the aggregation LAN because of their inherent fail-over mechanisms. However, with the development of low-cost/high-reliability LAN switches based on FDDI, Ethernet, or ATM technology (not to mention the ever-increasing intra-DC traffic levels), it is not uncommon to implement the dual aggregation LANs using switched media. IP routing circumvents LAN failure at either the single line card or the single switch level, as discussed in upcoming chapters.
Of course, many other critical reliability issues have not yet been considered. These include facilities, such as power supply and the choice of router and switching equipment.
NOTE
The distribution network is hierarchical. Router dist3 is located at an access POP, which services fewer customers, and therefore is not deployed in a resilient design.
The backbone/distribution/access hierarchy can be bypassed to achieve lower delays at the expense of reliability. Customer 4 may connect directly to router core2.sfo. However, if core2.sfo fails (albeit a rare event), customer 4 is effectively cut off from the network. Alternatively, customer 4 may have a backup connection via dist3.sfo.
This arrangement is satisfactory, provided that it does not confuse the role of each router. For example, directly connecting customer routers to the core router means that the core router may have to perform dial-up authentication, packet and route filtering, and packet classification. Not only will this occupy precious switching cycles on the core router, but it also could mean running a larger and possibly less reliable software image.
Other possible failure modes include the following:
Core1: All intra-network traffic is routed through core2. All traffic to other ISPs is also routed through core2, presumably to another NAP connected to a backbone router elsewhere in the network.
Ds1: Traffic destined for a remote distribution network is switched through ds2, as is traffic destined for other locations in the local distribution network.
Dist1: Customer 2 is re-routed through Dist2.
Dist3: Customer 3 is cut off.
It is worth noting that any resilience at Layer 3 results in routing complexity. This is examined in detail in Part II. As a matter of policy, the network service provider may choose not to allow customers to connect to core routers or even to dual distribution routers.
However, in the enterprise environment, reliability affects user satisfaction. In the commercial environment, this may affect their choice of provider. Policy that simplifies engineering must be carefully balanced against customer requirements.
Policy also must be balanced against the risk of human error. A resilient routing environment might be more reliable in theory, but in practice it might have a greater risk of human configuration error, and possibly algorithmic or vendor implementation flaws.
Access Design
In most cases, an access router serves a large number of customers. With modern access technology, this number can reach the thousands. As a result, resilient connectivity to the distribution routers is recommended. This may be accomplished using a self-healing LAN technology, such as FDDI. Alternatively, as with the connectivity between distribution and backbone routers, this may involve the use of redundant LAN switches. If the access router is the only node in a small POP, redundant WAN connections to the nearest DC are an option.
The design of the access topology is generally a choice of WAN technology between the CPE and the access router. For redundancy or load-sharing purposes, two or more links may be homed into the same access router or possibly onto different access routers. This is an issue of provider policy and capabilities.
Although the topology of the access network is relatively simple, it is here that the "policing" of customer connections, in terms of traffic rates and accounting, QoS, and routing policy, occurs. The configuration and maintenance must be executed carefully. The consequences of a router misconfiguration can be severe.
Summary
Network design involves the art and science of meeting requirements while dealing with economic, technological, physical, and political constraints. Scalability and extensibility are the hallmarks of a successful large-scale network design, and are encouraged through layering, modularization, and hierarchy. Randomization, soft state, dampening, separation of the control plane, regionalization, and optimizing the common case are also important considerations for routing protocols and the overall routing topology.
Although requirement analysis is an important aspect of design, it should be viewed as an ongoing task and should be ratified by the collection of traffic statistics that describe actual network usage.
By categorizing routers into the roles of backbone, distribution, and access, you will simplify the hardware/software combinations and configuration complexity required for any particular router. This consequently simplifies the operational support of the network.
Within the various tiers of the hierarchy, the topologies of ring, star, bus, and mesh may be employed. The choice depends on reliability, traffic, and delay requirements. In the case of WAN topologies, carrier service pricing also could be a determining factor.
Review Questions
1. If you need to support protocols other than IP in a large network, what would you do?
2. When would you consider breaking the hierarchy of a network design by linking distribution networks directly?
3. ATM is an ideal technology to grow a ring backbone to a partial mesh, and then to a full mesh. Does this make it a better choice for a backbone technology than point-to-point links? Why or why not?
4. Could you use different routers in your access, distribution, and core networks?
For Further Reading . . .
The available literature on network design (other than an abstract mathematical treatment) is surprisingly small. If you have well-known requirements, McCabe's book is unique in its treatment of network design through requirements and flow analysis.
Bennett, G. Designing TCP/IP Internetworks. New York, NY: John Wiley & Sons, 1997.
Galvin, P. B. and A. Silberschatz. Operating System Concepts. Reading, MA: Addison-Wesley, 1997.
Keshav, S. An Engineering Approach to Computer Networking. Reading, MA: Addison-Wesley, 1997.
McCabe, J. Practical Computer Network Analysis and Design. San Francisco, CA: Morgan Kaufmann Publishers, 1998.
Pressman, R. Software Engineering: A Practitioner's Approach, Fourth Edition. New York, NY: McGraw-Hill, 1996.
