Shadow Configuration As a Network Management Primitive
Total Page:16
File Type:pdf, Size:1020Kb
Shadow Configuration as a Network Management Primitive Richard Alimi, Ye Wang, Y. Richard Yang Laboratory of Networked Systems, Yale University New Haven, CT, USA ABSTRACT For example, Cisco has built the NSite [11] facility to test network Configurations for today’s IP networks are becoming increasingly configurations before actual deployment. However, for most com- complex. As a result, configuration management is becoming a panies, the cost of maintaining a testbed sufficiently similar to the major cost factor for network providers and configuration errors operating network is prohibitive. are becoming a major cause of network disruptions. In this paper, Given the limitations of these existing approaches, configuration we present and evaluate the novel idea of shadow configurations. modifications are frequently deployed into the operating networks Shadow configurations allow configuration evaluation before de- without realistic testing. As a contrast, software developers depend ployment and thus can reduce potential network disruptions. We mostly on debuggers and actually running their programs before demonstrate using real implementation that shadow configurations deployment. They run unit and regression tests for correctness and can be implemented with low overhead. conduct stress tests to validate the programs under load. It would be difficult to imagine the extent of software errors if programs were Categories and Subject Descriptors: C.2.1 [Computer Commu- deployed after only passing through analysis or simulation tools nication Networks]: Network Architecture and Design – Network without actually running on the target platform. However, there is communications; C.2.3 [Computer Communication Networks]: Net- no such capability for IP network configuration [38]. work Architecture and Design – Network Operations In this paper, we propose such a novel capability called shadow General Terms: Algorithms, Design, Management. configurations. With shadow configurations, a network operator may specify two configuration files for a router: one real (current) Keywords: Network Management, Network Diagnostics and one shadow (alternate). The shadow configuration files on a set of routers form a shadow configuration that the network opera- tor intends to replace the current configuration files. The operator 1. INTRODUCTION can test the shadow configuration files on the actual network with- Modern IP networks are becoming increasingly complex to con- out enabling them as the network’s real configuration. Running on figure, as these networks continue to evolve to offer multiple ser- the existing network infrastructure, this capability is low cost, and vices (e.g., both routing and access control), integrate equipment thus may be utilized in daily operations. During the testing process, from multiple vendors, and conduct continuous performance and the current network configuration is still running and forwards real feature tuning. As a result, it is difficult to generate and maintain traffic; the shadow configuration carries only testing traffic and will the configuration even for a moderately-sized network. A recent not cause disruptions to the operation of the current configuration, survey [40] found that configuration errors are a large portion of even if there are errors in the shadow configuration. The operator operator errors, which are in turn the largest contributor to failures conducts correctness and performance tests on the shadow config- and repair time. Another survey [29] found that more than 60% of uration. Our usage of the term “shadow” is motivated by computer network downtime is due to human configuration errors. It further graphics, where instead of directly modifying the current display showed that more than 80% of IT budgets are allocated towards buffer, the display system often uses a shadow buffer to compute maintaining the status quo, a percentage that will only increase due the next frame to be displayed. to “increased complexity, lower budgets, and continued business In particular, by running a set of configuration files directly on demand.” the actual network to which they will be applied, a shadow con- One way to reduce configuration errors is to use configuration figuration allows a network operator to evaluate the integrated ef- generation tools (e.g., [2]) and/or validate the configuration files us- fects of alternate configuration files, router software implementa- ing static analysis or simulation (e.g., [15,17,22,37,53]). Although tion (including protocol mis-implementations!), the physical net- these tools can be quite useful, for example, it has been noted that work status, and dynamic information such as imported external the configuration analysis tool NetDB provides AT&T significant route advertisements. Many integrated effects on routing are natu- cost savings [47], these tools are inherently limited in the problems rally summarized by the forwarding information base (FIB) at each that they can detect. In particular, since configuration files alone router. We take advantage of the compact FIB representation and do not determine the behaviors of a network, analyzing only the develop techniques to analyze the FIBs for configuration validation configuration files based on an abstract model of the network and and adjustment. equipment behaviors may leave many problems undetected. Further exploring its benefits, we show how shadow configura- Recognizing the limitations of static analysis and simulation tools, tion allows a network operator to evaluate, before actual deploy- some network operators and equipment vendors build test networks. ment in the real network, whether a set of configuration changes will have the desired effect on network performance. Such realis- tic performance evaluation reduces the dependency on unrealistic models or assumptions of router processing or the network. Also, Permission to make digital or hard copies of all or part of this work for the availability of the on-going real traffic in the actual network personal or classroom use is granted without fee provided that copies are allows the operator to duplicate a controlled portion of it as test- not made or distributed for profit or commercial advantage and that copies ing traffic in addition to generated testing traffic. This reduces the bear this notice and the full citation on the first page. To copy otherwise, to need to generate realistic testing traffic patterns. One potential is- republish, to post on servers or to redistribute to lists, requires prior specific sue of conducting testing on the shadow configuration is that if we permission and/or a fee. naively send both shadow and real traffic, the combined traffic may SIGCOMM’08, August 17–22, 2008, Seattle, Washington, USA. Copyright 2008 ACM 978-1-60558-175-0/08/08 ...$5.00. overload some network links. Thus, we develop a novel technique, 111 Configuration Management Delta-Debugging referred to as packet cancellation, to allow both real and shadow Configuration User Interface FIB analysis Shadow Traffic Control traffic to be forwarded in parallel without overloading the network. After the operator is satisfied with the new configuration, she can Run-time Shadow Management simply quickly and smoothly swap the real and shadow configura- OSPF Shadow Management OSPF tions with minimal network disruptions. We develop a commitment BGP BGP capability for shadow configurations to reduce the effects of churn IS-IS Commitment IS-IS and convergence. This usage pattern can be viewed as “two-phase commitment” for network configurations. Forwarding Engine Shadow-enabled FIB To demonstrate feasibility, we extend the Linux kernel and im- Shadow Bandwidth Control plement necessary components to support shadow configurations in NIC0 NIC1 NIC2 both Quagga [41] and XORP [23] software routers. We show that shadow configurations can be implemented efficiently, with only Figure 1: System architecture for network management with 12 additional lines of code on the kernel’s forwarding fast path for shadow configurations. packets not using packet cancellation, and no code changes to rout- the effects, and makes adjustments before switching to the new con- ing processes. The FIB memory increase to support both real and figuration. This step can be particularly useful as multiple surveyed shadow configurations is less than 35% for the worst-case router network operators commented that it is common for issues to arise under a variety of shadow configurations for a large US tier-1 ISP; after a maintenance upgrade. the average is much smaller, less than 7%. At run time, our shadow- enabled forwarding engine under heavy traffic has no more than Configuration Parameter Tuning: Many network operators need 1.2% CPU usage overhead with a shadow configuration installed. to tune configuration parameters to address performance or secu- We also demonstrate the usage of shadow configurations. We rity issues. For example, a network operator may conduct traffic show in real implementation that the commitment ability avoids engineering to improve network performance, and many trafficen- the transient routing convergence period under router maintenance, gineering techniques (e.g., [18, 19, 39, 42]) require the modifica- shutdown and OSPF weight changes. We demonstrate our packet tion of configurable parameters (e.g., OSPF weights or egress point cancellation technique in a usage scenario where the operator tests selections). However, such parameter adjustments may cause dis- the impact of a new configuration on a streaming video application. ruptions due to human error or routing reconvergence.