Issue 1

Ensure your Network Vendors Provide Agile Innovation

TCPWave = Agility + Innovation
Research From Gartner: Look Beyond Network Vendors for Network Innovation

Incumbent Network Vendors are challenged to provide agile innovation and automation

TCPWave = Agility + Innovation

Summary
IT networking is changing rapidly. Enterprises continue to rely on incumbent network vendors to provide education and offer innovative features. The features must be delivered ever more rapidly with near-zero downtime. Despite vast investments in the latest equipment from traditional network vendors, enterprises have not been able to keep pace with business demands or transform their data centers from fragile to agile.

Overview
Incumbent network vendors are limited by their legacy designs and their use of outdated software tools. Innovative open source tools are daunting to incumbent vendors because they do not comprehend them and a redesign is costly. Many incumbent vendors prefer to keep their solutions proprietary because they have limited control over open source software and because proprietary solutions lock in their customers. Hence the incumbent vendors are failing to take advantage of the latest innovations and advancements that are available today. Instead of taking advantage of better technology offered by a vast community, vendors are limited to their own developers working with their legacy technology.

For example, an enterprise cannot rely on an incumbent DDI vendor to figure out how to integrate with all the various cloud providers using the latest technology that exists today. For that, it must choose one of the newer, more nimble vendors that are willing and able to work with the open source community.

Gartner has investigated this gap between the promises made by incumbent network vendors and the network innovation required to transform businesses.

Using DDI as an example
The core architecture of an enterprise’s network consists of DNS, DHCP and IP Address Management (DDI). If DNS and/or DHCP go down, communication as we know it would not be possible: email, VoIP, internet access and more all cease to operate. DDI is the heart of the network and is always expected to operate without failure.

Why Enterprise needs to change how to budget for DDI in this constantly evolving digital world
Because DDI is core infrastructure, it is taken for granted, and upgrades are budgeted on an as-needed basis. DNS gets attention from the security team after there is a DNS breach or, worse yet, when a DDoS attack hits the customer. In these cases, the security team acquires budget to upgrade DNS features that can protect the enterprise from that attack.

Enterprises must also exercise caution in understanding the end-of-life policies of DDI vendors. The return on investment (ROI) must be carefully factored in. In addition to ROI, the mission-critical DDI infrastructure must be engineered so that the DDI devices have a maximum shelf life.

This is considered standard practice; however, standard practice does not work for emerging technologies that improve efficiency and agility while maintaining uptime.

Outside of virtual computing, the DDI team is the last to be involved with corporate strategy, simply because DDI is just expected to be always available and to work.

Corporations are adopting cloud, virtual computing and automation at the CTO level. The first mistake is that the DDI team does not know the plans until one of the new architecture teams sends an email asking why portions of DDI are not working with this new technology.

When the incumbent vendor is brought in to solve the problem, it is too late, and the solutions are limited to that vendor’s proprietary offering.

Using cloud computing as an example, the incumbent vendor will present how they support cloud computing, but in reality they have only a workaround to run their product as a virtual machine in someone’s cloud space.

Cloud architecture is a set of processes made up of more than a dozen components that control the most important aspects of a cloud. Together, they form the cloud stack. Open source or proprietary, the cloud stack defines and implements a high-efficiency, menu-driven product with a self-service component.

An incumbent DDI vendor who provides a virtual machine of their product that can be run in a cloud with an HTTPS GUI connection can pass for cloud enabled, but it is far from cloud compliant.

Definition of a cloud compliant DDI product
Compliant cloud DDI must fully embrace cloud technology. This is difficult for an incumbent vendor to do because they are limited by legacy design and database, outdated third-party tools and a failure to adopt state-of-the-art open source tools. The incumbent vendor commonly fears the lack of control without proprietary components.

Compliance starts by architecting the core around the state-of-the-art interfaces, such as REST, that are needed to communicate with the cloud stack. Incumbent vendors provide command line interfaces (CLI) first and SOAP/XML as an add-on. REST will be squeezed into a product to allow the vendor to stay in the game; however, the internal processes remain sub-optimized around CLI. An innovative vendor will design the product from scratch using REST and state-of-the-art Java, with CLI as an alternative until customers can get their APIs ported to REST.

An incumbent vendor will typically support Perl and shell scripts, as they have for the last 15+ years.

TCPWave DDI
A cornerstone of the recommended network transformation is to automate network changes. As a more recent entrant to the DDI market, TCPWave is well suited to help the enterprise achieve these objectives. TCPWave includes automation capabilities at the core of the architecture.

Focusing on automation does not need to sacrifice information security. Attempting to accomplish network automation for the core network services by leveraging the programming capabilities of the cloud management consoles must not cause a breach of the information security policies of an enterprise. DDI vendors must offer the ability to ‘pull’ from the cloud rather than allowing the cloud providers to ‘push’ to the DDI management using passwords in open text. The core functionality of TCPWave is to provide a secure DNS/DHCP solution.

• TCPWave has been built with the latest technology advancements including REST, Cloud, Integration, and Uptime.

• Everything in TCPWave is based on REST, including the GUI. If an operation can be performed in the TCPWave GUI, it can be performed via REST. The REST interface is 100% documented, including examples.

• Swagger can be enabled via the GUI to quickly prototype REST calls. By default, Swagger is turned off to minimize image size and improve security.

• TCPWave is “Born in the Cloud, Made for the Cloud.” TCPWave provides the following Cloud integrations out of the box: AWS Route 53, Microsoft Azure, Oracle/DYN, Verisign, Akamai, OpenStack. Communicate with your Clouds of choice in minutes using incorporated templates and a single pane of glass.

• Integrations include VMware, ServiceNow, Terraform, and others. TCPWave can quickly integrate with 3rd party tools such as Ansible and Puppet for enabling automation and reducing manual operations.

• To facilitate tracking and auditing changes, TCPWave allows create, undo and audit using a change ticket number from any service manager.

• TCPWave allows root access to appliances. For security, the root password can be protected with CyberArk or HashiCorp Vault.

• To improve agility, TCPWave is offered in software appliances, cloud-based appliances, hardware appliances, or any combination thereof.

• The TCPWave database is completely open and documented. The database schema is provided along with a customer read-only database user ID for SQL access.

• N>2 “All Active” High Availability IPAM servers and database.

Source: TCPWave
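The statements above, that anything done in the GUI can also be done via REST and that changes can be tied to a change ticket number, can be pictured with a small Python sketch. It only builds an HTTP request and never sends it. The base URL, the `/records` path, the JSON field names and the `X-Change-Ticket` header are hypothetical placeholders chosen for illustration, not TCPWave's documented API; real paths and fields would come from the vendor's REST documentation. The token is read from an environment variable (an assumed name) rather than written in plain text.

```python
import json
import os
import urllib.request


def build_add_record_request(base_url, token, fqdn, ipv4, ticket):
    """Build (but do not send) a POST request that would add a DNS A record.

    Every path, header and field name here is a hypothetical placeholder.
    """
    payload = {"name": fqdn, "type": "A", "value": ipv4}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/records",  # hypothetical endpoint
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
            "X-Change-Ticket": ticket,  # hypothetical audit header
        },
    )


if __name__ == "__main__":
    # DDI_API_TOKEN is an assumed variable name; a secret store could supply it instead.
    token = os.environ.get("DDI_API_TOKEN", "example-token")
    req = build_add_record_request(
        "https://ddi.example.com/api", token,
        "app01.example.com", "192.0.2.10", "CHG0012345",
    )
    print(req.get_method(), req.full_url)
    print(json.loads(req.data))
```

Keeping request construction separate from sending makes the call easy to review and test before it touches production DNS.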

Research From Gartner: Look Beyond Network Vendors for Network Innovation

To support digital business, infrastructure and operations leaders responsible for networking must transform their data center networks from fragile to agile. This demands innovation, which is being driven by large network operators, such as cloud providers, rather than established network vendors.

Key Challenges
• Enterprise networking teams rely heavily on established network vendors for guidance regarding network operations.

• Established networking vendors present themselves as trusted advisors to their enterprise clients; however, they have not guided customers toward dramatic operational improvements, particularly in the data center.

• Networking vendors often market their products to the enterprise as “highly innovative,” but the innovations have not delivered dramatic improvements in network operations.

• Cultural issues in enterprise network teams, including risk avoidance and rigid change management practices, have led to manual processes and the use of command line interface as the primary configuration tool. This makes it difficult for enterprises to operate at the speed that digital business demands.

Recommendations
I&O leaders responsible for delivering agile data center networks should:

• Emulate relevant principles of hyperscale operators by implementing web-scale practices in their data center network to improve network agility and efficiency.

• Look beyond established networking vendors to deliver operational innovation, by changing key performance indicators and shifting spending from premium products to premium people.

• Reprioritize data center networking investments by focusing primarily on manageability, automation and broader orchestration capabilities, instead of hardware, vendors and speeds/features.

Strategic Planning Assumption
Through 2020, enterprises that embrace web-scale networking principles will improve networking efficiency to support network device-to-admin ratios higher than 200:1, which is an increase from approximately 100:1 today.

Introduction
As Gartner clients transition to digital business, enterprise network teams must deliver data center network infrastructure rapidly and on-demand. This is difficult, and enterprises cite agility as the biggest challenge in their data center networks.[1,2,8] Improving network agility requires innovation regarding network operations. However, although network vendors market themselves as innovators and trusted advisors, they are not driving the needed operational innovation into the market.

Innovation in network operations is being created and driven into the market by hyperscale web providers, not network vendors.

There was a lot of hype in the 2012 to 2013 time frame that networking vendors would transform network operations via software-defined networking (SDN; see “State of SDN: If You Think SDN Is the Answer, You’re Asking the Wrong Question”). However, these dreams remain unrealized. For example:

• It is common for data center network requests to take days to fulfill.[1,8]

• At least 70% of Gartner clients still use manual command line interface (CLI)-based changes as the primary mechanism for network configurations.[1,2,8]

• The number of active ports supported per LAN full-time equivalent (FTE) has worsened by more than 10%, from 3,412 ports/FTE in 2013 to only 2,993 ports/FTE in 2016 (source = ITKM, LAN, Multiyear).
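The ports-per-FTE figures quoted above can be checked with simple arithmetic. The short Python sketch below uses only the two values given in the text; the function name is illustrative.

```python
def percent_decline(before, after):
    """Return the relative decline from `before` to `after` as a percentage."""
    return (before - after) / before * 100


# Active ports per LAN FTE, as quoted in the text (2013 vs. 2016).
decline = percent_decline(3412, 2993)
print(f"{decline:.1f}% fewer ports per FTE")  # about 12.3%, consistent with "more than 10%"
```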

FIGURE 1 Primary Method for Making Network Changes

Source: Gartner (January 2018)

The discrepancy is not a lack of network features, because vendors offer the necessary features that enterprises require. Instead, the issue is driving innovation into the market to improve network operations. This issue is most pronounced in the data center, but also applies in wired campus switching.

Trusted Advisors?
Established network vendors often position and market themselves as “trusted advisors” to their customers with deep domain knowledge, and their customers rely heavily on them. As a result, more than one-third of Gartner clients base network product selection and/or operational tooling decisions on what their incumbents suggest.[1,2,8] Network vendors could have led their clients via training and certifications that focused on improving agility with automation and open tooling, such as Python, Ansible or Puppet. They also could have de-emphasized manual CLI-based configurations and vendor-specific management capabilities, but they did neither.

Vendor Sales Trump Innovation
Established network vendors have an inherent bias toward sales, which are often not aligned with maximizing network operations innovation.

Network products that enable dramatic operational improvements often require enterprises to make substantial changes to their process and operations. This is challenging for vendors, because it ultimately lowers the barrier for customers to entertain competitive alternatives: clients considering large changes are often more open to switching suppliers. Thus, incumbent vendors have a stake in the status quo, which allows high barriers to entry to remain in place. Consequently, instead of driving their enterprise customers toward open, vendor-agnostic tools, established networking vendors delivered network management tools based on proprietary software and specialized to operate with their networking products.[1,8] We believe this is because abstracted, automated, open and multivendor tooling promotes and enables vendor independence, which conflicts with vendor goals of account control and product stickiness.

Thus, we’ve witnessed most mainstream network vendors continue to deliver products that offered incremental improvements, such as improved speeds/feeds, that largely preserve the network operations status quo. As a result, the most popular (and effective) network management tools are offered by heterogeneous network and IT operations vendors,[1,8] rather than network vendors.

Furthermore, network vendors overhype even minor incremental feature enhancements as “innovative,” to create the perception that they’re delivering innovation into the market.

“Over the past 25 years, ‘innovating’ for legacy vendor switching systems and routing systems has become a euphemism for ‘adding more complexity.’” (Peyton Maynard-Koran, Future:Net Conference, 2017)

Enterprise Responsibility
Although networking vendors bear a large responsibility for the lack of network operation innovation, it is not 100% their fault. Enterprise IT and networking teams carry a degree of responsibility, for several reasons, including:

• A strong culture of risk avoidance has led to a desire for incremental changes that are perceived to be safe (see “Avoid These ‘Bottom 10’ Networking Worst Practices”).

• Network team key performance indicators (KPIs) primarily reward stability,[1,2,8] with little incentive for operational agility.

• Many network technicians are more loyal to their network vendor than to their employers.[1,5,8]

The culmination of these factors is a desire to limit change. Thus, enterprise networking teams have been unwitting co-conspirators with the vendors to slow operational innovation.

NetOps Innovation via Hyperscalers
Hyperscale cloud providers (e.g., Facebook, Google, and Amazon) and others have massive-scale data center networks and provide self-service offerings. In essence, the hyperscale web providers were forced to fill the operational innovation void that established networking vendors created. This required them to fundamentally re-examine how to operate networks, which forged out-of-the-box thinking and has delivered a radical contrast to traditional enterprise network operations. The results are staggering, including:

• We estimate that hyperscalers maintain network device:admin ratios of 1,000:1 and higher,[1,3,6,7] which is roughly 10 times better than mainstream enterprises, which typically maintain device:admin ratios in the 100:1 range.[1,5,8]

• Hyperscalers make tens of thousands of network changes or more a day, which is 100 to 1,000 times more than enterprises.[1,3,5,6,7,8]

The hyperscalers achieved this via several networking paradigm shifts. First, they looked to manage risk to an appropriate level, rather than avoid risk at all costs. Hyperscale operators do not aim for networks that “never fail.” They aim to fail fast, fail small and fail in control. To achieve agility and resiliency, hyperscalers choose more-frequent low-impact outages, rather than infrequent high-impact outages (see Note 1).

The hyperscalers automate relentlessly and live by the mantra of “automation by default,” which includes network configuration, troubleshooting, reporting and visibility. Thus, manual network changes are a rare exception. To this end, they prefer staff that is strong in automation and can integrate systems via API. They build their networks in pods, based on standardized components, and scale out (not up), which simplifies troubleshooting and break/fix activities. This further allows them to introduce new and disruptive technologies, which is a key part of enabling the introduction of innovation into their networks. They also prefer slimmer fit-for-purpose software platforms, and disaggregate software from hardware, so that they can innovate more rapidly and reduce software bloat (see Note 2).

In addition to operating differently, another key aspect is that these hyperscalers have been active and vocal in driving this innovation into the broader market. For example:

• Hyperscalers have publicly described the limitations of traditional vendor networking products and/or their innovative approach to network operations.[3]

• Facebook drove the creation of the Open Compute Project (OCP), an industry group promoting open hardware and software designs (see “Hype Cycle for Enterprise Networking and Communications, 2017”).

• Facebook and Microsoft contributed switching hardware designs and/or software components to the open-source community.[3]
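To make the device:admin ratios cited in this research concrete, the Python sketch below estimates how many administrators a network would need at each ratio. The 2,000-device network is an invented example size; the ratios are the ones quoted in the text (about 100:1 for mainstream enterprises, 200:1 in the planning assumption, 1,000:1 for hyperscalers).

```python
import math


def admins_needed(devices, devices_per_admin):
    """Return the whole number of admins needed to cover `devices`."""
    return math.ceil(devices / devices_per_admin)


devices = 2000  # hypothetical network size, for illustration only
ratios = [
    ("mainstream enterprise", 100),
    ("web-scale target", 200),
    ("hyperscaler", 1000),
]
for label, ratio in ratios:
    print(f"{label:>22}: {admins_needed(devices, ratio)} admins at {ratio}:1")
```

Rounding up reflects that a partial headcount still requires a whole person.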

NetOps Innovation via Open Source
Another source of operational network innovation is the open-source community. During the past seven years, multiple open-source automation tools (e.g., CFEngine, Saltstack, Puppet, Chef and Ansible) have emerged. During the past two years, Ansible has become one of the fastest-growing network automation tools.[1,5,6,7] This shows that innovation is coming from elsewhere in the market, because these open-source projects were not sourced from networking vendors (see Note 3).

Analysis

Implement Web-Scale Networking Practices in Your Data Center Network
It is not feasible to mimic all the network operations practices of hyperscale web providers. They have massive scale, resources, ability to attract talent and economies of scale that most enterprises can’t attain. However, enterprises don’t need to mimic all practices. In fact, if your enterprise can mimic just 1% to 10% of hyperscale practices, which is low-hanging fruit, you can still achieve dramatic improvements in efficiency and agility. We refer to this method of emulating hyperscale operator practices as “web-scale IT.”

Web-scale IT is a pattern of global-class computing that delivers the capabilities of large cloud services providers in an enterprise IT setting.

This topic is covered in depth in “Bring Web-Scale Networking Concepts to Your Data Center.” Major principles that should be applied to enterprise data center networks include:

• Foster a culture that manages risks appropriately, rather than one that avoids them at all costs.

• Segment the network into logical building blocks to reduce risk and enable innovation.

• Automate relentlessly and standardize ruthlessly.

• Cross-train network teams with other groups, including server, cloud and even application personnel to enable cross-domain innovation.

The bottom line is that we anticipate that enterprises following web-scale networking principles will improve their device:admin ratios by more than 50%, and deliver services to the business at least 50% faster. Furthermore, we believe organizations that automate 70% of their network configuration changes will reduce the number of unplanned network outages by more than half, compared with those that automate less than 30%.

Look Beyond Established Networking Vendors to Deliver Innovation
Unfortunately, we do not anticipate that established network vendors will make major changes in the near future, such as leading customers toward open automation or multivendor management tools, which create dramatic operational improvements.

Organizations must take it on themselves (versus relying on networking vendors) to create innovation around network operations.

Change How You Measure and Reward Network Teams
Most enterprise network teams are held to KPIs that focus primarily on uptime.[1,2,5,8] Thus, network leaders must change how they measure and reward teams by instituting KPIs that align with business outcomes (see “Change How You Fund, Build and Measure to Achieve Network Agility to Support Digital Business”). KPIs that we recommend to improve the measurement and rewarding of network teams include:

• Time to deliver services, such as ports and virtual local-area networks (VLANs), in response to a request

• The percentage of common network requests that can be delivered via on-demand self-service portals

• The percentage of network changes that are automated versus manual

These can augment existing availability-focused metrics, such as uptime. Furthermore, there are metrics that drive both agility and availability, such as:

• Mean time to detect an outage

• The percentage of network outages caused by manual error (see “Map Infrastructure and Operations Metrics to Business Value”)

Shift Spending to People
In addition, we recommend that enterprises shift investments away from premium networking products toward their existing network personnel. Although a limited amount of upfront training is required, purposefully shifting network spend from products to personnel can lead to yearly savings of more than 25% in five years, while improving network agility (see Note 4). To do this, we recommend reducing your spending on fully integrated (often proprietary) network vendor solutions in favor of multivendor, software-based solutions that are more open (see “How to Determine the Openness of Your Network Vendor”). Next, invest the resulting equipment savings back into strengthening your team’s skills in such areas as Linux networking, Python, APIs and vendor management. Gartner has published on this topic in depth (see “It’s Time to Shift Network Spend From Premium Products to Premium People”).

Ultimately, if you have the right people with the right skill sets, you can improve the agility and efficiency of your data center network, regardless of the physical switches or networking vendor.

Furthermore, networking must be included in broader efforts to establish centers of innovation in I&O to drive the adoption of new tools and processes that integrate beyond just networking (see “How to Kick-Start an I&O Innovation Center”).

Reprioritize Data Center Networking Investments
When selecting a data center network infrastructure, focus first on the operational aspects (e.g., switching extensibility and automation), rather than speeds/features. This is because nearly all vendors have the necessary form factors and interface speeds (see “Magic Quadrant for Data Center Networking”). Thus, the ability of the vendor to support automation frameworks and fit into the broader orchestration tools in your environment is more important. Essentially, we’re recommending that you flip the traditional buying model for data center network switches.

Pick how you want to manage your data center network first, then pick the vendor/products that can slot into that decision.

This is counter to traditional buying, in which enterprises purchase switches, then, secondarily, determine which tools should manage them. This is aligned with Gartner’s recommendation to decouple data center network configuration from data center network hardware (see “Building Data Center Networks in the Digital Business Era”). As a result, network teams will need to focus on APIs (see Note 5), plug-ins for broader orchestration tools and Linux-based automation tools.

Contrarian View
As described earlier, enterprises bear some responsibility for the lack of network operational innovation. However, a contrarian view is that the enterprise, not the network vendor, bears the primary responsibility for the lack of operational innovation. For example, most data center switches sold in the past three years include support for Ansible, Puppet and/or APIs that could improve operations. However, few enterprise organizations actually use them (we estimate fewer than 10%). Thus, network vendors would point out that they’ve delivered the requisite capabilities to the market, but the enterprise just isn’t using them. Thus, the contrarian viewpoint is that risk-averse network teams, misaligned KPIs around uptime, rigid change management practices and network personnel more loyal to their network vendors than to their employers combine to hinder operational innovation and change.

Evidence
[1] Gartner conducted more than 800 inquiries on the topic of data center networking in 2017.

[2] Audience polling at Gartner’s conferences, including:

• December 2017 Infrastructure and Data Center Conference
  • What is the primary method of making network changes in your environment? CLI on individual devices was the No. 1 answer (71% of respondents), n = 64.
  • What is the biggest challenge you face with your network? The No. 1 answer (43% of respondents) was “agility: we can’t make changes fast enough” (n = 130).
  • What best describes your approach to selecting network infrastructure? The No. 1 answer was, “I refresh with whatever my incumbent suggests,” with 34% of respondents (n = 106).
  • What best describes your approach to network infrastructure KPIs? The No. 1 answer was “availability is all we measure,” with 69% (n = 16).

• December 2017 Infrastructure and Data Center Conference
  • What percentage of your network changes are automated today? Fifty-four percent answered none, 37% answered 1% through 25%, 3% answered 26% through 75%, nobody answered more than 77% and 6% answered “don’t know” (n = 64).

[3] Public examples of hyperscalers describing their network infrastructure approaches include:

• Amazon: Datacenter Networks Are in My Way
• LinkedIn:
  • Project Altair: The Evolution of LinkedIn’s Data Center Network
  • LinkedIn OpenSwitch Project Altair Discussion
• Facebook:
  • Facebook’s Data Center Fabric
  • Introducing data center fabric, the next-generation Facebook data center network
  • Introducing “Wedge” and “FBOSS,” the next steps toward a disaggregated network
• Google:
  • How Google Invented An Amazing Datacenter Network Only They Could Create
  • A Look Inside Google’s Data Center Networks
  • 2017 Google Networking Research Summit Keynote Talks
  • Site Reliability Engineering: How Google Runs Production Systems (O’Reilly, ISBN 9781491929124)
• Microsoft Azure: Microsoft showcases the Azure Cloud Switch (ACS)
• Netflix Chaos Monkey: Netflix uncages Chaos Monkey disaster testing system
• Gartner published research: “Use Web-Scale IT to Make Enterprise IT Competitive With the Cloud”

[4] Based on briefings with technology vendors and inquiries with Gartner clients, we estimate at least 30,000 customers (and likely more) use Ansible and/or Puppet in their environments, compared with fewer than 1,000 combined customers using OpenContrail, StackStorm, OS10 and OpenDaylight.

[5] Outside of inquiry, Gartner networking analysts interact with network engineers, network technicians and other network practitioners via industry events, social media and other avenues. Although this is not officially tracked, we estimate this to be more than 100 interactions per year.

[6] Gartner networking analysts interact with networking personnel at companies that operate large data center networks (more than 1,000 switches) and/or that are implementing web-scale principles. This includes e-commerce, SaaS providers, large financials and hyperscaler providers. Although this is not fully tracked, we estimate it to be in the range of several dozen per year.

[7] Gartner analysts take inquiry and briefings with vendor clients that sell (and/or attempt to sell) into large networking operators, including hyperscalers. We estimate this to be in the range of several dozen per year.

[8] Gartner analysts conducted more than 2,500 inquiries on the topic of networking in 2017.

Note 1
Netflix Chaos Monkey
As an example, Netflix intentionally introduces failures, such as disabling virtual machines (VMs) in their infrastructure via an application called “Chaos Monkey” to proactively identify issues.

Note 2
NOS Software Bloat
Gartner estimates that fewer than 20% of the features in vendor network OSs are actually needed by any given enterprise.[1,8]

Note 3
Open Source
Commercial networking vendors have contributed to and fostered some open-source efforts, such as Juniper (OpenContrail), Brocade (StackStorm), Dell (OS10) and OpenDaylight (many vendors). The combined impact of these efforts on the enterprise has been dramatically less than either Ansible or Puppet alone.[4]

Note 4
Shifting Spend to People
The details of this are laid out in “It’s Time to Shift Network Spend From Premium Products to Premium People.” The initial training outlay would be a two- to three-day course, per engineer.

Note 5
Questions to Ask Vendors Regarding APIs
• What percentage of the product features are exposed via the API, versus the graphical user interface (GUI) or CLI?
• Is the API well-documented?
• Is the API open (published to the public)?
• Is the latest API reverse-compatible with prior versions, and is there a guarantee that future APIs will be reverse-compatible?
• Is there an active or growing community to assist with usage and/or to provide a self-sustaining model?
• Are there samples to help me start?
• Are there prebuilt examples of common industry integrations for platforms such as VMware vRealize, OpenStack or common IT ticketing systems?
• Is there an associated software development kit (SDK)?

Source: Gartner Research G00349636, Andrew Lerner, 23 January 2018

Ensure your Network Vendors Provide Agile Innovation is published by TCPWave. Editorial content supplied by TCPWave is independent of Gartner analysis. All Gartner research is used with Gartner’s permission, and was originally published as part of Gartner’s syndicated research service available to all entitled Gartner clients. © 2018 Gartner, Inc. and/or its affiliates. All rights reserved. The use of Gartner research in this publication does not indicate Gartner’s endorsement of TCPWave’s products and/or strategies. Reproduction or distribution of this publication in any form without Gartner’s prior written permission is forbidden. The information contained herein has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. The opinions expressed herein are subject to change without notice. Although Gartner research may include a discussion of related legal issues, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner is a public company, and its shareholders may include firms and funds that have financial interests in entities covered in Gartner research. Gartner’s Board of Directors may include senior managers of these firms or funds. Gartner research is produced independently by its research organization without input or influence from these firms, funds or their managers. For further information on the independence and integrity of Gartner research, see “Guiding Principles on Independence and Objectivity” on its website.