How Applying an AI-Defined Infrastructure Can
Total Page:16
File Type:pdf, Size:1020Kb
Cognizant 20-20 Insights Digital Systems & Technology How Applying an AI-Defined Infrastructure Can Boost Data Center Operations An artificial-intelligence-based infrastructure that uses the data available within the data center to optimize and automate infrastructure operations can enhance operational efficiencies and improve the quality of service offered to the business. Executive Summary It is hardly news that artificial intelligence (AI) is entering Moreover, the use of AI within the data center is growing the mainstream. In fact, Gartner predicts that by 2022 steadily. In this context, AI is being used to predict and AI will contribute nearly $3.9 trillion in business value automate many of the tasks that humans currently globally,1 and that 40% of all application development perform. This concept is known as an AI-defined data projects will have AI developers on their teams. Not center.3 As automation and the use of code to run the surprisingly, Gartner also places AI-based technologies data center through software-defined technologies strongly in the top trends for 2019.2 mature, data center operations are becoming much more efficient, with fewer mistakes due to manual August 2019 Cognizant 20-20 Insights interventions. The logical next step in increasing elements of its IT infrastructure operations.6 Other performance is the use of an AI-defined vendors such as Extreme Networks are reportedly infrastructure.4 readying AI to enhance operations capabilities within networks. Recent studies by EMC and Intel for Forbes Insights5 reveal that 70% of organizations classified This white paper explores the evolution of the data as leaders in digital transformation believe that center and details how an AI-defined infrastructure data and analytics will become integral to running can help businesses operate more efficiently. IT infrastructures within the next two to five years. It also provides high-level guidance on how IT Organizations such as Hitachi Vantara have already organizations can implement their own AI-defined begun incorporating AI technologies into storage data centers. Benefits of an AI-defined infrastructure The advantages — primarily operational — of using an AI-defined infrastructure include the following: ❙ Categorizing incidents automatically highlights ❙ Enhanced security and multilingual services of the key areas of the data center that operational an AI-defined infrastructure allow IT operations teams must focus on. to better use global teams. ❙ Using automated IT systems reduces the ❙ Increased data-driven insights into data center amount of human error in operational activities. operations raise awareness among leadership ❙ AI-based interactions with end users enhances teams for making efficiency gains. the quality of services provided to the business. ❙ Anomaly detection and quick automated ❙ Automatic assignment of capacity within the responses to threats further enhance the data center enables optimum cost utilization for security of data center operations. infrastructure. ❙ Predictive and preemptive resolution of incidents reduce down time for business applications. 2 / How Applying an AI-Defined Infrastructure Can Boost Data Center Operations Cognizant 20-20 Insights An AI engine for the data center The heart of an AI-defined data center is its AI engine. Figure 1 illustrates the architecture of AI-defined data centers. First, data from various sources needs to feed into then processes the data and provides an output, a central data repository, or data lake. The data which can be used either for reporting or to alert from that data lake is then fed into an AI engine, operations staff to take further steps or perform which features a combination of machine learning some activity automatically through automation (ML) and other AI-based capabilities. The engine technologies. Anatomy of an AI-defined data center Data Reporting/ Sources Alerting Data Lake Output Automatic AI Engine Operations Figure 1 Machine-learning algorithms 3. Clustering: These algorithms group data sets automatically into clusters based on similarities The four types of ML algorithms typically used for between data values. There are several clustering data learning and prediction are: algorithms available, all of which group data 1. Regression: These algorithms are used to values slightly differently. An example: “How do I forecast a numerical value based on the pattern group these servers based on their similarities in observed for the data series. An example: “What performance?” will the CPU utilization of my server be at 10 pm 4. Anomaly detection: These algorithms are used tomorrow night?” to automatically detect outliers in a data series 2. Classification: These algorithms are used to when the observed values do not follow the classify a data set into one of several predefined expected pattern. An example: “How do I detect categories. An example: “What is the priority a difference in network behavior when a virus has level of the ticket received — p1, p2 or p3?” attacked the network?” 3 / How Applying an AI-Defined Infrastructure Can Boost Data Center Operations Cognizant 20-20 Insights Figure 2 shows the mapping of potential data inputs available in a data center to potential outputs of ML algorithms. Mapping data inputs to outputs through ML algorithms Anomaly Automatic Category Data Inputs Regression Classification Clustering Detection Operations End User • Patch levels • User base • Security risk • Automatically • Anomalies • Reporting Computing • User allocation predictions classifications grouping of in patching and planning users based statuses over for patch • App service • Predicting • User type on feature sets time management status patching classifications available requirements • App service • Anomalies in • Automatic types user access patch classification • Anomalies in management services running for failures Server • Patch levels • Patch level • Clustering of • Anomalies in • Automatic Operating • Server roles classifications servers based patch levels patch System on their usage management • App service • Classification • Unexpected characteristics for failures status of server types server roles to business • Unexpected • Services functions services running outage • Classification management of service types to business processes Server • CPU, memory, • Data center sizing • Utilization levels • Clustering of • Unexpected • Auto-scaling Infrastructure storage, predictions (high/medium/ infrastructure utilization • Automatic networks • Utilization low) based on usage behaviors rebooting utilization characteristics predictions • Alerting for • Cooling, power anomaly and data center analysis. consumption • Serial numbers and asset inventory Storage • IOPS • Utilization • Utilization levels • Grouping of • Unexpected • Automatic • Utilization, and capacity (high/medium/ LUNs based utilization provisioning capacity predictions low) on usage behaviors of additional • Classification parameters storage into storage types Networks • Bandwidth • Utilization • Utilization levels • Grouping of • Unexpected utilization and capacity (high/medium/ network zones utilization • Ports and traffic predictions low) based on usage behaviors types • Unexpected ports/traffic types Figure 2 (continues on next page) 4 / How Applying an AI-Defined Infrastructure Can Boost Data Center Operations Cognizant 20-20 Insights Anomaly Automatic Category Data Inputs Regression Classification Clustering Detection Operations Firewall • Type of firewall • Classification • Unexpected • Source, into business/ network destination, infrastructure/ transactions ports used others Security • Usage anomalies • User access • Classification • Grouping of • Unexpected • Blocking • Authentication predictions into user types usage patterns user access of insecure (location, transactions • User access IP address, • Automatically • User location authentication moving virus • User IP address attempts, etc.) workloads to DMZ Databases • Usage • Utilization • Utilization levels • Clustering of • Unexpected • Utilization and capacity (high/medium/ databases based utilization predictions low) on usage behaviors Monitoring • Outage metrics • Outage • Criticality • Clustering • Nonordinary • Automatic • Logs predictions classifications outage types monitoring alerts ticket (P1, P2, etc.) generation • Preemptive fixes Load • Uptime • Downtime • Classification • Unexpected • Auto-scaling Balancing • Certificate usage predictions into normal/HA/ behaviors • Invoking DR DR scenarios Service Desk • Ticket metrics • Ticket predictions • Classification • Clustering • Increase in • Auto- • Frequently into P1, P2, P3, tickets based tickets after assigning searched topics etc. on features for patching or tickets based • Classification further analysis upgrades on priority into responsible levels operational team Licensing • Usage metrics • Software usage • User workload • Grouping • Spike in usage – & Software • Licensing terms predictions segmentation of servers/ e.g., OSS/FOSS Metering • Licensing applications software calculations (for resulting in example, with Java consolidation for SE, processor- reduced TCO based metric for • Utilization trends servers and named of license types user plus XX- based metric for desktops) • 3. Licensing recommendations Figure 2 (continued) 5 / How Applying an AI-Defined Infrastructure Can Boost Data Center Operations Cognizant 20-20 Insights If the CPU of a virtual machine is expected to spike every weekend, then the AI engine can automatically