Understanding Log Analytics at Scale: O'Reilly Report | Pure Storage
SECOND EDITION

Understanding Log Analytics at Scale
Log Data, Analytics, and Management

Matt Gillespie and Charles Givre

Beijing Boston Farnham Sebastopol Tokyo

Understanding Log Analytics at Scale
by Matt Gillespie and Charles Givre

Copyright © 2021 O’Reilly Media, Inc. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected].

Acquisitions Editor: Jessica Haberman
Development Editor: Michele Cronin
Production Editor: Beth Kelly
Copyeditor: nSight, Inc.
Proofreader: Abby Wheeler
Interior Designer: David Futato
Cover Designer: Randy Comer
Illustrator: Kate Dullea

May 2021: Second Edition
February 2020: First Edition

Revision History for the Second Edition
2021-05-05: First Release

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Understanding Log Analytics at Scale, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the authors, and do not represent the publisher’s views. While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk.
If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

This work is part of a collaboration between O’Reilly and Pure Storage. See our statement of editorial independence.

978-1-098-10423-8
[LSI]

Table of Contents

1. Log Analytics
   Capturing the Potential of Log Data
2. Log Analytics Use Cases
   Cybersecurity
   IT Operations
   Industrial Automation
3. Tools for Log Analytics
   Splunk
   Elastic (Formerly ELK) Stack
   Sumo Logic
   Apache Kafka
   Apache Spark
   Combining Log Analytics with Modern Developer Tools
   Deploying Log Analytic Tools
4. Topologies for Enterprise Storage Architecture
   Direct-Attached Storage (DAS)
   Virtualized Storage
   Physically Disaggregated Storage and Compute
5. The Role of Object Stores for Log Data
   The Trade-Offs of Indexing Log Data
6. Performance Implications of Storage Architecture
   Additional Considerations for Security Analytics
7. Enabling Log Data’s Strategic Value with a Unified Fast File and Object Platform
8. Nine Guideposts for Log Analytics Planning
   Guidepost 1: What Are the Trends for Ingest Rates?
   Guidepost 2: How Long Does Log Data Need to Be Retained?
   Guidepost 3: How Will Regulatory Issues Affect Log Analytics?
   Guidepost 4: What Data Sources and Formats Are Involved?
   Guidepost 5: What Role Will Changing Business Realities Have?
   Guidepost 6: What Are the Ongoing Query Requirements?
   Guidepost 7: How Are Data-Management Challenges Addressed?
   Guidepost 8: How Are Data Transformations Handled?
   Guidepost 9: What About Data Protection and High Availability?
9. Conclusion

CHAPTER 1
Log Analytics

The humble machine log has been with us for many technology generations.
The data that makes up these logs is a collection of records generated by hardware and software (including mobile devices, laptop and desktop PCs, servers, operating systems, applications, and more) that document nearly everything that happens in a computing environment. With the constantly accelerating pace of business, these logs are gaining importance as contributors to practices that help keep applications running 24/7/365 and analyze issues faster to bring systems back online when outages do occur.

If logging is enabled on a piece of hardware or software, almost every system process, event, or message can be captured as a time-series element of log data. Log analytics is the process of gathering, correlating, and analyzing that information in a central location to develop a sophisticated understanding of what is occurring in a datacenter and, by extension, providing insights about the business as a whole.

The comprehensive view of operations provided by log analytics can help administrators investigate the root cause of problems and identify opportunities for improvement. With the greater volume of that data and novel technology to derive value from it, logs have taken on new value in the enterprise. Beyond long-standing uses for log data, such as troubleshooting systems functions, sophisticated log analytics have become an engine for business insight and compliance with regulatory requirements and internal policies, such as the following:

• A retail operations manager looks at customer interactions with the ecommerce platform to discover potential optimizations that can influence buying behavior. Complex relationships among visit duration, time of day, product recommendations, and promotions reveal insights that help reduce cart abandonment rates, improving revenue.
• A ride-sharing company collects position data on both drivers and riders, directing them together efficiently in real time, as well as performing long-term analysis to optimize where to position drivers at particular times. Analytics insights enable pricing changes and marketing promotions that increase ridership and market share.
• A smart factory monitors production lines with sensors and instrumentation that provide a wealth of information to help maximize the value generated by expensive capital equipment. Applying analytics to log data generated by the machinery increases production by tuning operations, identifying potential issues, and preventing outages.

Using log analytics to generate insight and value is challenging. The volume of log data generated all over an enterprise is staggeringly large, and the relationships among individual pieces of log data are complex. Organizations are challenged with managing log data at scale and making it available where and when it is needed for log analytics, which requires high compute and storage performance.

Log analytics is maturing in tandem with the global explosion of data more generally. The International Data Corporation (IDC) predicts that the global datasphere will grow more than fivefold in seven years, from 33 zettabytes in 2018 to 175 zettabytes in 2025.[1] (A zettabyte is 10^21 bytes, or a million petabytes.)

[1] David Reinsel, John Gantz, and John Rydning. IDC, November 2018. “The Digitization of the World from Edge to Core”.

What’s more, the overwhelming majority of log data offers little value and simply records mundane details of routine day-to-day operations such as machine processes, data movement, and user transactions. There is no simple way of determining what is important or unimportant when the logs are first collected, and conventional data analytics are ill-suited to handle the variety, velocity, and volume of log data.
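The IDC projection quoted above can be checked with a little arithmetic. The following is a minimal sketch in Python, not something taken from the report: only the 33 and 175 zettabyte figures and the 2018 and 2025 dates come from the text. The variable names are illustrative, and the annual rate rests on an assumption of steady compound growth that the report itself does not make.

```python
# Back-of-the-envelope check of the IDC figures quoted in the text.
# Only the 33 ZB (2018) and 175 ZB (2025) values come from the report;
# all names below are illustrative.

BYTES_PER_ZETTABYTE = 10**21
BYTES_PER_PETABYTE = 10**15

start_zb, end_zb = 33, 175
years = 2025 - 2018

# Overall growth of the global datasphere across the period
growth_factor = end_zb / start_zb

# Assumption: steady compound growth, which the report does not claim
annual_rate = growth_factor ** (1 / years) - 1

# Unit check for "a zettabyte is a million petabytes"
petabytes_per_zettabyte = BYTES_PER_ZETTABYTE // BYTES_PER_PETABYTE

print(f"Growth factor over {years} years: {growth_factor:.2f}x")
print(f"Implied compound annual growth: {annual_rate:.1%}")
print(f"Petabytes per zettabyte: {petabytes_per_zettabyte:,}")
```

The growth factor works out to about 5.3, consistent with "more than fivefold," and under the compound-growth assumption the implied rate is roughly 27 percent per year.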
This report examines emerging opportunities for deriving value from log data, as well as the associated challenges and some approaches for meeting those challenges. It investigates the mechanics of log analytics and places them in the context of specific use cases, before turning to the tools that enable organizations to fulfill those use cases. The report next outlines key architectural considerations for data storage to support the demands of log analytics. It concludes with guidance for architects to consider when planning and designing their own solutions to drive the full value out of log data, culminating in best practices associated with nine key questions:

• What are the trends for ingest rates?
• How long does log data need to be retained?
• How will regulatory issues affect log analytics?
• What data sources and formats are involved?
• What role will changing business realities have?
• What are the ongoing query requirements?
• How are data-management challenges addressed?
• How are data transformations handled?
• What about data protection and high availability?

Capturing the Potential of Log Data

At its core, log analytics is the process of taking the logs generated from all over the enterprise (servers, operating systems, applications, and many others) and deducing insights from them that power business decision making. That requires a broad and coherent system of telemetry, which is the process of PCs, servers, and other endpoints capturing relevant data points and transmitting them either to a central location or to a robust system of edge analytics.

Log analytics begins with collecting, unifying, and preparing log data from throughout the enterprise. Indexing, scrubbing, and normalizing datasets all play a role, and all of those tasks must be completed at high speed and with efficiency, often to support real-time analysis. This entire life cycle and the systems that perform it