Self-Service Monitoring Through Versioned Infrastructure and Configuration As Code


NADOG
Self-Service Monitoring through versioned Infrastructure and Configuration as Code
February 24th, 2021

ABOUT ME
Carlos Munoz Robles
Global e2e Monitoring Lead
[email protected]
https://www.linkedin.com/in/carlosmur

What I love about my current position as Global e2e Monitoring Lead is the chance to foster the DevOps culture across Allianz and to make an impact on our developers' day-to-day work, offering them a state-of-the-art solution that makes the DevOps principles easier to apply.

© Copyright Allianz

MONITORING LEGACY

COMPLEX MONITORING ECOSYSTEM
[Diagram: monitoring tools in use across Allianz entities and hosting environments (on-prem, public cloud, mainframe), covering infrastructure, network, APM, workplace, SAP, Hadoop and security platforms. Tools named in the diagram include Nagios, Icinga, Zabbix, Zenoss, Tivoli, Netcool, Netview, vROps, SCOM, CloudWatch, Azure Automation, Prometheus/Grafana, ELK, Dynatrace, AppDynamics, New Relic, Netscout, Nexthink, AirWatch, SAP Solution Manager, Cloudera and custom scripts. Security tool landscape: ArcSight, Splunk, Qualys, …]

MOTIVATION
The limitations in monitoring…
• Too many outages with impact on customers and their business
• No correlation between technology areas and no visualization in a single pane of glass which covers the entire vertical stack
• Complex technology landscape with isolated monitoring focused on specific service areas (Accenture, IBM, …)
• Mostly reactive alerting only once the issue occurs
• Impact on quality, cost and customer satisfaction

… have been tackled with the implementation of an e2e monitoring
• Increase service quality by continuously improving reliability and stability
• Represent monitoring information in multidimensional, simplified and customized dashboards
• Apply predictive analytics to enable automated actions for incident handling and prevention
• Improve root cause analysis and incident resolution time by correlating all components
• Centrally consolidate all monitoring information and connect to the CMDB

COMMON DISADVANTAGES
• PANE OF GLASS? Complex and granular tool stack
• ROLLOUTS? Are done manually
• CONFIGURATION? Configuration is not traceable
• OVERLAP? Different tools to cover the same use cases
• AUTOMATION? Automation is missing

BREAKDOWN INTO SUBSTREAMS
Goal: establish e2e monitoring as a service.
Substreams of Application & Infrastructure Monitoring:
• Application Performance Monitoring (APM)
• Infrastructure Monitoring
• Network Monitoring
Full stack covered (on-prem and cloud):
• End-User Experience
• Business Transactions
• Application
• Middleware
• Database
• Operating System
• Server (BM / VM / CTR)
• Storage, Backup, etc.
• Network

GLOBAL E2E MONITORING – 1ST ITERATION
[Diagram: logical and cloud infrastructure monitoring. Multi-tenancy is managed by management zones (MZ): AZ Tech MZ, AZD MZ, OEx MZ, Shared services MZ. Events from NetCool, BAC and vROps are shipped to the Central Event Management. Critical events are forwarded to GCCC Event Management & Support (24/7), which creates a new incident. The CMDB (dependencies between components) and ITOM (Service Discovery & Mapping) provide the list of affected components and the enrichment of CIs. Dashboards sit on top.]

DESIRED STATE
• Maintainable
• Understandable
• Reproducible
• Traceable
• Versioned
• Failure recovery
  o Roll-back configuration
  o Understand what happened
• Avoid manual rollouts

SELF-SERVICE PORTAL
More than a simple ordering tool

SELF-SERVICE TOOLCHAIN
[Diagram: User, Self-Service, Inventory, Personal Configuration, Jenkins, Playbook / Module, Monitoring]

PERSONAL CONFIGURATION – DATA MODEL
Who is a user in the context of MonitoringAsAService?

Enterprise structure: map the enterprise structure to the customer inventory
• top level: Operational Entity (67)
  o name
• sub level: Team (Service)
  o team name
  o cost-center id
  o users/members (developers/owners/admins)

A user corresponds to an entry of the teams of an oe.

    stage: prod
    tenant_id: grh73865
    oes:
      az-it:
        teams:
          absi:
            cost-center: 1********8
            user-administration:
              admin-users:
                - [email protected]
            applications:
              - name: cisl
                domain:
                  - cisl.allianz.it
                id: '623262027030619284'
      az-de:
        teams:
          actuarialplatforms:
            cost-center: 1********3
            user-administration:
              admin-users:
                - [email protected]
            applications:
              - name: unify_lh
                domains:
                  - unifylife-prod-eu
                id: '4953220645034211986'
          pki:
            …

Each user's Agent must be tagged with:

    --set-host-group <oe>-<team>-<identifier>

AFTER 6 MONTHS…
PREPRODUCTION / PRODUCTION
• 316 applications
• 27199 services
• 66135 processes
• 7392 hosts
• 18 datacenters
• 2145 users onboarded

PERSONAL CONFIGURATION

WHY DO WE NEED TO TALK ABOUT THIS?
User has full control over Agent properties:
• logical grouping of hosts (by adding names/tags)
• stage selection (prod/dev)
• turn monitoring on/off
• switch between full-stack/infra-only monitoring
[Diagram labels: dev, backend prod, frontend prod, az-tech-globalmonitoring-eag]

What about monitored entities not configurable via the Agent? These are for example:
• applications
• synthetic monitors ("faked" network access to apps)
• Cloud APIs
• processes/services running on unmonitored hosts
Usually only configurable via the Web GUI.

What about data privacy?
• users want to keep alerts/problems/root causes private
• users want or even have to restrict their metrics' visibility (e.g. request logs) → GDPO
• users want to hide their infrastructure (number of hosts, computation resources, databases, …)
• sensitive credentials must be kept 100% secret!
• user's admin team may want to define maintenance windows where specific alerting is turned off

Permission issue
Dynatrace's architecture provides neither GUI access nor API access limited by visibility filters.

Solution
• Block settings GUI/API access.
• Make everything invisible by default for all users!
• Only explicitly set visibility filters can reveal information to explicitly onboarded user groups.

ConfigurationAsCode + GitOps + Automation = MonitoringAsAService
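To make the data model and the "invisible by default" rule above more concrete, here is a minimal Python sketch. It is not the implementation used at Allianz: the dictionary mirrors only part of the YAML shown on the slide, the e-mail addresses are placeholders, and the functions `host_group`, `onboarded_team` and `visible_host_groups` are hypothetical helpers introduced for illustration. It also assumes that only `admin-users` count as onboarded users, which is a simplification.

```python
# Hypothetical in-memory mirror of the YAML data model. The e-mail
# addresses are placeholders, not values from the real inventory.
CONFIG = {
    "stage": "prod",
    "oes": {
        "az-it": {
            "teams": {
                "absi": {
                    "user-administration": {"admin-users": ["alice@example.com"]},
                },
            },
        },
        "az-de": {
            "teams": {
                "actuarialplatforms": {
                    "user-administration": {"admin-users": ["bob@example.com"]},
                },
            },
        },
    },
}


def host_group(config, oe, team, identifier):
    """Build the <oe>-<team>-<identifier> host group for a known team."""
    if team not in config["oes"].get(oe, {}).get("teams", {}):
        raise KeyError(f"unknown team {oe}/{team}")
    return f"{oe}-{team}-{identifier}"


def onboarded_team(config, email):
    """Return (oe, team) if the user is listed in the config, else None."""
    for oe, oe_cfg in config["oes"].items():
        for team, team_cfg in oe_cfg["teams"].items():
            if email in team_cfg["user-administration"]["admin-users"]:
                return oe, team
    return None


def visible_host_groups(config, email, host_groups):
    """Default deny: reveal only host groups of the user's own team."""
    membership = onboarded_team(config, email)
    if membership is None:
        return []  # not onboarded, so nothing is visible
    prefix = "-".join(membership) + "-"
    return [hg for hg in host_groups if hg.startswith(prefix)]


if __name__ == "__main__":
    groups = [
        host_group(CONFIG, "az-it", "absi", "frontend"),
        host_group(CONFIG, "az-de", "actuarialplatforms", "web"),
    ]
    print(visible_host_groups(CONFIG, "alice@example.com", groups))
    print(visible_host_groups(CONFIG, "mallory@example.com", groups))
```

The sketch matches host groups by prefix rather than splitting on hyphens, because names such as `az-it` already contain hyphens and the tag cannot be parsed back unambiguously.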
CONFIGURATION AS CODE? GITOPS?
Treat the application's configuration as if it were code.

Advantages
• Human readable
• Comprehensible
• Transparent
• Versionable (Git)
• Automatable

Target groups
• Managers
• Service owners
• DevOps engineers
• Developers
• Operators
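One way to picture why versioned configuration is traceable and easy to roll back: two versions of a configuration can be compared mechanically, and the reverse comparison describes the rollback. The Python sketch below is illustrative only. The keys `stage`, `monitoring`, `enabled` and `mode` are assumptions loosely based on the Agent properties listed in the slides, and `flatten` and `diff` are hypothetical helpers, not part of any tool named in the talk.

```python
def flatten(node, prefix=""):
    """Flatten nested dicts into {"/path/to/key": value}; lists stay leaf values."""
    if isinstance(node, dict):
        flat = {}
        for key, value in node.items():
            flat.update(flatten(value, f"{prefix}/{key}"))
        return flat
    return {prefix: node}


def diff(old, new):
    """Return sorted (path, before, after) tuples; None marks an absent key."""
    old_flat, new_flat = flatten(old), flatten(new)
    changes = []
    for path in sorted(old_flat.keys() | new_flat.keys()):
        before, after = old_flat.get(path), new_flat.get(path)
        if before != after:
            changes.append((path, before, after))
    return changes


if __name__ == "__main__":
    # Two hypothetical versions of one team's configuration.
    v1 = {"stage": "dev", "monitoring": {"enabled": True, "mode": "full-stack"}}
    v2 = {"stage": "prod", "monitoring": {"enabled": True, "mode": "infra-only"}}

    for path, before, after in diff(v1, v2):
        print(f"{path}: {before!r} -> {after!r}")

    # Rolling back is the same comparison in the opposite direction.
    for path, before, after in diff(v2, v1):
        print(f"rollback {path}: {before!r} -> {after!r}")
```

In a GitOps setup the version history itself lives in Git; the sketch only shows that configuration kept as structured text can be compared and reverted without touching a GUI.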
