TUT7317 A Practical Deep Dive for Running High-End, Enterprise Applications on SUSE

Holger Zecha, Senior Architect, REALTECH AG, [email protected]

Table of Contents

• About REALTECH

• About this session

• Design principles

• Different layers which need to be considered


About REALTECH 1/2

REALTECH Software:

• Business Service Management

• Service Operations Management

• Configuration Management and CMDB

• IT Infrastructure Management

• Change Management for SAP

REALTECH Consulting:

• SAP Mobile

• Cloud Computing

• SAP HANA

• SAP Solution Manager

• IT Technology

• IT Infrastructure

About REALTECH 2/2

Our Customers

Manufacturing, IT services, Healthcare, Media, Utilities, Consumer products, Automotive, Logistics, Finance, Retail


The Inspiration for this Session

• Several performance workshops at customers

• Performance escalations at customers who migrated from UNIX (AIX, Solaris, HP-UX) to Linux

• This session presents the experience gained at these customers

• Helping the audience avoid performance degradation caused by:

– Significant design mistakes

– Wrong architecture assumptions

– Having no architecture at all

Performance Optimization: The False Estimation

Upgrading a server with CPUs that are 12.5% faster does not improve application performance by 12.5%.

• Identify the layer where you lose your performance

– e.g. the server (CPU) share of the overall response time is 37% of 500 ms, which is 185 ms

– No additional parallelization is necessary, because transactions are not waiting for CPU cycles

– The rest is wait time in the SAN and network layers

– Exchanging CPUs with 3.2 GHz for CPUs with 3.6 GHz clock speed

– CPU performance improvement of 12.5%

– Transaction time improves by 23.13 ms, to 476.87 ms

• An overall improvement of 4.63%, not 12.5%
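The slide's arithmetic can be reproduced in a short sketch. It follows the slide's simplifying model, in which the CPU share of the response time shrinks by the full 12.5% (the exact values are 23.125 ms, 476.875 ms and 4.625%, which the slide rounds):

```python
# Reproduce the slide's estimation: a 12.5% faster CPU does not give a
# 12.5% faster transaction, because only the CPU share (37% of 500 ms)
# of the response time benefits from the upgrade.
total_ms = 500.0
cpu_share = 0.37
cpu_speedup = 0.125              # 3.2 GHz -> 3.6 GHz, per the slide's model

cpu_ms = total_ms * cpu_share            # 185 ms spent on CPU
saved_ms = cpu_ms * cpu_speedup          # ~23.13 ms saved
new_total_ms = total_ms - saved_ms       # ~476.87 ms
overall_gain = saved_ms / total_ms       # ~4.63%, not 12.5%

print(f"saved: {saved_ms:.2f} ms, new total: {new_total_ms:.2f} ms, "
      f"overall gain: {overall_gain:.2%}")
```

The rest of the response time (SAN and network wait) is untouched by the CPU upgrade, which is why the overall gain is so much smaller than the component gain.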

Clarification: What is a High-End Enterprise Application?

• The (high) end of an application is defined by the smallest entity which cannot be sliced and scaled out

• An enterprise application is defined by its importance for the company's ongoing operation and therefore the company’s revenue

– Web shop and its related backend systems of an online retailer

– Supply Chain Management system from an automotive supplier

What this Session Covers and What Not

• Frameworks for running Linux will not be covered

– e.g. SUSE Cloud, vSphere clusters

– Evaluating a framework is part of a Proof of Concept

• Technical components which are needed for the design principles will be covered

– KVM

– File system layouts and the file systems themselves

– Storage architectures

– Memory configuration

– Network throughput optimization


Design Principles in Theory

• Slice and dice your system into appropriate layers

• Create a proper architecture for every layer and make sure you can reuse it

• Bring the layers together

• The design principles that work well for a High End Enterprise Application will also work for a Web Server

• The design principles that work well for a Web Server will not necessarily work well for a High End Enterprise Application

Considerations in Core Design, Illustrated Using the IO Scheduler as an Example

• Peak IOPS needed, and how to ensure that we can get them without too much administrative overhead in the file system layout

• Avoid “hot spots”

• Illustration: IO schedulers and their impact on IO performance

– CFQ scheduler vs NOOP scheduler

• Architecture examples:

– Linux Server running on VMware

– Linux Server running in Amazon EC2

– Linux Server running on bare metal

Avoiding Hot Spots: The IO Scheduler Example

• NOOP scheduler: 1 IO thread per block device

• Block device: an lvol is not a block device

• Striping: best results with sdc1, sdd1, sdc2, sdd2, …

• Stripe size: physical extent size

• Use partitions to increase performance

• Appropriate disk size to reduce administrative effort

[Diagram: logical volume DATA in volume group VG3, striped across partitions sdc1–sdc4 and sdd1–sdd4 on disks sdc and sdd]

Linux Server Core Design: The Oracle Example

Storage Layer Core Design

[Diagram: volume groups VG1, VG2 and VG3 on LUN 1, LUN 2 and LUN 3 (partitions sda1, sdb1, sdc), accessed through FC HBA1 and FC HBA2; LUNs 1–4 are spread across three physical storage array groups. Concept and Visio by Manuel Padilla and Holger Zecha]

VMware Core Design: 1st Usage of the Storage Layer Core Design

Bringing the Layers Together: The VMware Example

VMware Example

• SAN layer is part of VMware infrastructure

• Mapping between SAN and Linux disks based on VMware storage infrastructure

• Use virtualization-solution-specific optimizations for throughput

– Paravirtualized SCSI controllers

– Paravirtualized NICs

– …

Bringing the Layers Together: The Amazon EC2 Example

• Why is there no architecture diagram here?

• Because the server layer is the only tier we have access to, and therefore the only target for performance optimization

• No SCSI controllers – disks get mapped directly into guest OS (hwinfo | grep xvdb)

– E: DEVPATH=/devices/xen/vbd-51792/block/xvdf

– E: DEVNAME=/dev/xvdf

• Spread needed IOPS across sufficient disks and avoid hot spots

– Be aware of IO scheduler behavior

– Use appropriate striping

Amazon Example 1/2

• Amazon guarantee: a dedicated number of IOPS per volume (e.g. 3,000 IOPS)

• Question: What does this mean for a 1 TB database which needs 20,000 IOPS in peak OLTP operations?

• Answer: We need at least 7 volumes

• Considerations:

– Do appropriate striping across all volumes to get access to all 21,000 IOPS

– The NOOP IO scheduler is no bottleneck (7 × 145 GB disks), because we have enough disks, each with its own IO thread; therefore no additional partitions are needed!
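The volume count above follows directly from the per-volume IOPS guarantee; a minimal sketch using the slide's figures:

```python
import math

# Size the number of volumes from the required peak IOPS.
# The 3,000 IOPS/volume guarantee is the slide's example figure.
peak_iops = 20_000
iops_per_volume = 3_000

volumes = math.ceil(peak_iops / iops_per_volume)   # round up: partial volumes don't exist
usable_iops = volumes * iops_per_volume            # total IOPS available after striping

print(volumes, usable_iops)  # 7 21000
```

Striping across all 7 volumes is what actually makes the combined 21,000 IOPS reachable; without it, a single hot volume caps out at 3,000 IOPS.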

Amazon Example 2/2

• Question: The database size is now 10 TB – what changes?

• Answer: Use 10 × 1 TB disks and create 4 partitions on each disk to eliminate scheduler hot spots

• Considerations:

– The NOOP IO scheduler can become a bottleneck

– Create 4 partitions of 250 GB each on every disk

– Do appropriate striping across all partitions on all volumes to access all IOPS equally: sdc1, sdd1, sde1, sdf1, sdg1, sdh1, sdi1, sdj1, sdk1, sdl1, sdc2, sdd2, …
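The stripe order on this slide can be generated mechanically — partition 1 of every disk first, then partition 2 of every disk, and so on — so that each disk's single NOOP IO thread is loaded evenly:

```python
# Build the round-robin stripe order from the slide: 10 disks
# (sdc .. sdl), 4 partitions of 250 GB each per 1 TB disk.
disks = [f"sd{c}" for c in "cdefghijkl"]           # sdc, sdd, ..., sdl
partitions_per_disk = 4

stripe_order = [f"{disk}{part}"
                for part in range(1, partitions_per_disk + 1)
                for disk in disks]

print(stripe_order[:11])
# ['sdc1', 'sdd1', 'sde1', 'sdf1', 'sdg1', 'sdh1', 'sdi1', 'sdj1', 'sdk1', 'sdl1', 'sdc2']
```

This ordering is what you would pass as the physical volume list when creating the striped logical volume, so that consecutive stripes land on different disks.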

Bringing the Layers Together: Bare Metal Example – 2nd Usage of the Storage Layer Core Design

[Diagram: volume groups VG1–VG3 on LUN 1 and LUN 2 (partitions sda1, sdb1) and on partitioned disks sdc/sdd (sdc1–sdc4, sdd1–sdd4), accessed through FC HBA1 and FC HBA2; LUNs 1–4 are spread across three physical storage array groups. Concept and Visio by Manuel Padilla and Holger Zecha]

• Also no hot spots, because of Fibre Channel multipathing

• The CFQ scheduler avoids the IO scheduler bottleneck

Bare Metal Example

• Question: Do we need 2 disks for the data file system?

• Answer: No

– The IOPS from the storage are completely usable in one LUN

– The IO scheduler is no longer a bottleneck, because the CFQ scheduler is used

– CFQ scheduler has one scalable IO queue per process

• Considerations:

– Take data integrity into account!


What Layers Do We Have?

• SAN/NAS

• Network

• Server hardware

• Virtualization solutions

• Server configuration

• Application

SAN/NAS: What is Important in this Layer?

• IOPS

– Tiered storage

– Traditional layout

• Data integrity

– Depends on the technology used by your storage vendor

• Redundant access paths

– Wherever possible use redundant access paths for load balancing and high availability

– Multipath on the server level

– Server virtualization layer

– Storage virtualization

Network: How to Optimize Network Throughput?

• Jumbo frames or standard 1518-byte frames?

– It depends on your application. Measurements for different applications sometimes show advantages from using jumbo frames.

• Distribute your network traffic onto dedicated NICs

– Separation of user, data and backup LAN

• If you want to use VM relocation, also use a dedicated LAN for it

• Use bonding if possible
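The raw protocol-overhead difference between standard and jumbo frames can be estimated with a back-of-the-envelope sketch. It assumes bulk TCP/IPv4 traffic with full-sized segments and no TCP options; whether this theoretical gain materializes depends on the application, as noted above:

```python
# Wire efficiency of bulk TCP traffic for standard vs jumbo frames.
# Per-frame Ethernet overhead on the wire: preamble + SFD (8) +
# header (14) + FCS (4) + interframe gap (12) = 38 bytes.
ETH_OVERHEAD = 8 + 14 + 4 + 12   # bytes per frame on the wire
TCPIP_HEADERS = 40               # IPv4 + TCP headers, no options

def efficiency(mtu: int) -> float:
    payload = mtu - TCPIP_HEADERS            # application bytes per frame
    return payload / (mtu + ETH_OVERHEAD)    # share of wire bytes that is payload

std = efficiency(1500)    # standard frames (1518 bytes incl. header/FCS)
jumbo = efficiency(9000)  # jumbo frames

print(f"standard: {std:.1%}, jumbo: {jumbo:.1%}")
```

The theoretical gain is only a few percent of throughput; the bigger practical effect of jumbo frames is usually the reduced per-packet CPU load, which is why measurements vary per application.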

Server Hardware

• The number of CPU cores and the clock speed do not necessarily guarantee a linear performance gain

• Pay attention to crucial features of individual components

– CPU – implemented virtualization features

– Memory – error correction code

• New: Isolation features

– e.g. physical hardware partitioning on Fujitsu PRIMEQUEST servers

– 2 sockets per board (30 cores)

– 3TB RAM per board

– 4 boards in total

– Bare metal and virtualization on one physical server

Virtualization Overview

Virtualization Solutions 1/5: Hardware Partitions, KVM, XEN, VMware, Hyper-V, Containers

• Number of supported CPUs for guest systems is no performance indicator

– Take NUMA into account

– Take maturity of virtualization solution into account

• Hardware Partitions

• Hypervisor, fully virtualized

– Every guest OS system call is captured and translated into a system call of the host

– System call translation from the guest OS to the host causes some overhead

– Every OS can be run on fully virtualized servers

Virtualization Solutions 2/5: Hardware Partitions, KVM, XEN, VMware, Hyper-V, Containers

• Hypervisor, paravirtualized

– Certain guest OS system calls bypass the virtualization layer and access the host hardware directly

– Paravirtualized guests need a dedicated paravirtualized kernel and/or drivers for the guest OS (e.g. VMware Tools)

– Paravirtualized guest OS system calls are usually faster, because they do not need to be translated by the virtualization layer

– Different virtualization solutions support different levels of paravirtualization

Virtualization Solutions 3/5: Hardware Partitions, KVM, XEN, VMware, Hyper-V, Containers

• XEN Hypervisor

– Hypervisor which runs on hardware

– Paravirtualized guests

– Hardware virtual machines

– Paravirtualized hardware for certain application-specific usage types, e.g. SCSI bus sensing for failover cluster solutions

• KVM Hypervisor

– The KVM hypervisor is implemented as a kernel module

– Allows shared usage of the hardware (server and hypervisor)

– Paravirtualized device drivers

Virtualization Solutions 4/5: Hardware Partitions, KVM, XEN, VMware, Hyper-V, Containers

• VMware Hypervisor

– Hypervisor which runs on hardware

– Paravirtualized device drivers

– RDM or VMDKs available for guests

• Hyper-V Hypervisor

– Paravirtualized

– Part of Windows Server and Windows 8

– Pass through disks and VHD disks

• Containers

– Special solution for virtualization on OS level

Virtualization Solutions 5/5: Gartner Magic Quadrant

• Breakdown of technologies:

1. VMware

2. Hyper-V

3. XEN

4. Linux Containers

5. KVM

• Reflects our experience regarding stability and performance

Performance Measurement: KVM Compared to Physical Servers

[Chart: runtime in seconds of different OLTP transactions, comparing bare metal + SSDs, bare metal, a KVM virtualized server with paravirtualized NICs, and a plain KVM virtualized server]

Linux Containers

• Currently not relevant in high-end environments because of the performance requirements of these applications

• Parallels is a proprietary solution

– Mature

– Stable

– Never seen running productive high-end enterprise applications until now

• LXC/Docker

– Not mature enough for running high-end enterprise applications

Server Configuration

• Adjust kernel parameters for memory management

– Use huge pages if possible

• Adjust network parameters for tuning network throughput

• Create a well-designed file system layout for optimized IO throughput, avoiding hot spots in every layer

• Reduce the number of page-out and page-in operations if the application does not depend on paging
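As an illustration of the huge pages recommendation, sizing the huge page pool for a database memory area comes down to one division; the 64 GiB shared-memory size below is a hypothetical figure, and 2 MiB is the common x86_64 huge page size:

```python
import math

# Number of 2 MiB huge pages needed to back a (hypothetical) 64 GiB
# database shared memory area, i.e. the value for the vm.nr_hugepages
# kernel parameter.
HUGE_PAGE_MIB = 2
shared_mem_gib = 64               # hypothetical SGA / shared memory size

nr_hugepages = math.ceil(shared_mem_gib * 1024 / HUGE_PAGE_MIB)
print(nr_hugepages)  # 32768
```

The resulting value would typically be set via the `vm.nr_hugepages` sysctl; whether and how the application actually allocates its memory from huge pages depends on the software vendor.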

File System Options 1/2

• BTRFS

– Still no option for high-end enterprise systems

– BTRFS advantages (e.g. rollback) are usually achieved in this system class using appropriate change management processes and acceptance procedures for implementing changes

• EXT3

– Stable and reliable file system

– Used for OS file systems (/, /usr, /opt, …)

– Not optimized for throughput in large file systems

– fsck requires much more time on large file systems

File System Options 2/2

• EXT4

– Recommended for huge file systems with high IO throughput

– Stable and reliable

• XFS

– Recommended for huge file systems with high IO throughput

– Stable and reliable

• Publicly available benchmarks do not show significant differences between ext4 and XFS

• Your choice depends on your preference and on what is supported by your software vendor (e.g. the Oracle support statement)

NUMA 1/2

[Diagram: two NUMA nodes (Node 0 and Node 1), each with CPUs 1–4 and local memory, linked by an intersocket connection; a CPU accesses its own node's memory locally and the other node's memory remotely]

NUMA 2/2: Consider It – Or Ignore It?

• 2 or 3 levels of NUMA configuration (depending on whether virtualization is used)

– 1st level is the BIOS of the physical server

– 2nd level is the configuration of the virtualization layer and the BIOS of the virtual machine

– 3rd level is the NUMA behavior of the application

• Options:

Physical server | Virtual server | Application
Enabled         | Enabled        | Enabled
Enabled         | Enabled        | Disabled
Enabled         | Disabled       | Disabled

• Test appropriate settings with your application

Do Not Forget Your Application

• Do not group applications with contrary resource consumption profiles on the same server

– 1st example: Applications that need huge, static main memory and react to intensive page-out and page-in operations with performance degradation

– 2nd example: Applications with a very dynamic memory usage profile, which are designed to use intensive page-out and page-in operations for handling a huge number of users and/or transactions

• Consolidate applications with the same resource usage profile on servers to reduce OS resource management overhead

Speed up your Linux Environment! www.realtech.com

Thank you.


Unpublished Work of SUSE. All Rights Reserved. This work is an unpublished work and contains confidential, proprietary, and trade secret information of SUSE. Access to this work is restricted to SUSE employees who have a need to know to perform tasks within the scope of their assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated, abridged, condensed, expanded, collected, or adapted without the prior written consent of SUSE. Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability.

General Disclaimer This document is not to be construed as a promise by any participating company to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. SUSE makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for SUSE products remains at the sole discretion of SUSE. Further, SUSE reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All SUSE marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.