Disk IO Tuning in AIX 6.1

Technical Education, October 2011
IBM Systems Group
Author: Dan Braden, [email protected]
Presenter: Steve Nasypany, IBM Advanced Technical Skills
http://w3.ibm.com/support/americas/pseries
© 2010 IBM Corporation

Agenda
- The importance of IO tuning
- Disk basics and performance overview
- AIX IO stack
- Data layout
- Characterizing application IO
- Disk performance monitoring tools
- Testing disk subsystem performance
- Tuning

Why is disk IO tuning important?
- Moore's law: processors double in price/performance every 18 months
- Disk growth
  - Disk densities are doubling every 12 months
  - Customers are doubling storage capacities every 12-18 months
  - Actuator and rotational speeds are increasing relatively slowly
- Network bandwidth is doubling every 6 months
- Approximate CPU cycle time: 0.0000000005 seconds
- Approximate memory access time: 0.000000270 seconds
- Approximate disk access time: 0.010000000 seconds
  - Memory access takes 540 CPU cycles; disk access takes 20 million CPU cycles, or 37,037 memory accesses
- System bottlenecks are being pushed to the disk
- Disk subsystems are using cache to improve IO service times
- Customers now spend more on storage than on servers

Why is disk IO tuning important?
Seagate 15k RPM / 3.5" drive specifications, 2002 vs. 2010:
- Capacity (GB): 73 to 450 (+35%)
- Max sustained data rate (MB/s): 75 to 180 (+15%)
- Read seek (ms): 3.6 to 3.4 (-1%)
Disk IO service time is not improving compared to processors.

Performance metrics
- Disk metrics: MB/s and IOPS, with a reasonable service time
- Application metrics: response time, batch job run time
- System metrics: CPU, memory and IO
- Size for your peak workloads, based on maximum sustainable thruputs
- Bandwidth and thruput sometimes mean the same thing, sometimes not
- For tuning, it's good to have a short running job that's representative of your workload

Performance metrics
- Use a relevant metric for testing
  - It should be tied to business costs, benefits or requirements
  - Examples: batch job run time, maximum or sufficient application transactions/second, query run time
- Metrics that typically are not so relevant
  - Application transaction time if it is less than a few seconds
  - Metrics indicating bottlenecks (CPU, memory, network, disk); these matter if the application metric goal isn't met
- Be aware of IO from other systems affecting the performance of shared disk
- If benchmarking two systems, be sure the disk setups are apples to apples, so you aren't really comparing disk subsystem performance

Disk performance
- "ZBR" geometry
- Interface types: ATA, SATA, SCSI, FC, SAS
- IO service times are predominately seek + rotational latency + queueing time

Disk performance: when do you have a disk bottleneck?
- Random workloads
  - Reads average > 15 ms
  - With write cache, writes average > 2.5 ms
- Sequential workloads
  - Two sequential IO streams on one disk
  - You need more thruput
[Chart: IO service time (ms) versus IOPS for a 15,000 RPM disk]
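A quick sanity check on those rules of thumb (the arithmetic below is illustrative and not taken from the deck): on a 15,000 RPM drive, rotational latency averages half a revolution, 0.5 x (60 / 15,000) s = 2 ms, and with an average read seek of about 3.4 ms an unqueued random read costs roughly 5.5 ms, so sustained read averages above 15 ms mostly reflect queueing. On AIX 6.1 the usual way to see the per-hdisk averages is the extended drive report from iostat (a minimal sketch; exact flags and column names can vary by AIX level):

# iostat -DlT 5 3

Here -D selects the extended drive statistics, -l prints one line per disk, -T adds a timestamp, and the report repeats every 5 seconds for 3 intervals. Compare the avg serv values in the read and write sections against the 15 ms and 2.5 ms guidelines above, and check the queue section for time IOs spend waiting in the hdisk queue.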
How to improve disk performance
- Reduce the number of IOs
  - Bigger caches: application, file system, disk subsystem
  - Use caches more efficiently
  - No file system logging
  - No access time updates
- Improve average IO service times
  - Better data layout
  - Reduce locking for IOs
  - Buffer/queue tuning
  - Use SSDs or RAM disk
  - Faster disks/interfaces, more disks
  - Short stroke the disks and use the outer edge
- Smooth the IOs out over time
- Reduce the overhead to handle IOs

What is %iowait?
- A misleading indicator of disk performance
- A type of CPU idle: the percent of time the CPU is idle and waiting on an IO so it can do some more work
- High %iowait does not necessarily indicate a disk bottleneck
  - Your application could be IO intensive, e.g. a backup
  - You can make %iowait go to 0 by adding CPU intensive jobs
- Low %iowait does not necessarily mean you don't have a disk bottleneck
  - The CPUs can be busy while IOs are taking unreasonably long times
- If disk IO service times are good, you aren't getting the performance you need, and you have significant %iowait, consider using SSDs or RAM disk
  - Improve performance by potentially reducing %iowait to 0

Solid State Disk (SSD)
- High performance electronic disk
  - From 14,000 to 27,000 IOPS possible for a single SSD
  - SSD IO bandwidth varies across Power and disk subsystems
  - Typically small (69-177 GB) and expensive compared to HDDs
- Read or write IOs typically < 1 ms
  - About the same IO service time as compared to writes to disk subsystem cache
  - About 5-15X faster than reads from disk
- Positioned for high access density (IOPS/GB) random read data
  - Implementation involves finding the best data to place on the SSDs
- SSDs can save disk costs by reducing the number of spindles needed
  - When high access density data exists
  - A mix of SSDs and HDDs is often best

SSD vs. HDD performance
- SSD offers up to 33x to 125x more IOPS
- HDD IO service time is typically 5X to 40X slower (access times compared drive-to-drive, ignoring any caching by the SAS controller)

RAM disk
- Use system RAM to create a virtual disk
  - Data is lost in the event of a reboot or system crash
- IOs complete with RAM latencies
- For file systems, it takes away from file system cache: taking from one pocket and putting it into another
- A raw disk or file system only, no LVM support

# mkramdisk 16M
/dev/rramdisk0
# mkfs -V jfs2 /dev/ramdisk0
mkfs: destroy /dev/ramdisk0 (yes)? y
File system created successfully.
16176 kilobytes total disk space.
Device /dev/ramdisk0:
  Standard empty filesystem
  Size: 32352 512-byte (DEVBLKSIZE) blocks
# mkdir /ramdiskfs
# mount -V jfs2 -o log=NULL /dev/ramdisk0 /ramdiskfs
# df -m /ramdiskfs
Filesystem     MB blocks   Free %Used  Iused %Iused Mounted on
/dev/ramdisk0      16.00  15.67    3%      4     1% /ramdiskfs
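When the RAM disk is no longer needed, the file system can be unmounted and the memory returned to the system (a brief sketch, not shown in the deck; the device and mount point names follow the example above):

# umount /ramdiskfs
# rmramdisk ramdisk0

rmramdisk removes the ramdisk0 device and frees the memory backing it; until then that memory stays reserved and, as noted above, is unavailable for file system cache.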
The AIX IO stack
Layers from the application down to the disk, with the caching done at each layer:
- Application: the application memory area caches data to avoid IO
- Logical file system: raw disks, raw LVs, JFS, JFS2, NFS, other
  - NFS caches file attributes and has a cached filesystem for NFS clients
  - JFS and JFS2 cache uses extra system RAM
- VMM: JFS uses persistent pages for cache, JFS2 uses client pages for cache
- LVM (LVM device drivers)
- Multi-path IO driver (optional)
- Disk device drivers and adapter device drivers
  - Queues exist for both adapters and disks
  - Adapter device drivers use DMA for IO
- Disk subsystem (optional): disk subsystems have read and write cache
- Disk: disks have memory to store commands/data, used as write cache, read cache or a memory area for IO
- IOs can be coalesced (good) or split up (bad) as they go thru the IO stack
  - IOs adjacent in a file/LV/disk can be coalesced
  - IOs greater than the maximum IO size supported will be split up

Synchronous vs Asynchronous IOs
- The definition depends on the frame of reference
- Programmers/application
  - When an application issues a synchronous IO, it waits until the IO is complete
  - Asynchronous IOs are handed off to the kernel via the AIO facilities in AIX, and the application continues
  - When a group of asynchronous IOs complete, a signal is sent to the application
  - This allows IO and processing to run simultaneously
- Filesystem IO
  - Synchronous write IOs to a file system must get to disk
  - Asynchronous IOs only need to get to file system cache
- GLVM or disk subsystem mirroring
  - Synchronous mirroring requires that writes to both mirrors complete before returning an acknowledgement to the application
  - Asynchronous mirroring returns an acknowledgement when the write completes at the local storage; writes to remote storage are done in the same order as locally

Data layout
- Data layout affects IO performance more than any tunable IO parameter
- Good data layout avoids dealing with disk hot spots
  - An ongoing management issue and cost
- Data layout must be planned in advance; changes are often painful
- iostat and filemon can show unbalanced IO
- Best practice: evenly balance IOs across all physical disks
- Random IO best practice
  - Spread IOs evenly across all physical disks
  - For disk subsystems: create RAID arrays of equal size and RAID level, create VGs with one LUN from every array, and spread all LVs across all PVs in the VG
  - The SVC can do this automatically, and the XIV does

Random IO data layout
[Diagram: a disk subsystem with five RAID arrays, each presented as a LUN (logical disk) that becomes a PV in the datavg volume group]

# mklv lv1 -e x hdisk1 hdisk2 ... hdisk5
# mklv lv2 -e x hdisk3 hdisk1 ... hdisk4

- Use a random order for the hdisks for each LV
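The -e x flag sets the inter-physical-volume allocation policy to maximum, so each LV's partitions are spread across as many of the listed hdisks as possible. One way to verify the spread afterwards (an illustrative check, not shown in the deck; lv1 is the example LV name used above):

# lslv -m lv1

The map report lists each logical partition together with the physical partition and hdisk backing it, so an evenly spread LV shows consecutive logical partitions rotating across hdisk1 through hdisk5; lslv -l lv1 gives a per-PV summary of the same distribution.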
Data layout for sequential IO
- Many factors affect sequential thruput: RAID setup, number of threads, IO size, reads vs. writes
- Create RAID arrays with data stripes a power of 2
  - RAID 5 arrays of 5 or 9 disks
  - RAID 10 arrays of 2, 4, 8, or 16 disks
- Do application IOs equal to, or a multiple of, a full stripe on the RAID array
  - Or use multiple threads to submit many IOs
- N disk RAID 5 arrays can handle no more than N-1 sequential IO streams before the IO becomes randomized
- N disk RAID 10 arrays can do N sequential read IO streams and N/2 sequential write IO streams before the IO becomes randomized
- Sometimes smaller strip sizes (around 64 KB) perform better
- Test your setup if the bandwidth needed is high

Data layout
Best practice for VGs and LVs:
- Use Big or Scalable VGs
  - Both support no LVCB header on LVs (only important for raw LVs)
  - These can ...