Optimizing AIX 7 performance: Part 1, Disk I/O overview and long-term monitoring tools (sar, nmon, and topas)
Martin Brown and Ken Milberg
October 12, 2010
Learn more about configuring and monitoring AIX 7, based on investigations of the AIX 7 beta compared with the original articles written for AIX 5L. The article covers support for direct I/O, concurrent I/O, and asynchronous I/O, along with best practices for each method of I/O implementation.

This three-part series on the AIX® disk and I/O subsystem focuses on the challenges of optimizing disk I/O performance. While disk tuning is arguably less exciting than CPU or memory tuning, it is a crucial component of optimizing server performance. In fact, precisely because disk I/O is often your weakest subsystem link, you can do more to improve disk I/O performance than that of any other subsystem.

Introduction

A critical component of disk I/O tuning involves implementing best practices prior to building your system. Because it is much more difficult to move things around when you are already up and running, it is extremely important that you do things right the first time when planning your disk and I/O subsystem environment. This includes the physical architecture, the logical disk geometry, and the logical volume and file system configuration.

When a system administrator hears that there might be a disk contention issue, the first thing he or she turns to is iostat. iostat, the equivalent of vmstat for memory reporting, is a quick-and-dirty way of getting an overview of what is currently happening on your I/O subsystem. While running iostat is not an inappropriate reaction at all, the time to start thinking about disk I/O is long before tuning becomes necessary. All the tuning in the world will not help if your disks are not configured appropriately for your environment from the beginning. Furthermore, it is extremely important to understand the specifics of disk I/O and how it relates to AIX® and your System p™ hardware. When it comes to disk I/O tuning, generic UNIX® commands and tools help you much less than the AIX-specific tools and utilities that have been developed to help you optimize your native AIX disk I/O subsystem.

In this article, we define and discuss the AIX I/O stack and correlate it to both the physical and logical aspects of disk performance. We discuss direct, concurrent, and asynchronous I/O: what they are, how to turn them on, and how to monitor and tune them. We also introduce some of the long-term monitoring tools that you should use to help tune your system. You might be surprised to hear that iostat is not one of the tools recommended for long-term gathering of statistical data.

This article looks at the support and changes present in a beta release of AIX 7, including the ways in which the configuration of the different subsystems has changed. The main changes in AIX 7 further simplify the operation and configuration of many of the I/O subsystems, work that was originally started in AIX 6. The result is that many of the I/O subsystems no longer need to be explicitly enabled and configured. Instead, they are supplied in a pre-configured state and are automatically enabled and started when an application requests that functionality.
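To make the contrast between a quick spot check and long-term data gathering concrete, the commands below are a minimal sketch of both approaches. The intervals, counts, and output choices are illustrative, not values prescribed by this article.

    # Quick, interactive view of current disk activity
    # (fine for spot checks, not for long-term trending)
    iostat 5 3

    # Long-term capture with nmon: one snapshot every 300 seconds,
    # 288 snapshots (roughly 24 hours), written to a file for later analysis
    nmon -f -s 300 -c 288

    # sar can also sample disk activity at intervals;
    # here, 10 samples taken 60 seconds apart
    sar -d 60 10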
The article also concentrates on changes that will help you identify and improve the subsystem you are looking to tune.

The best time to start monitoring your systems is when you first put a system into production and it is running well, rather than waiting until your users are screaming about slow performance. You need a baseline of what the system looked like when it was behaving normally in order to analyze the data when it is presumably not performing adequately. When making changes to your I/O subsystem, make them one at a time so that you can fully assess the impact of each change. To assess that impact, you will be capturing data with one of the long-term monitoring tools recommended in this article.

Disk I/O overview

It shouldn't surprise you that the slowest operation for any running program is the time actually spent retrieving data from disk. This all comes back to the physical component of I/O: the disk arms must seek to the correct cylinder, the controller needs to access the correct blocks, and the disk heads have to wait while the blocks rotate under them. The physical architecture of your I/O system should be understood before any tuning work begins, since all the tuning in the world won't help a poorly architected I/O subsystem that consists of slow disks or inefficient use of adapters.

Figure 1 illustrates how tightly the physical I/O components relate to the logical disk and its application I/O. This is what is commonly referred to as the AIX I/O stack.

Figure 1. The AIX I/O stack

You need to be cognizant of all the layers when tuning, as each impacts performance in a different way. When first setting up your systems, start from the bottom (the physical layer) as you configure your disks, the device layer, the logical volumes, the file systems, and finally the files and applications. We can't emphasize enough the importance of planning your physical storage environment. This involves determining the amount of disk, its type (speed), size, and throughput. One important challenge with storage technology to note is that while the storage capacity of disks is increasing dramatically, rotational speed is increasing at a much slower pace. You must never lose sight of the fact that while RAM access takes about 540 CPU cycles, disk access can take around 20 million CPU cycles. Clearly, the weakest link on a system is the disk I/O storage system, and it's your job as the system administrator to make sure it doesn't become even more of a bottleneck.

As alluded to earlier, poor data layout affects I/O performance much more than any tunable I/O parameter. Looking at the I/O stack helps you understand this, as the Logical Volume Manager (LVM) and disk placement sit closer to the bottom of the stack than the tuning parameters (ioo and vmo). Now let's discuss some best practices for data layout. One important concept is making sure that your data is evenly spread across all of your physical disks. If your data resides on only a few spindles, what is the purpose of having multiple logical unit numbers (LUNs) or physical disks? If you have a SAN or another type of storage array, you should try to create your arrays of equal size and type. You should also create them with one LUN per array and then spread all your logical volumes across all the physical volumes in your volume group.
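One way to spread logical volumes across every physical volume in a volume group is the LVM inter-disk allocation policy. The sketch below assumes a volume group and logical volume named datavg and datalv01 and a partition count of 128; these names and sizes are hypothetical, so adjust them for your environment.

    # List the physical volumes (LUNs) that make up the volume group
    lsvg -p datavg

    # Create a logical volume with the inter-disk allocation policy set to
    # maximum (-e x), so its partitions are spread across all PVs in datavg
    mklv -y datalv01 -t jfs2 -e x datavg 128

    # Verify how the logical volume's partitions are distributed across disks
    lslv -l datalv01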
As stated previously, the time to do this is when you first configure your system, as it is much more cumbersome to fix I/O problems than memory or CPU problems, particularly if it involves moving data around in a production environment. You also want to make certain that your mirrors are on separate disks and adapters. Databases pose their own unique challenges; so, if possible, your indexes and redo logs should also reside on separate physical disks. The same is true for temporary tablespaces, which are often used for sort operations.

Using high-speed adapters to connect the disk drives is extremely important, but you must make certain that the bus itself does not become a bottleneck. To prevent this from happening, spread the adapters across multiple buses. At the same time, do not attach too many physical disks or LUNs to any one adapter, as this also significantly impacts performance. The more adapters you configure, the better, particularly if there are large amounts of heavily utilized disk. You should also make sure that your device drivers support multi-path I/O (MPIO), which provides load balancing and availability for your I/O subsystem.

Direct I/O

Let's return to some of the concepts mentioned earlier, such as direct I/O. What is direct I/O? First introduced in AIX Version 4.3, this method of I/O bypasses the Virtual Memory Manager (VMM) and transfers data directly to disk from the user's buffer. Depending on the type of application, it is possible to improve performance by implementing this technique. For example, files with poor cache utilization are great candidates for direct I/O. Direct I/O also benefits applications that use synchronous writes, because these writes have to go to disk anyway. CPU usage is reduced because the double copy of data is eliminated; normally, data is copied from disk into the file buffer cache and then copied again from the cache to the application's buffer. One of the major performance costs of direct I/O is that, while it can reduce CPU usage, it can also result in processes taking longer to complete for smaller requests.
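As a rough sketch of how direct I/O is typically enabled on a JFS2 file system, the commands below use the dio mount option. The mount point /u01 is only an example, and applications can also request direct I/O per file programmatically rather than at the file system level.

    # Mount an existing JFS2 file system with direct I/O enabled
    mount -o dio /u01

    # To make the setting persistent, the stanza for the file system in
    # /etc/filesystems can carry the same flag, for example:
    #   options = dio,rw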