IBM Data Server Manager

SQL Statement Monitoring and Historical Analysis

Version 2.0 – August 2018

Peter Kohlmann

kohlmann@ca..com

© Copyright IBM Corporation 2018

Route 100 Somers, NY 10589

Produced in the United States of America March 2017

All Rights Reserved DB2, IBM, the IBM logo, InfoSphere, and Rational are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries or both. Other company, product or service names may be trademarks or service marks of others. References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates. All statements regarding IBM’s future direction and intent are subject to change or withdrawal without notice. The information contained in this document is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this document, it is provided “as is” without warranty of any kind, express or implied. In addition, this information is based on current plans and strategy, which are subject to change by IBM without notice. IBM shall not be responsible for any damages arising out of the use of, or otherwise relating to, this document or any other documents. Nothing in this document is intended to, nor shall have the effect of, creating any warranties or representations from IBM Software.

SQL Statement Monitoring and Historical Analysis

SQL statements are at the core of a relational database. A history of when and how those statements were executed can be the key to solving problems and improving performance. Understanding which statements were running at any given time can give a broad group of user roles insight into what was driving the behavior of the database. For more technically focused DBAs or developers, if your users complain that the database was running slowly, you will eventually need to know what work was going on at that time. Even if you determine that the problem was due to locking contention, a sudden spike in CPU usage, or the database running out of free log space, you need to know which statements were running when the problem occurred. For business-focused data administrators, a complete statement history also lets you measure data access and resource use by business or application area, and even charge for that use.

Having insight into the statements the database is executing right now can help you identify runaway queries that are locking up database resources. Understanding which statements use the most system resources, or run most often, lets you see where to focus for the biggest impact. Having a detailed history of all the statements that have run on your system lets you:

• Identify applications and users based on their SQL

• See exactly which statements need your attention for tuning or optimization

• Audit actions against the database

• Charge back individual users, departments, or application owners for what they use

Data Server Manager supports different kinds of statement monitoring. Each has its own purpose and strengths. Understanding how each works, and how to use Data Server Manager to get the most out of each, will help you determine the root cause of problems, see opportunities for improvement, and even build custom systems for audit or charge-back.

Db2 Warehouse Private and the IBM Integrated Analytics System (IIAS) both use all the capabilities found in Data Server Manager to monitor SQL statement history. The main difference is that these physical and virtual appliance offerings come preconfigured.

In-flight Statements

Data Server Manager can show you which statements are running right now on your database.

Once a statement is finished it disappears from this list, so the vast majority of statements will complete before you can even ask to see their details. This view is best suited to managing statements that take minutes or hours to complete, and it is especially useful for dealing with runaway statements that never end or that tie up the whole database.

Select Monitor->Database and then select Realtime and In-Flight Executions.

From the list of currently running statements you can:

• See details for the running statement

• Create a visual explanation of the access plan the statement will use

• Select the query for tuning

• Cancel the statement, or force the application running it, if the statement or the application is running out of control

The columns that appear above are only a small subset of the data collected and available. Select the show or hide columns icon beside the search magnifying glass to add or remove columns in the table. You can even export the list of statements to a CSV file.

Statements that are in the Package Cache now

If you need to analyze statements that were running to help with problem determination, or to better understand what is generating the work and using resources, a good place to start is the monitoring information in the package cache. The package cache is an area in memory where DB2 stores static and dynamic queries, access plans, and information about the execution of those queries. It is that information that Data Server Manager helps you access and mine for added insight.

Select Monitor->Database and then select Realtime and Package Cache.

There are over thirty columns of information available for each statement found in the package cache. However, the easiest way to find the statements you want to see is to select Show by Highest. This allows you to zero in on the most expensive statements for the problem you are investigating. If you suspect a locking problem, select Show by highest lock wait time. If you are running out of resources, select Show by highest CPU time.
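Under the covers, this information comes from Db2's package cache monitoring table function. If you prefer SQL, a sketch of pulling the top CPU consumers directly with MON_GET_PKG_CACHE_STMT (the NULL arguments mean all section types, all statements, and no search filter; -2 means all database members):

```sql
-- Top 10 statements in the package cache by total CPU time.
SELECT NUM_EXECUTIONS,
       TOTAL_CPU_TIME,
       STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS T
ORDER BY TOTAL_CPU_TIME DESC
FETCH FIRST 10 ROWS ONLY;
```

Ordering by LOCK_WAIT_TIME instead gives the SQL equivalent of Show by highest lock wait time.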

If you identify a statement that needs attention, often the next step is to show the details for that statement. Select the statement and then View Details. Often the most useful part of the details screen is Time Spent and I/O. You will quickly see whether the time spent on the statement went to sorting, locking, or retrieving data from disk.

If you need to dig deeper, you can explain the statement to get a better understanding of how the data server will execute it and construct a result set. Select the statement and then select Explain.

Package Cache History

One of the most powerful features of Data Server Manager is its ability to collect historical information in its own history repository database. It acts like a flight recorder for your database and lets you look back in time and pinpoint what was happening when a problem occurred. It not only lets you determine the root cause of problems more effectively, but also helps you eliminate them in the future.

You need Data Server Manager Enterprise Edition, and you must set up DSM to use a DB2 database as the repository for the historical information. For more information on setting up the repository, refer to the Data Server Manager Knowledge Center.

Every few minutes Data Server Manager will connect to the database you are monitoring and take a copy of all the information for the most active SQL statements in the cache. By default DSM collects the top 20 statements for each metric.

To see the collected history information, select Monitor->History and Package Cache.

The first difference you will notice in the History view is the time slider across the top of the page. You can use the slider, as well as the drop-down beside the History button, to select the timeframe you want to focus on. The values in the list are the total or average values for each statement that appeared in the package cache during the period you have defined with the time slider.

You can narrow your search to a very small window of time for specific problem determination, or you can look at the history over the last month to understand which statements are using the most resources overall.

Individual Statement History using the Event Monitor

If you need to see a complete history of all the statements that ran on your database and exactly when they ran, you need more information than what you can find in the Package Cache. Less frequently used statements are automatically removed from the package cache to keep the most frequently executed statements in memory. So you cannot rely on the package cache to have a complete picture of everything going on in your database over time.

For a complete picture you need to monitor each event as it happens. Event monitoring can create an individual historical record in your database each time a statement runs.
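Data Server Manager creates and manages these event monitors for you (as described in the setup section), but the underlying Db2 mechanism is an activities event monitor that writes to tables. A minimal sketch of the DDL involved, with an illustrative monitor name:

```sql
-- Write one record per monitored activity (statement) to tables.
CREATE EVENT MONITOR DSMACTIVITIES
  FOR ACTIVITIES
  WRITE TO TABLE;

-- Start collecting.
SET EVENT MONITOR DSMACTIVITIES STATE 1;
```

Only statements running in workloads with activity collection enabled produce records, which is how collection stays focused.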

Select Monitor->Database and then select History and Individual Executions.

The first thing you may notice is that unlike Package Cache monitoring there is a record of each time the statement started and when it completed. There are also about 30 pieces of information recorded for each statement. Unlike the Package Cache, you can see things like SQL Error and Warnings for statements that did not complete successfully, the workload the statement was executed in, the estimated query cost for each run and many more.

To help you navigate this wealth of data, you can also group statements in a number of different ways and see totals or averages for each group. The most common groupings are by SQL statement, application name, client IP address, or WLM workload. You will be able to see a complete picture of which statements and which users are driving the work on your database.

If you choose to search for a specific statement, you only need to provide a fragment of the SQL. Data Server Manager will search the full history of collected statements for matches to your keywords. Enter the search text beside the magnifying glass icon at the top right side of the list.

All this is very powerful, but it comes at a cost and is not turned on by default. For example, if you had a production database that ran a thousand statements a minute, you would add almost 1.5 million records to your history each day. That could quickly overwhelm the storage of your database if it was not controlled. There is also a performance cost. For a complete record of statement history, DSM must write a record to a table each time a statement executes. If you are running lots of short statements, there could be an impact on the performance of the whole system.
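The arithmetic above is easy to sketch. The per-record size below is an assumed illustrative figure, not a DSM specification; the actual row size depends on statement text length and the metrics collected:

```python
# Back-of-envelope growth for individual statement history.
STATEMENTS_PER_MINUTE = 1000            # example rate from the text
records_per_day = STATEMENTS_PER_MINUTE * 60 * 24
print(records_per_day)                  # 1,440,000 records per day

# Assume ~2 KB per history record (illustrative assumption only).
BYTES_PER_RECORD = 2 * 1024
gb_per_day = records_per_day * BYTES_PER_RECORD / 1024**3
print(round(gb_per_day, 2))             # roughly 2.75 GB per day
```

At that rate, an uncontrolled history would add on the order of a terabyte a year, which is why retention limits matter.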

Fortunately you can use individual statement history collection as a focused tool and configure exactly what, and where, to collect history and how long to keep it.

Setting Up SQL Statement History

How to turn on Package Cache History Collection

Before Data Server Manager will start collecting the history of the package cache you need to enable the repository database and turn on historical statement collection. For more information on setting up the repository refer to the Data Server Manager Knowledge Center.

To turn on Package Cache Statement history collection you need to update the monitoring profile of your database. Select Settings->Monitoring Profiles. Then select the profile associated with your database and select Edit. Select Repository Persistence and then make sure that SQL statement execution data and Package cache data (top N) are both selected.

You can also select how often Data Server Manager records the information in the package cache. By default it collects information every five minutes and keeps the top 20 statements for each category of information collected. The defaults are a good place to start. Also remember that by changing these values you are changing how Data Server Manager collects data for all the databases included in this monitoring profile.

What you should know before using Individual Statements History

Before you start to collect the history for individual statements you should understand how Data Server Manager collects the data. Statement history is very powerful, but understanding how it works will ensure that it doesn't use more space or resources than necessary or impact the database you are monitoring.

DB2 Workload Management works together with DB2 Event Monitoring to collect the details behind each statement you want to monitor. When you create a DB2 workload definition, you have the option to also collect detailed information for each statement. These records are written in the monitored database to Activity Event Monitor tables. At regular intervals, Data Server Manager collects the data stored in the Activity Event Monitor tables and writes the records to the Data Server Manager historical repository database. The Data Server Manager interface lets you navigate the data in the historical repository without impacting the monitored database. However, don't be surprised if you don't see the record for a recent statement in Data Server Manager right away. Statements are only collected every few minutes, so it might take that long for a recent statement to appear.

More importantly, after Data Server Manager collects the data from the monitored database, it cleans out the old records by truncating the Activity Event Monitor. This ensures that a large number of records never accumulate in the monitored database using up valuable storage. Logging is turned off for the event monitor tables to reduce the impact to the monitored database.

Enabling the Monitoring Profile

The first step is to turn on individual statement data collection in the monitoring profile. Select Settings->Monitoring Profiles in the main menu. Then select the profile associated with your database and select Edit. Select Repository Persistence and then make sure that Individual statement data is selected.

You can also select how many statements to collect each time they are removed from the Event Monitor tables in the monitored database. The default is 1000. You can also set the maximum number of statements to keep in the Data Server Manager historical repository. The default is 1000, but you should set this much higher for most systems; 100,000 or even a million rows is not an unreasonable value.

If you select Capture in-progress query with event monitor you will also collect information for statements that were in-progress during a DB2 failure. This could be very useful for problem determination.

If you select Use Administrative Task Scheduler to disable event monitors in failure cases, Data Server Manager will set up the monitored database to automatically disable event monitoring if DSM has not recently collected and cleaned out data in the Event Monitor tables on the monitored database. This ensures that if Data Server Manager is not up and running, data will not accumulate in the Event Monitor tables.

Setting up Historical Monitoring for Individual Statements

If you try to start using Statement History for Individual Executions after turning it on in the monitoring profile, you will probably see a set of warning messages.

Because you are affecting what might be a production database, Data Server Manager is completely transparent about what needs to be created and gives you complete control. You can follow the actions for each message, or you can select Monitor->Statement History Prerequisites from the main menu. Both provide you with a sample script that helps you set up space to store the Event Monitor tables on the monitored database and turn on monitoring for your default workloads.

Setup Storage in the Monitored Database

Create a place to store the Event Monitor Tables

The following script creates a 32KB bufferpool, a storage group and a table space, as well as a required temporary table space. You can customize the initial and maximum size of the table space in the script before you run it. However, the default sizes should be sufficient for most monitoring requirements.
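The script the console generates for your system will differ in detail; a hedged sketch of the kind of DDL involved, with illustrative object names, paths, and sizes:

```sql
-- 32 KB page size objects to hold the activity event monitor tables.
CREATE BUFFERPOOL BP32K PAGESIZE 32K;

-- Storage path is illustrative; point it at your own storage.
CREATE STOGROUP SG_EVMON ON '/db2/evmon';

CREATE LARGE TABLESPACE TS_EVMON
  PAGESIZE 32K
  MANAGED BY AUTOMATIC STORAGE
  USING STOGROUP SG_EVMON
  INITIALSIZE 100 M
  MAXSIZE 10 G
  BUFFERPOOL BP32K;

-- A system temporary table space with a matching page size is
-- needed for queries against the wide event monitor tables.
CREATE SYSTEM TEMPORARY TABLESPACE TS_EVMON_TEMP
  PAGESIZE 32K
  BUFFERPOOL BP32K;
```

Adjusting INITIALSIZE and MAXSIZE here is the customization the paragraph above refers to.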

Tell Data Server Manager to use the storage you created

You still have to explicitly tell Data Server Manager where to find the Event Monitor tables in the monitored database. Select Monitor->Statement History Prerequisites from the main menu.

Select the table space you created earlier from the list of available table spaces in the monitored database. Data Server Manager will create the required Event Monitor Tables in your new table space.

Using Workloads to Control which Statements are Collected

The next step is to define and turn on event monitoring for the workloads in your database. Workloads are simply the way DB2 recognizes different kinds of statements and groups them together to simplify monitoring. By default, DB2 includes two workload definitions: one for default user work and one for administrative work. The recommended script turns on data collection for each of these by including the COLLECT ACTIVITY DATA ON ALL WITH DETAILS clause in the workload definition. For use with Data Server Manager, one other workload definition is suggested. DSM_WORKLOAD identifies applications whose names identify Data Server Manager monitoring queries. For this workload we do not include the COLLECT ACTIVITY clause. This ensures that we only collect a history of statements that are doing useful work for your application or administrative tasks. We don't recommend collecting a history of the SQL statements that Data Server Manager uses to collect its own data, to reduce the overhead on your database.
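A sketch of what such a script does. The exact generated script may differ, and the application name used to identify Data Server Manager's own connections is illustrative:

```sql
-- Collect an activity record for every statement that runs in
-- the default user workload.
ALTER WORKLOAD SYSDEFAULTUSERWORKLOAD
  COLLECT ACTIVITY DATA ON ALL WITH DETAILS;

-- Route DSM's own monitoring queries to a workload with no
-- COLLECT ACTIVITY clause so they are never recorded.
-- The application name below is an illustrative placeholder.
CREATE WORKLOAD DSM_WORKLOAD
  APPLNAME('DSServerMgr')
  ENABLE;

GRANT USAGE ON WORKLOAD DSM_WORKLOAD TO PUBLIC;
```

Because DSM_WORKLOAD omits the COLLECT clause, statements mapped to it generate no event monitor records.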

You can also define additional workloads and choose which should include data collection. This allows you to be very precise and specific about where you want to collect detailed historical data for all your statements. If you want to define new workloads for your database, Data Server Manager makes it easy.

Select Administration->Workloads from the main menu. Then select Add and select a method to divide up the work in your database.

For example, if you select Add workload by application name, Data Server Manager will look at the history of the applications that ran statements in the database and generate a sample script containing a workload for each application. You can customize the script to combine applications into a single workload or add other definitions. Remember, if you want to collect statement history you need to add the COLLECT ACTIVITY DATA ON ALL WITH DETAILS clause to each workload definition statement, or alter the workload definition later.
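A sketch of a generated per-application workload with the collection clause added afterwards; the workload and application names are illustrative:

```sql
-- Group all statements from one application into a workload.
CREATE WORKLOAD WL_PAYROLL
  APPLNAME('payroll.exe')
  ENABLE;

GRANT USAGE ON WORKLOAD WL_PAYROLL TO PUBLIC;

-- Add (or later change) the history collection behavior.
ALTER WORKLOAD WL_PAYROLL
  COLLECT ACTIVITY DATA ON ALL WITH DETAILS;
```

Splitting the ALTER out like this also shows how to enable collection later on a workload that was created without it.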

How Db2 Warehouse Private and IIAS are set up to use Individual Statement History

The Db2 Warehouse Private and IBM Integrated Analytics System console is powered by Data Server Manager, and collection of Package Cache history and Individual Statement history is preconfigured and automatically set up for immediate use.

All the following is done for you:

• Storage allocated and table spaces created

• Repository tables and table space created

• Workloads predefined

• Statement history turned on in the default monitoring profile (number of statements collected limited to 10,000)

• Event monitors created

• Event monitor tables created

The Data Server Manager historical data is stored in the main dashDB Local database.

Managing data retention and securing statement information

There are two important parameters that can help you manage the amount of data collected by Data Server Manager and control access to detailed user information.

Select Settings->Monitoring Profiles and select the profile you are using for your database. Select Repository Persistence and expand SQL statement execution data.

Data Retention

The first option lets you limit the number of days that both the Package Cache history data and the individual statement history data are retained. In addition to limiting the number of statements retained for individual statement history, this lets you automatically remove statements that were collected more than a specific number of days in the past.

Normalizing Statements

If you choose to Normalize captured SQL statements, any literal values in a statement are replaced by parameter markers (question marks). If you do not choose this option, each statement with different values is treated as a unique statement. That lets you see exactly the values passed in with each statement, but it dramatically increases the volume of information collected and is not recommended for normal use. However, it may be valuable for resolving some very specific problems. Normalizing statements has the additional benefit of hiding potentially sensitive data that may be embedded in the SQL statement.
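For example, with normalization turned on, these two executions (the table and values are illustrative) collapse into a single normalized statement:

```sql
-- As executed by the application:
SELECT NAME FROM STAFF WHERE DEPT = 20 AND SALARY > 18000;
SELECT NAME FROM STAFF WHERE DEPT = 66 AND SALARY > 21000;

-- As stored when "Normalize captured SQL statements" is selected,
-- with literals replaced by parameter markers:
SELECT NAME FROM STAFF WHERE DEPT = ? AND SALARY > ?;
```

One history row per statement shape, rather than one per distinct set of literal values, is what keeps the collected volume manageable.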

Statement History Detail showing parameter marker replacement.

Directly accessing Package Cache and Individual Statement History using SQL

You can directly access the tables in the Data Server Manager historical repository database using SQL. The history that DSM collects for the package cache, as well as for individual statements, is stored in three tables:

• IBMOTS.SQL_DIM: Contains the complete SQL statement for both the Package Cache and Individual Statement monitoring.

• IBMOTS.SQL_FACT: Contains a record for each time a specific statement was found in the Package Cache

• IBM_RTMON_EVMON.EVENT_ACTIVITY: Contains a record for each statement where records were found in the Activity Event Monitor tables

Tables used to collect Individual Statement History

You can write and execute SQL statements against the DSM repository database directly to create your own reports or to combine this data with other accounting data to support activities like charging database users for their use.

To build your own SQL, start by joining the dimension table containing the SQL Text to the fact table that contains the details for each execution or each time the statement was recorded in the package cache.

You can then start to extract meaningful information by summing and grouping the data. Below is an example of how to show the total number of rows returned for each unique SQL statement for the data included in the database.
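A sketch of such a query against the repository tables named above. The table names come from this document, but the join-key and metric column names (STMT_ID, STMT_TEXT, ROWS_RETURNED) are assumptions for illustration; check the actual repository schema before relying on them:

```sql
-- Total rows returned per unique SQL statement recorded in the
-- package cache history. Column names are assumed, not verified.
SELECT d.STMT_ID,
       CAST(d.STMT_TEXT AS VARCHAR(200)) AS STMT,
       t.TOTAL_ROWS
FROM (SELECT STMT_ID,
             SUM(ROWS_RETURNED) AS TOTAL_ROWS
      FROM IBMOTS.SQL_FACT
      GROUP BY STMT_ID) AS t
JOIN IBMOTS.SQL_DIM AS d
  ON d.STMT_ID = t.STMT_ID
ORDER BY t.TOTAL_ROWS DESC;
```

Aggregating in the fact table first and joining to the dimension table afterwards avoids grouping on the (potentially large) statement text itself.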

Anything you see in the Data Server Manager interface can be accessed through the underlying SQL tables.