Rocket UniVerse

Administering the Recoverable File System

Version 12.1.1

Notices

Edition

Publication date: June 2019
Book number: UNV-1211-RFSU-1
Product version: Version 12.1.1

Copyright © Rocket Software, Inc. or its affiliates 1985–2019. All Rights Reserved.

Trademarks

Rocket is a registered trademark of Rocket Software, Inc. For a list of Rocket registered trademarks go to: www.rocketsoftware.com/about/legal. All other products or services mentioned in this document may be covered by the trademarks, service marks, or product names of their respective owners.

Examples

This information might contain examples of data and reports. The examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

License agreement

This software and the associated documentation are proprietary and confidential to Rocket Software, Inc. or its affiliates, are furnished under license, and may be used and copied only in accordance with the terms of such license.

Note: This product may contain encryption technology. Many countries prohibit or restrict the use, import, or export of encryption technologies, and current use, import, and export regulations should be followed when exporting this product.

Corporate information

Rocket Software, Inc. develops enterprise infrastructure products in four key areas: storage, networks, and compliance; database servers and tools; business information and analytics; and application development, integration, and modernization.

Website: www.rocketsoftware.com

Rocket Global Headquarters
77 4th Avenue, Suite 100
Waltham, MA 02451-1468
USA

To contact Rocket Software by telephone for any reason, including obtaining pre-sales information and technical support, use one of the following telephone numbers.

Country           Toll-free telephone number
United States     1-855-577-4323
Australia         1-800-823-405
Belgium           0800-266-65
Canada            1-855-577-4323
China             400-120-9242
France            08-05-08-05-62
Germany           0800-180-0882
Italy             800-878-295
Japan             0800-170-5464
Netherlands       0-800-022-2961
New Zealand       0800-003210
South Africa      0-800-980-818
United Kingdom    0800-520-0439

Contacting Technical Support

The Rocket Community is the primary method of obtaining support. If you have current support and maintenance agreements with Rocket Software, you can access the Rocket Community to report a problem, download an update, or find answers to FAQs. To log in to the Rocket Community or to request a Rocket Community account, go to www.rocketsoftware.com/support. In addition to using the Rocket Community to obtain support, you can use one of the telephone numbers that are listed above or send an email to [email protected].

Contents

Notices...... 2 Corporate information...... 3 Chapter 1: Introduction to the Recoverable File System...... 7 Recommended knowledge base...... 7 RFS system requirements...... 7 RFS two-process architecture (UNIX only)...... 8 Enabling and setting up RFS...... 8 Excluding individual files and accounts from RFS...... 9 Chapter 2: Logging...... 11 Setting up and configuring logging...... 11 Components of logging...... 12 Log files...... 12 Types of log files...... 12 Log ...... 13 File-level log size...... 14 Log file location...... 14 Log file overflow...... 14 The log configuration file...... 15 Checkpoints...... 17 Chapter 3: Archiving...... 19 Activating and configuring archiving...... 19 Files required for archiving...... 20 The archconfig file...... 21 The mediaconfig file...... 22 Managing archive backup...... 23 Synchronizing backups with archive files...... 23 Backing up archives automatically...... 24 Messages...... 25 Slow backups...... 25 Failed backup...... 27 Starting and stopping the database...... 27 Executing the SUSPEND.FILES ON or uv -admin -L command...... 27 Backing up archives manually...... 28 Chapter 4: Recovering from a system failure...... 30 Chapter 5: Recovering from a media failure...... 33 Data lost, logs and archives unaffected...... 33 Data and archive files unaffected, logs lost...... 36 Data and log files unaffected, archives lost...... 37 Data and logs lost, archives unaffected...... 37 Data and archives lost, logs unaffected...... 41 Logs and archives lost, data unaffected...... 44 Disk containing uvhome lost, archives unaffected...... 45 Chapter 6: Monitoring and tuning RFS...... 49 The uvsysmon utility...... 49 uvsysmon fields and values...... 50 BIG statistics section...... 50 Latching statistics section...... 51 tm status section...... 51 SHM info section...... 51


Log file statistics section...... 52 Lower portion of log file statistics section...... 52 Record info section...... 53 Trans info section...... 53 Performance tips...... 53 Tuning the N_PUT and N_BIG configuration parameters...... 54 Adjusting the log files...... 55 Adjusting the archive files...... 55 UniVerse Active File Table (AFT)...... 55 AFT sections...... 56 AFT hash buckets...... 56 UniVerse session hash buckets...... 57 UniVerse session file limit...... 57 Sizing shared memory segments...... 57 Calculating the system buffer's shared memory segment size...... 57 Calculating the system control shared memory segment size...... 58 Calculating the global lock manager shared memory segment size...... 58 Tuning system parameters for the uvdb user...... 58 Chapter 7: Troubleshooting RFS...... 59 Failure of UniVerse to start...... 59 Logs are too small...... 59 UniVerse daemon killed...... 60 Archive files are full...... 61 Parameter limits exceeded...... 61 MAX_OPEN_FILE...... 61 N_AFT...... 61 Appendix A: RFS commands and daemons...... 62 RFS commands...... 62 uvcntl_install...... 62 uvrfs.control.file...... 63 system.status file...... 63 restart.newblk file...... 64 restart.fileend file...... 64 uvforcecp...... 64 uv -admin -mediarec...... 64 uv -admin -start...... 66 uvunload...... 67 RFS daemons...... 69 uvsm...... 69 uvsh/user...... 70 uvbimglog...... 70 uvaimglog...... 70 uvarchive...... 71 uvar_backupd...... 71 uvsyncd...... 71 Appendix B: RFS configuration parameters...... 72 Modifying uvconfig parameters...... 72 UniVerse RFS parameters...... 72 AIMG_BUFSZ...... 73 AIMG_FLUSH_BLKS...... 74 AIMG_MIN_BLKS...... 74 ARCH_FLAG...... 74 ARCH_WRITE_SZ...... 74 ARCHIVE_BACKUP...... 74 BIMG_BUFSZ...... 75


BIMG_FLUSH_BLKS...... 75 BIMG_MIN_BLKS...... 75 CHKPNT_TIME...... 75 GLM_MEM_SEGSZ...... 75 GRPCMT_TIME...... 76 LOG_OVRFLO...... 76 N_AFT...... 76 N_AFT_SECTION...... 76 N_AFT_SECTION_BUCKET...... 77 N_AIMG...... 77 N_ARCH...... 77 N_BIG...... 77 N_BIMG...... 78 N_PUT...... 78 N_SYNC...... 78 NSEM_PSET...... 78 RFS_DUMP_DIR...... 78 RFS_DUMP_HISTORY...... 79 RFS_MODE...... 79 SB_PAGE_SZ...... 79 SESSION_AFT_BUCKET...... 79 SESSION_NFILES...... 79 SYNC_TIME...... 80

Chapter 1: Introduction to the Recoverable File System

Hardware and software failures, power loss, fire, or natural disaster can disrupt processing by causing loss of data consistency or loss of entire data files. The Recoverable File System (RFS) has been added to UniVerse 12.1.1 to protect your database from loss of data due to system and media failures. RFS provides protection by using before-image logging, after-image logging, archiving, and failure recovery.

When you install UniVerse, RFS is turned off by default. To enable it, you must configure the RFS_MODE parameter in uvconfig. Instructions for doing this are provided in Enabling and setting up RFS, on page 8.

RFS also supports database transaction processing semantics to provide the ACID properties (atomicity, consistency, isolation, and durability). For more information about database transaction processing semantics and the ACID properties, see the UniVerse BASIC User Guide.

Recommended knowledge base

Rocket recommends that you have the following knowledge before using RFS:

▪ Expertise with the operating system on which you are running the database.
▪ Experience with UniVerse in general, and UniVerse daemons in particular.
▪ Experience with administration concepts such as backup and restore.
▪ Knowledge of how transaction processing and RFS work together.

RFS system requirements

Running UniVerse with RFS enabled requires additional disk space for RFS logs. Caching more commonly used files in memory can improve performance, but additional memory is not required for UniVerse 12.1. The amount of additional space and memory depends on the type of applications you are running.

As of UniVerse 12.1, some shared memory segments have been created to support RFS. Although you can turn RFS off if preferred, these memory segments are still used. The use of shared memory has both drawbacks and benefits. A drawback is that exclusivity locks must be set to ensure that only one process is updating the system buffer at a time. This means that single-process operations may take slightly longer to complete. One of the benefits is better scalability of your applications running on the UniVerse 12.1 server.

To achieve optimal performance of your UniVerse 12.1 system, regardless of whether RFS is enabled or disabled, you need a basic understanding of these segments and how to size them appropriately. For more information, see Sizing shared memory segments, on page 57.

Also, starting in UniVerse 12.1, each user connected to the database has an additional process that helps facilitate run-time operations. This process runs as the uvdb user, and tunable system parameters must be configured to handle it. For instructions, see Tuning system parameters for the uvdb user, on page 58.


RFS two-process architecture (UNIX only)

As of UniVerse 12.1.1 for UNIX, RFS uses a two-process architecture, which prevents a non-root user or an error from shutting down the RFS processes.

The new RFS architecture uses a process called uvda (UniVerse database agent), which was created by merging the traditional uvsh functionality with the transaction manager (tm) process. As illustrated in the following figure, the uvsh process handles input and output, and network requests, and forwards and receives the input and output to and from its associated uvda process.

The uvda process accesses the RFS internal data structures and performs all of the database engine tasks, including interpreting BASIC and TCL/SQL, and accessing data files. By allowing direct access to the RFS internal data structures, this architecture eliminates costly inter-process communication (IPC), and significantly improves RFS performance.

Enabling and setting up RFS

By default, when you install UniVerse, RFS is turned off. You must first enable it by setting the RFS_MODE parameter in uvconfig. Then you must configure logging and, optionally, archiving. Instructions for configuring logging and archiving are provided in the following chapters.

Procedure

1. Enable RFS as follows:
   a. In a command window, change to the UniVerse home directory.
   b. Enter the following command to stop UniVerse:

      UNIX: bin/uv -admin -stop
      Windows: bin\uv -admin -stop

   c. Open the uvconfig file in a text editor, and locate the RFS_MODE parameter.
   d. Change its setting from 0 to 1, and then save and close uvconfig.
   e. Enter the following command:

      UNIX: bin/uvregen
      Windows: bin\uvregen

   f. Start UniVerse by entering the following command:

      UNIX: bin/uv -admin -start
      Windows: bin\uv -admin -start

2. Configure logging as described in Logging, on page 11.
3. Optionally, configure archiving as described in Archiving, on page 19.

Note: Although this step is optional, Rocket recommends that you use archiving.

Excluding individual files and accounts from RFS

If there are some files or accounts that you want to designate as non-RFS, or unprotected, you must enter information about them in the rfsconfig file, located in $UVHOME. Each entry in rfsconfig uses syntax that allows you to specify individual files as being unprotected. The syntax also lets you define an entire account as being unprotected, and yet still specify individual files within that account that will remain under RFS protection.

Syntax

Following is the syntax for entries in rfsconfig:

LEVEL=[FILE | ACCOUNT]
ACCOUNT=account_name | account_path
FILE=[DICT | DATA] filename[,subfile]
EXCLUDED_FILE=[DICT | DATA] […]filename[,subfile][…]

Parameters

The following table describes each parameter of the syntax:

Parameter        Description

LEVEL            Indicates whether the entry affects individual files or entire accounts.
                 ▪ LEVEL=FILE indicates that the files defined in the FILE clause are unprotected in the specified account.
                 ▪ LEVEL=ACCOUNT indicates that all files in the specified account are unprotected except for those defined in the EXCLUDED_FILE clause.
                 LEVEL=FILE is the default setting.

ACCOUNT          Specifies the name or path of the account in which files are unprotected.
                 ▪ account_name is the account name defined in UV.ACCOUNT.
                 ▪ account_path is the full path of the account. On UNIX, the path starts with a slash (/). On Windows, the path starts with a character followed by a colon (:).

FILE             Specifies individual unprotected files when LEVEL=FILE.
                 ▪ DICT indicates that dictionary files matching filename will be unprotected.
                 ▪ DATA indicates that data files matching filename will be unprotected.
                 ▪ filename is a file's entry in the VOC file of the specified account.
                 ▪ For a multilevel file, subfile specifies a subfile that will be unprotected.
                 If you do not specify DICT or DATA, files of both types that match filename are unprotected.

EXCLUDED_FILE    Specifies individual protected files within the account when LEVEL=ACCOUNT.
                 ▪ DICT indicates that dictionary files matching filename will be protected.
                 ▪ DATA indicates that data files matching filename will be protected.
                 ▪ filename is a file's entry in the VOC file of the specified account.
                 ▪ For a multilevel file, subfile specifies a subfile that will be protected.
                 ▪ … is a wildcard that can be used at the beginning and end of a filename.
                 If you do not specify DICT or DATA, files of both types that match filename are protected.

Examples

In the following example, both DICT and DATA files named TEST1, and DATA files named TEST2 will be unprotected in Account1. Account1 is defined in UV.ACCOUNT.

LEVEL=FILE
ACCOUNT=Account1
FILE=TEST1
FILE=DATA TEST2

In the following example, all files in Account2 will be unprotected except for any files whose names include "TMP," and the file whose VOC entry is TEST with the subfile TEST1.

LEVEL=ACCOUNT
ACCOUNT=/disk1/Account2
EXCLUDED_FILE=...TMP...
EXCLUDED_FILE=TEST,TEST1

Note: You can use the LIST.UNPROTECTED.FILE command to view the accounts and files that are unprotected.
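The rfsconfig rules described in this section can be illustrated with a small Python sketch. It is not part of UniVerse and does not reproduce how RFS itself reads the file: the function names are hypothetical, each clause is assumed to sit on its own line, the "..." wildcard is assumed to mean "any characters", and a rule without a subfile is assumed to apply to every subfile.

```python
from collections import namedtuple

# Hypothetical helper type: one FILE or EXCLUDED_FILE clause.
Rule = namedtuple("Rule", "kind pattern subfile")


def parse_rule(value):
    """Split '[DICT | DATA] filename[,subfile]' into its parts."""
    kind = None
    parts = value.split(None, 1)
    if len(parts) == 2 and parts[0] in ("DICT", "DATA"):
        kind, value = parts
    name, _, sub = value.partition(",")
    return Rule(kind, name.strip(), sub.strip() or None)


def name_matches(pattern, filename):
    """Assumed wildcard semantics: '...' at either end matches any characters."""
    lead = pattern.startswith("...")
    if lead:
        pattern = pattern[3:]
    trail = pattern.endswith("...")
    if trail:
        pattern = pattern[:-3]
    if lead and trail:
        return pattern in filename
    if lead:
        return filename.endswith(pattern)
    if trail:
        return filename.startswith(pattern)
    return filename == pattern


def parse_entry(text):
    """Parse one rfsconfig entry; LEVEL=FILE is the documented default."""
    entry = {"LEVEL": "FILE", "ACCOUNT": None, "FILE": [], "EXCLUDED_FILE": []}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        if key in ("FILE", "EXCLUDED_FILE"):
            entry[key].append(parse_rule(value))
        else:
            entry[key] = value.strip()
    return entry


def is_unprotected(entry, filename, kind="DATA", subfile=None):
    """Return True if the file falls outside RFS protection under this entry."""
    def hit(rules):
        return any(
            name_matches(r.pattern, filename)
            and r.kind in (None, kind)
            and r.subfile in (None, subfile)
            for r in rules
        )
    if entry["LEVEL"] == "ACCOUNT":
        # Everything is unprotected except files named in EXCLUDED_FILE.
        return not hit(entry["EXCLUDED_FILE"])
    return hit(entry["FILE"])


account2 = parse_entry(
    "LEVEL=ACCOUNT\nACCOUNT=/disk1/Account2\n"
    "EXCLUDED_FILE=...TMP...\nEXCLUDED_FILE=TEST,TEST1"
)
print(is_unprotected(account2, "ORDERS"))      # no exclusion applies
print(is_unprotected(account2, "MYTMPFILE"))   # name includes TMP
```

Under these assumptions, ORDERS is reported as unprotected and MYTMPFILE as protected, which matches the second example above. To see what UniVerse itself treats as unprotected, use the LIST.UNPROTECTED.FILE command.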

Chapter 2: Logging

Logging protects recoverable files in the event of a system failure that occurs between the time a piece of data is entered into a system and the time it is recorded in the database. For this type of failure, RFS uses before-image and after-image log files to restore the database.

Before-image and after-image logging records information about changes to your recoverable database files. If your system crashes, the before-image logs are used to restore the recoverable files to the state they were in at the last completed checkpoint. The after-image logs are then applied to restore the database to the last completed update.

For performance purposes, the after-image logs produced by the RFS daemons contain only the fields in the record that were updated, not the entire record. If archiving is enabled, the archive logs also contain only the updated fields.

You can enable before-image and after-image logging without enabling archiving. However, Rocket recommends that you enable both to provide the best protection for your recoverable data files.

Note: If you have enabled transaction processing, after-image logs are applied to the database through the last complete transaction; partial transactions are not applied.

Setting up and configuring logging

To set up and configure RFS logging, complete the following steps:

1. Perform a full backup of your system. Do not start the database after the backup.
2. Decide how many log files you want to use. You need at least two sets of log files and one file-level log. Each set should contain at least two before-image logs and at least two after-image logs.

Tip: Multiple before and after-image logs in each set allow your system to perform simultaneous writes to logs for improved performance. If you expect RFS to be heavily loaded, consider having more than two before and/or after-image logs per set.

3. Determine the following:
   ▪ The size of the log files
   ▪ The size of the file-level log
   ▪ The location of the log files
   ▪ The location of the log file overflow
4. Log in as root.
5. Create or edit the logconfig file. Detailed information and examples are provided in The log configuration file, on page 15.

Note: If you installed the database with the RFS option, a logconfig file was created in the uvhome directory. This file is a minimum configuration intended only to allow the database to start with RFS enabled. You must edit the logconfig file to define the proper path and size of your log files. If you did not install the database with the RFS option, you must manually create the logconfig file before the database will start with RFS enabled.

6. In uvconfig, make sure the following parameters are set correctly depending on your logconfig settings:
   ▪ The N_AIMG and N_BIMG values must match the number of after-image and before-image logs in each log set, respectively.
   ▪ The CHKPNT_TIME value must be greater than the GRPCMT_TIME value.
   ▪ If the block size in your logconfig file exceeds 4096, you must increase both the BIMG_BUFSZ and AIMG_BUFSZ values to be multiples of the block size.
   ▪ The file-level log size must be at least the NUSERS value plus 1.
   See UniVerse RFS parameters, on page 72 for detailed descriptions of these and other RFS parameters.
7. If you want to enable archiving now, proceed to Archiving, on page 19. Otherwise, proceed to the next step.
8. Run the uvcntl_install command. This command allocates space for your log files and initializes counters.

Note: Make sure the database is not running and that you have logged in as root before running uvcntl_install.

9. Start the database by using the uv -admin -start command with no options to implement your new uvconfig parameters.

Components of logging

The following sections describe the files, parameters, and configuration that are integral to how RFS uses logging to recover from system failures:

• Log files
  This section describes the types of log files that UniVerse uses and provides other information about the files, such as size, location, and overflow.
• The log configuration file
  The log configuration file (uvhome/logconfig) acts as an index to the log files associated with RFS. The database uses the information from logconfig to create the log files.
• Checkpoints
  The CHKPNT_TIME uvconfig parameter defines the checkpoint interval, which is the number of seconds between flushes of the system buffer to disk.

Log files

This section describes the types of log files that UniVerse uses and provides other information about the files, such as size, location, and overflow.

Types of log files

UniVerse uses the following types of log files:

▪ Before-image log files
▪ After-image log files
▪ File-level log files


Before-image log files

When you update database files, the database first writes a copy of the unaltered file blocks to a before-image log file. If your system crashes, the database uses these before-image logs to restore your files to the state they were in at the last completed checkpoint. RFS then applies the after-image logs to restore the files through the last completed update before the crash.

After-image log files

When you update database files, the database does not write the changes directly to the files. Instead, it records the changes in an after-image log file and in the system buffer. If your system crashes, the database can recover your files to a state that existed before the crash by first reading the before-image log files and writing them back into the files, and then reading the after-image log files and writing the changes recorded in them back into the files.

File-level log files

A file-level log file stores a record of operations that affect an entire file rather than affecting the contents of a file. Commands that produce entries in a file-level log file include:

▪ CREATE.FILE
▪ DELETE.FILE
▪ CLEAR.FILE
▪ CLEARFILE (BASIC statement)
▪ CNAME
▪ CREATE.INDEX
▪ DELETE.INDEX
▪ BUILD.INDEX
▪ ENABLE.INDEX
▪ DISABLE.INDEX

During crash recovery, the database uses the file-level log file to recover these actions and to prompt you to redo the ones it cannot restore.

Crash recovery attempts to recover CLEAR.FILE and any completed file-level operations automatically, except for index operations. If a file-level operation is incomplete, the database prints a message in the FileInfo file located in uvhome. Media recovery attempts to recover all file-level operations except for index operations.

Log file size

There is no set formula for calculating the size of your logs. As an initial estimate, multiply the number of records expected for update during one checkpoint interval by the largest record size. This will give you an approximation of the number of bytes needed for your logs. Divide this number by the block size you have chosen for your log files and you have an estimate of the number of blocks needed.

For optimal performance in UNIX, Rocket recommends that each of your before-image and after-image logs use the UNIX file system block size. To determine the UNIX file system block size, use the UNIX df command. If you cannot determine your UNIX file system block size, the recommended size is 4096.


Note: Rocket recommends that you refrain from using odd block sizes, such as 3K, 5K, 7K, or 9K in your logconfig file.

Warning: If the block size in your logconfig file exceeds 4096, you must also increase the AIMG_BUFSZ and BIMG_BUFSZ configuration parameters. These parameters must be a multiple of the block size defined in logconfig and cannot exceed the log block size multiplied by the log length.
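The sizing estimate and the buffer-size constraint described in this section can be worked through in a few lines of Python. This is only an illustration: the function names and the sample workload figures are made up, and rounding the block count up to a whole block is an assumption (the text only says to divide).

```python
import math


def estimate_log_blocks(records_per_checkpoint, largest_record_bytes, block_size=4096):
    """Initial estimate of log blocks needed for one checkpoint interval."""
    total_bytes = records_per_checkpoint * largest_record_bytes
    # Rounding up to a whole block is an assumption made for this sketch.
    return math.ceil(total_bytes / block_size)


def bufsz_is_valid(bufsz, block_size, log_length):
    """AIMG_BUFSZ/BIMG_BUFSZ rule: a multiple of the log block size that
    does not exceed block size multiplied by log length."""
    return bufsz % block_size == 0 and bufsz <= block_size * log_length


# Hypothetical workload: 10,000 updated records of at most 2,048 bytes each.
blocks = estimate_log_blocks(10_000, 2_048, block_size=4096)
print(blocks)  # 20,480,000 bytes / 4096 bytes per block = 5000 blocks

# Hypothetical buffer sizes checked against an 8192-byte block, 5120-block log.
print(bufsz_is_valid(16_384, 8_192, 5_120))
print(bufsz_is_valid(12_000, 8_192, 5_120))
```

Treat the result as a starting point only. Actual log usage depends on your applications, which is why the database also provides log file overflow.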

File-level log size

The size of the file-level log depends on the uvconfig parameter NUSERS. Check the uvconfig file to see the setting for this parameter. The size of the file-level log must be at least NUSERS+1 (the value of the NUSERS parameter plus one). If you change the value of NUSERS, you must change the size of your file-level log and run uvcntl_install.

Log file location

In UNIX, logs may be located on raw disk partitions or within UNIX file structures.

Note: You should place your logs on a different physical device from your recoverable data files. That way, a single disk failure will not affect both logs and data. To improve performance, you can also put the groups of before-image and after-image logs on different devices, but make sure you designate the same block size for all log files.

UNIX users may also want to create the log files as raw disk files, which are files on a disk partition or device that is not mounted in the UNIX file system. The impact of using raw disk depends on your operating system; refer to your host operating system documentation for information about using raw disk.

Log file overflow

The database allows log files to overflow to a directory path, which you can define with the LOG_OVRFLO parameter. For best results, specify a path that is on a different physical device from your files. Be sure you specify a path that has disk space available.

The LOG_OVRFLO parameter is contained in the uvconfig file, which is located in the uvhome directory. When you install the database with the RFS option, the LOG_OVRFLO parameter is automatically set with the default path of uvhome/rfslog/overflow.

Warning: If you have not defined the LOG_OVRFLO parameter, or the path defined does not exist and your log files overflow unexpectedly, the database shuts down to protect its integrity. A correlation exists between the size of your transaction and the overflow behavior of log files. Because it is virtually impossible to determine when log files might overflow, you should allow for the possibility.


The log configuration file

The log configuration file (uvhome/logconfig) acts as an index to the log files associated with RFS. The database uses the information from logconfig to create the log files. If you include RFS when you install the database, it automatically creates log files using a default log configuration file. The purpose of this default configuration file is to get your system up and running. You need to modify this file to match your system.

Contents of the logconfig file

Each line in the logconfig file represents a log file. For example:

/uvhome/rfslog/aimgbimg/a_0000 021 4096 0 5120

Each log file contains five attributes, and each attribute is separated by a tab using the following format:

filename TAB flag TAB block_size TAB start_block TAB log_length

The attributes are described in the following table:

Attribute      Description

filename       The full path and file name of the log file. You can use any file name.

flag           Select the appropriate flag:
               ▪ 021 describes an after-image log
               ▪ 022 describes a before-image log
               ▪ 0120 describes a file-level log

block_size     The block size of the log file. A UNIX block size should be a multiple of the file system block size. A raw disk file should be a multiple of the disk sector size. Rocket recommends a 4096-byte block size for both UNIX file system and raw disk logs. The block size cannot exceed 16384.

start_block    The start block offset in the log file. You can put several logical files into a single actual file by specifying the same filename but different start blocks. If you do this, make sure that the logs do not overlap.

log_length     The log file size as specified by the number of blocks in the log file.

The entries in the logconfig file must be in the following order:

▪ First group of after-image logs
▪ First group of before-image logs
▪ Second group of after-image logs
▪ Second group of before-image logs
▪ File-level log file
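To make the five-attribute format concrete, the following Python sketch parses a single logconfig line. It is an illustration only and not a UniVerse utility: the names LogEntry and parse_logconfig_line are hypothetical, and the sketch splits on any whitespace (assuming file names contain no spaces) even though the file itself is tab-separated.

```python
from typing import NamedTuple

# Flag values come from the attribute table; the labels are illustrative.
FLAG_KINDS = {"021": "after-image", "022": "before-image", "0120": "file-level"}
MAX_BLOCK_SIZE = 16384  # stated upper limit for block_size


class LogEntry(NamedTuple):
    filename: str
    kind: str
    block_size: int
    start_block: int
    log_length: int


def parse_logconfig_line(line: str) -> LogEntry:
    """Parse one logconfig line into its five attributes."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 attributes, found {len(fields)}: {line!r}")
    filename, flag, block_size, start_block, log_length = fields
    if flag not in FLAG_KINDS:
        raise ValueError(f"unknown flag {flag!r}")
    entry = LogEntry(
        filename, FLAG_KINDS[flag], int(block_size), int(start_block), int(log_length)
    )
    if entry.block_size > MAX_BLOCK_SIZE:
        raise ValueError(f"block size {entry.block_size} exceeds {MAX_BLOCK_SIZE}")
    return entry


entry = parse_logconfig_line("/uvhome/rfslog/aimgbimg/a_0000\t021\t4096\t0\t5120")
# Log size in bytes is block_size multiplied by log_length.
print(entry.kind, entry.block_size * entry.log_length)
```

For the sample line, the sketch reports an after-image log of 4096 x 5120 = 20,971,520 bytes.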

Modifying uvconfig parameters

When you customize your logconfig file, you may need to change the values of the N_BIMG, N_AIMG, BIMG_BUFSZ, AIMG_BUFSZ, and NUSERS parameters.

The N_BIMG and N_AIMG parameters are related to the number of before-image and after-image log files that you define. The BIMG_BUFSZ and AIMG_BUFSZ parameters are related to the block size of your log files. The NUSERS parameter is related to the size of the file-level log. See UniVerse RFS parameters, on page 72 for detailed descriptions of these and other RFS parameters.
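The relationships between uvconfig and logconfig that are listed in Setting up and configuring logging can be expressed as a simple consistency check. The Python sketch below is hypothetical: check_uvconfig is not a UniVerse command, the uvconfig values are passed in as a plain dictionary rather than read from the file, the sample values other than CHKPNT_TIME=300 are invented, and the file-level log size is assumed to be its log_length in logconfig.

```python
def check_uvconfig(uvconfig, aimg_per_set, bimg_per_set, block_size, filelog_length):
    """Return a list of rule violations; an empty list means no problem found."""
    problems = []
    if uvconfig["N_AIMG"] != aimg_per_set:
        problems.append("N_AIMG must match the number of after-image logs per set")
    if uvconfig["N_BIMG"] != bimg_per_set:
        problems.append("N_BIMG must match the number of before-image logs per set")
    if uvconfig["CHKPNT_TIME"] <= uvconfig["GRPCMT_TIME"]:
        problems.append("CHKPNT_TIME must be greater than GRPCMT_TIME")
    if block_size > 4096:
        # Buffer sizes only need adjusting for block sizes above 4096.
        for name in ("AIMG_BUFSZ", "BIMG_BUFSZ"):
            if uvconfig[name] % block_size != 0:
                problems.append(f"{name} must be a multiple of the log block size")
    if filelog_length < uvconfig["NUSERS"] + 1:
        problems.append("file-level log must be at least NUSERS + 1")
    return problems


# Invented sample values, for illustration only.
sample = {
    "N_AIMG": 2, "N_BIMG": 2,
    "CHKPNT_TIME": 300, "GRPCMT_TIME": 5,
    "AIMG_BUFSZ": 102400, "BIMG_BUFSZ": 102400,
    "NUSERS": 256,
}

# Two logs of each type per set, 4096-byte blocks, 257-block file-level log.
print(check_uvconfig(sample, 2, 2, 4096, 257))

# Same settings against a layout with three after-image logs and 8192-byte blocks.
for problem in check_uvconfig(sample, 3, 2, 8192, 257):
    print(problem)
```

A check like this is most useful just before you run uvcntl_install, because that command allocates the log files from the current logconfig.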

Example UNIX logconfig file with default settings

The following example shows a logconfig file containing log files that are on a UNIX file system.

# cat logconfig
/uvhome/rfslog/aimgbimg/a_0000 021 4096 0 5120
/uvhome/rfslog/aimgbimg/a_0001 021 4096 0 5120
/uvhome/rfslog/aimgbimg/b_0000 022 4096 0 5120
/uvhome/rfslog/aimgbimg/b_0001 022 4096 0 5120
/uvhome/rfslog/aimgbimg/a_0002 021 4096 0 5120
/uvhome/rfslog/aimgbimg/a_0003 021 4096 0 5120
/uvhome/rfslog/aimgbimg/b_0002 022 4096 0 5120
/uvhome/rfslog/aimgbimg/b_0003 022 4096 0 5120
/uvhome/rfslog/aimgbimg/filelog_0 0120 4096 0 257

This logconfig file identifies two sets of log files, as described in the following table:

Row        Contents
1 and 2    First log set; after-image logs.
3 and 4    First log set; before-image logs.
5 and 6    Second log set; after-image logs.
7 and 8    Second log set; before-image logs.
9          File-level log.

Notice that the starting block offset is zero, because each log is a separate file.

Example UNIX logconfig file using raw disk files

The following example shows a log configuration table for UNIX log files that are raw disk files.

Tip: Rocket recommends that the start block offset is not 0 for raw disk files.

$ more uvhome/logconfig
/dev/rdsk/0s4 021 4096 1024 1024
/dev/rdsk/0s4 021 4096 2048 1024
/dev/rdsk/0s4 021 4096 3072 1024
/dev/rdsk/0s4 022 4096 4096 1024
/dev/rdsk/0s4 022 4096 5120 1024
/dev/rdsk/0s4 022 4096 6144 1024
/dev/rdsk/0s4 022 4096 7168 1024
/dev/rdsk/0s4 021 4096 8192 1024
/dev/rdsk/0s4 021 4096 9216 1024
/dev/rdsk/0s4 021 4096 10240 1024
/dev/rdsk/0s4 022 4096 11264 1024
/dev/rdsk/0s4 022 4096 12288 1024
/dev/rdsk/0s4 022 4096 13312 1024
/dev/rdsk/0s4 022 4096 14336 1024
/dev/rdsk/0s4 0120 4096 15360 39


This logconfig file defines two sets of logs. Each set contains three after-image logs and four before-image logs, as shown in the following table:

Row              Contents
1 through 3      First log set; after-image logs.
4 through 7      First log set; before-image logs.
8 through 10     Second log set; after-image logs.
11 through 14    Second log set; before-image logs.
15               File-level log.

Notice that, because the logs are part of a raw disk partition rather than a file system, the starting block offset is required.
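When several logs share one raw device, the start block and length of each log must not overlap, as noted in the description of start_block. The following Python sketch checks that condition for the raw disk example. It is illustrative only: find_overlaps is a hypothetical helper, and it assumes that start_block and log_length are both counted in blocks of the same size.

```python
def find_overlaps(entries):
    """entries: iterable of (filename, start_block, log_length).

    Returns (filename, start_block) for every log that begins before an
    earlier log in the same file has ended.
    """
    by_file = {}
    for filename, start_block, log_length in entries:
        by_file.setdefault(filename, []).append((start_block, start_block + log_length))

    overlaps = []
    for filename, ranges in by_file.items():
        furthest_end = None
        for start, end in sorted(ranges):
            if furthest_end is not None and start < furthest_end:
                overlaps.append((filename, start))
            furthest_end = end if furthest_end is None else max(furthest_end, end)
    return overlaps


# The raw disk example: fourteen 1024-block logs followed by the file-level log.
raw_example = [("/dev/rdsk/0s4", 1024 * i, 1024) for i in range(1, 15)]
raw_example.append(("/dev/rdsk/0s4", 15360, 39))
print(find_overlaps(raw_example))

# Lengthening the first log to 2000 blocks makes it run into the second one.
broken = [("/dev/rdsk/0s4", 1024, 2000)] + raw_example[1:]
print(find_overlaps(broken))
```

The example layout produces no overlaps because each log starts exactly where the previous one ends.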

Log sets

Log sets must match each other. The number and size of before-image logs in the first set must be the same as the number and size of before-image logs in the second set, and the number and size of after-image logs in the first set must be the same as the number and size of after-image logs in the second set.

Within each set, the number of before-image logs can be different from the number of after-image logs. The before-image logs may not fill at the same rate as the after-image logs. Before-image logs contain blocks and after-image logs contain records, and one before-image block can correspond to a large number of after-image records.
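The matching rule for log sets can be checked mechanically once the logconfig entries are known. The Python sketch below is a hypothetical illustration, not a UniVerse tool: it takes entries as (kind, block_size, log_length) tuples in file order, relies on the required entry order to find the two sets, and compares log sizes in bytes.

```python
from itertools import groupby

EXPECTED_ORDER = ["after-image", "before-image", "after-image", "before-image", "file-level"]


def split_log_sets(entries):
    """Group consecutive entries of the same kind and verify the required order."""
    runs = [
        (kind, sorted(block_size * log_length for _, block_size, log_length in group))
        for kind, group in groupby(entries, key=lambda e: e[0])
    ]
    if [kind for kind, _ in runs] != EXPECTED_ORDER:
        raise ValueError("entries are not in the required logconfig order")
    return runs


def sets_match(entries):
    """True if both sets have the same number and sizes of each log type."""
    runs = split_log_sets(entries)
    first_aimg, first_bimg, second_aimg, second_bimg = (sizes for _, sizes in runs[:4])
    return first_aimg == second_aimg and first_bimg == second_bimg


# The raw disk example: three after-image and four before-image logs per set.
raw_layout = (
    [("after-image", 4096, 1024)] * 3
    + [("before-image", 4096, 1024)] * 4
    + [("after-image", 4096, 1024)] * 3
    + [("before-image", 4096, 1024)] * 4
    + [("file-level", 4096, 39)]
)
print(sets_match(raw_layout))

# Dropping one before-image log from the second set breaks the match.
print(sets_match(raw_layout[:13] + raw_layout[14:]))
```

The sketch accepts a different number of before-image and after-image logs within a set, because only the two sets have to match each other.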

Setting the LOGCONFIG environment variable

If you create your log configuration table in a location other than the default, you need to set the LOGCONFIG environment variable to the absolute path of your log configuration table. The UNIX syntax is provided in the following examples:

Syntax for C shell:

setenv LOGCONFIG /directory/logconfig

Syntax for Bourne or Korn shell:

LOGCONFIG=/directory/logconfig;export LOGCONFIG

Tip: If you are using the LOGCONFIG environment variable to identify the location of your log configuration table, that environment variable must be correctly set whenever you start the database with the uv -admin -start command. Consider setting the environment variable in a startup script.


Checkpoints

The CHKPNT_TIME uvconfig parameter defines the checkpoint interval, which is the number of seconds between flushes of the system buffer to disk.

By default, this parameter is set to 300 seconds. When the checkpoint interval is reached (or when any log file reaches 80 percent full), the checkpoint manager daemon (uvcm) performs the following series of actions:

▪ Checks the system buffer to see if any updates were performed since the last checkpoint. If so, continues with the next step. If not, waits for the next checkpoint interval (or the next time a log file reaches 80 percent full).
▪ Blocks the uvsh/user processes from initiating new transactions.
▪ Sends messages to all uvaimglog and uvbimglog daemons to flush all pages from their buffers to the log files on disk.
▪ Switches log sets, defining the set just written to as inactive and activating the second set.
▪ Marks all dirty pages in the system buffer. (A dirty page indicates that updates have been made but not yet copied to disk.)
▪ Wakes up uvsh/user processes, allowing them to initiate new transactions.
▪ If archiving is enabled, tells the archive process to save the after-image logs.
▪ Flushes dirty pages to disk, updating your database.

Note: Because uvsh/user processes are blocked from starting new transactions during a checkpoint, setting the checkpoint interval too short may impact system performance. You need to balance that impact against the fact that a longer checkpoint interval means more space required for log files.

Chapter 3: Archiving

Archiving protects files in the event of a media failure or crash. A variety of conditions can cause a file or files to become unreadable unless restored from a backup. The archiving feature maintains a complete record of updates since your last backup. When the archiving feature is enabled, in the event of a media crash, you can restore your files from a backup copy, and then apply the archives. Doing this brings your database to the state it was in when the last archive was written. Then, when you start the database, the current archive files are updated with the latest after-image log files.

Note: Although you can enable before and after-image logging without enabling archiving, Rocket recommends that you enable both to provide the best protection for your data files.

Activating and configuring archiving

You activate archiving by setting uvconfig parameters and by creating an archive configuration file (archconfig). You also need to decide how to save your archive files, and you must create a media configuration file (mediaconfig) for use during recovery.

Tip: If you need to improve system performance in UNIX, Rocket recommends that you create the archive files (not the archconfig file) as raw disk files, which are files on a disk partition or device that is not mounted in a UNIX file system. The impact of using raw disk depends on your hardware and software environment. Refer to your host operating system documentation for information about using raw disk.

1. Perform a full backup of your system. If you just finished turning on logging, you can skip this step and proceed to step 2. Otherwise, create and verify a full backup.

Important: Do not start the database when the backup completes; the database must be down for the remainder of these steps.

2. Decide how many archive files you need. Rocket recommends that you use at least two archive files, which is the default number. You may want to use more than two, especially if you know that RFS will be heavily loaded.

Note: If you are using raw disk for your archives, you can put several logical archive files into a single actual file. To do this, specify the same file name for each archive file, but specify different start blocks. Make sure that the archive files do not overlap.

3. Determine how large the archive files should be. Archive files need to be larger than log files. Log files are normally overwritten several times in an hour, whereas archive files need to hold a large number of changes. Unlike log files, archive files cannot overflow. Your smallest archive file must be larger than one full set of after-image log files. If it is not, the uvcntl_install command will fail and display an error message.

4. Determine where to locate the archive files. The optimum configuration places data, logs, and archives each on a separate physical device.

5. Log in as root.

6. Create the archconfig file using a text editor. The default path for the file is uvhome/archconfig. The archconfig file contains a list of archive files that you define using specific attributes, including filename, blocksize, start block, and log length. For more information about the archconfig file, including examples, see The archconfig file, on page 21.

7. Make sure that the value specified for the N_ARCH uvconfig parameter matches the number of archive files that you defined in the archconfig file.

8. Create the mediaconfig file using a text editor. The default path for the file is uvhome/mediaconfig. The mediaconfig file contains absolute paths for the media configuration parameters, TMP_ARCH_SPACE and TMP_CP_SPACE. TMP_ARCH_SPACE defines the location in which archive files will be loaded when you execute the uv -admin -mediarec command. TMP_CP_SPACE defines the location in which the contents of the archive will be copied. For an example of a mediaconfig file and recommended settings for these parameters, see The mediaconfig file, on page 22.

9. If you want to back up your archives automatically, perform the following steps. (If you prefer to back up the archives manually, skip to step 10.)
   a. In uvconfig, set the ARCHIVE_BACKUP parameter to 1.
   b. Configure the uvarch_backup and uvarch_restore scripts with your tape device information. These scripts are located in uvhome/bin.
   c. Set up the archive tape device.
   For detailed information about configuring automatic backup, see Backing up archives automatically, on page 24. You can now proceed to step 11.

10. If you want to back up the archives manually, you must back them up whenever the set fills. The system displays messages in one or both of the following locations:
   ▪ In the uvsm.log file.
   ▪ To the terminal (or window) from which you executed uv -admin -start, if that terminal or window remains available.

Tip: UNIX: If you are monitoring uvsm.log, use the UNIX tail -f command. This command allows you to see messages as they are written to the log.

For additional information about manual backup, see Backing up archives manually, on page 28.

11. In uvconfig, set the ARCH_FLAG parameter to 1 to turn on archiving.

12. Run uvcntl_install. This command allocates space for your archive files and initializes the logical sequence numbers in the uvrfs.control.file. The uvcntl_install command initializes both logging and archiving.

13. Start the database. Use uv -admin -start with no options to implement your new uvconfig parameters.

Files required for archiving

The following sections describe the files that you must create to enable archiving functionality:

• The archconfig file
  To activate archiving, you must create an archive configuration file named archconfig in uvhome that contains a number of archive files. The archconfig file acts as an index to the archive files that the database uses to recover transaction files after a media crash. The archive files that you define in the archconfig file store a chronological record of changes to your files.

• The mediaconfig file
  During recovery from a media crash, a media configuration file called mediaconfig is used to determine where on your system to load archive files and after-image logs before applying them to the database.

The archconfig file

To activate archiving, you must create an archive configuration file named archconfig in uvhome that contains a number of archive files. The archconfig file acts as an index to the archive files that the database uses to recover transaction files after a media crash. The archive files that you define in the archconfig file store a chronological record of changes to your files. The database copies each after-image log set to an archive file before overwriting the log set. When the first archive file fills up, the database moves to the second file, and so on. As each file fills, the database assigns it a logical sequence number (LSN), which is used by the uv -admin -mediarec command to identify archive files for recovery. When all of the archive files have filled, the database waits until they have been copied to reliable storage before allowing any further processing. You can back up the archives manually when the system prompts you, or you can enable automatic archive backup. The archiving system alerts you to move archive files to storage, and tells you how to label the files.

Contents of the archconfig file

Each line in the archconfig file represents an archive file. For example:

/uvhome/rfslog/archive/arch0 4096 0 102400

When defining an archive file, separate each attribute with a tab. For example:

filename TAB block_size TAB start_block TAB log_length

Each archive file is defined using four attributes, which are described in the following table:

Attribute      Description
filename       The full path and file name of the archive file. You can use any file name.
block_size     The block size of the archive file. A UNIX block size should be a multiple of the file system block size. A UNIX file size should be a multiple of the file system block size. A raw disk file should be a multiple of the disk sector size. Rocket recommends that you use a 4096-byte block size for both UNIX file system and raw disk archives. The block size cannot exceed 16,384.
start_block    The start block offset in the archive file in UNIX.
log_length     The archive file size as specified by the number of blocks in the archive file.

Example UNIX archive configuration table

Following is an example of an archconfig file in UNIX:

$ more uvhome/archconfig
/uvhome/rfslog/archive/arch0 4096 0 102400
/uvhome/rfslog/archive/arch1 4096 0 102400
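The attribute rules above can be checked mechanically before you run uvcntl_install. The following is a minimal sketch, assuming the file is tab-separated as described and that empty lines may be ignored (an assumption; this guide does not say how the database treats them). The function name check_archconfig is hypothetical and is not a UniVerse command. For each valid line it prints the archive size in bytes, calculated as block_size multiplied by log_length.

```shell
#!/bin/sh
# check_archconfig FILE
# Hypothetical helper (not a UniVerse command): checks each line of an
# archconfig-style file for four tab-separated attributes and a block size
# of at most 16384, then prints the archive size in bytes.
check_archconfig() {
    awk -F '\t' '
        NF == 0 { next }    # assumption: empty lines are ignored
        NF != 4 {
            printf "line %d: expected 4 tab-separated attributes, found %d\n", NR, NF
            bad = 1; next
        }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || $4 !~ /^[0-9]+$/ {
            printf "line %d: block_size, start_block, and log_length must be whole numbers\n", NR
            bad = 1; next
        }
        $2 + 0 > 16384 {
            printf "line %d: block_size %d exceeds 16384\n", NR, $2
            bad = 1; next
        }
        # size in bytes = block_size * log_length
        { printf "%s: %.0f bytes\n", $1, $2 * $4 }
        END { exit (bad + 0) }
    ' "$1"
}
```

For the arch0 entry in the example above, 4096 x 102400 gives 419430400 bytes. The function returns a non-zero exit status if any line fails a check, so it can be used as a guard in a setup script.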

Example UNIX archive configuration table for raw disk archive files

The following example shows an archconfig file for four archive files that are raw disk files:

Warning: Rocket recommends that the start block offset is not 0 for raw disk files.

$ more uvhome/archconfig
/dev/rdsk/0s5 4096 8192 8192
/dev/rdsk/0s5 4096 16384 8192
/dev/rdsk/0s5 4096 24576 8192
/dev/rdsk/0s5 4096 32768 8192

Note: If you are using raw disk partitions, and you put your archive files on the same raw disk partition as your log files, do not give the archive files the same name and start block as the log files. Give your archive files start blocks that will not conflict with each other, or with your log files.
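One way to check that logical archive files sharing a raw device do not overlap is sketched below. It assumes that start_block and log_length are both counted in blocks, which is consistent with the example above (each start block equals the previous start block plus 8192), and that entries sharing a file use the same block size. The function name find_overlaps is hypothetical. The sketch compares archconfig entries only with each other; it does not read the log configuration table, so conflicts with log files on the same partition still need to be checked separately.

```shell
#!/bin/sh
# find_overlaps FILE
# Hypothetical helper (not a UniVerse command): reports pairs of entries
# that name the same file and whose block ranges
# [start_block, start_block + log_length - 1] intersect.
find_overlaps() {
    awk -F '\t' '
        NF == 4 {
            n++
            name[n]  = $1
            first[n] = $3 + 0
            last[n]  = $3 + $4 - 1
            for (i = 1; i < n; i++)
                if (name[i] == name[n] && first[n] <= last[i] && first[i] <= last[n]) {
                    printf "lines %d and %d overlap on %s\n", line[i], NR, $1
                    bad = 1
                }
            line[n] = NR
        }
        END { exit (bad + 0) }
    ' "$1"
}
```

Run against the four-entry raw disk example, the function prints nothing and returns 0, because each range ends one block before the next one starts.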

Setting the ARCHCONFIG environment variable

If you create your archconfig file somewhere other than the default location, set the ARCHCONFIG environment variable to the full path of the file.

Syntax (C shell):

setenv ARCHCONFIG /directory/archconfig

Syntax (Bourne and Korn shell):

ARCHCONFIG=/directory/archconfig;export ARCHCONFIG

The mediaconfig file

During recovery from a media crash, a media configuration file called mediaconfig is used to determine where on your system to load archive files and after-image logs before applying them to the database.

The default path for the media configuration file is uvhome/mediaconfig.

Contents of the mediaconfig file

The mediaconfig file contains the media configuration parameters TMP_ARCH_SPACE and TMP_CP_SPACE. TMP_ARCH_SPACE defines the absolute path to a location in which archive files will be loaded when you execute the uv -admin -mediarec command. TMP_CP_SPACE defines the absolute path to a location in which the contents of the archive will be copied.

The following example shows a mediaconfig file:

$ more mediaconfig
TMP_ARCH_SPACE=/uvhome/rfslog/mediarec_tmp/ARCH
TMP_CP_SPACE=/uvhome/rfslog/mediarec_tmp/CP

Note: You must define absolute paths including file names. You do not have to create the ARCH and CP files beforehand. The uv -admin -mediarec command creates them during execution.

Locate TMP_ARCH_SPACE on a disk with enough space to hold the largest archive file. To calculate the amount of space needed (in bytes), refer to your archive configuration table, identify the largest file, and multiply the block size by the file length.

Locate TMP_CP_SPACE on a disk with enough space to hold the largest set of after-image log files, with some room left over for log file overflow. To calculate the amount of space needed, complete the following steps:

1. Refer to your log configuration table. Add together the log lengths of each after-image file in your first log set.
2. Multiply the sum of lengths by the block size to calculate the space (in bytes) that you need.

Make sure the location of TMP_CP_SPACE has additional space available to handle log overflows.
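The TMP_ARCH_SPACE calculation can be sketched as a short awk command over the archive configuration table. This is an illustration only: the function name largest_archive_bytes is hypothetical, and the sketch assumes the tab-separated archconfig layout described in The archconfig file. The TMP_CP_SPACE calculation is not sketched here because it depends on the layout of the log configuration table.

```shell
#!/bin/sh
# largest_archive_bytes FILE
# Hypothetical helper (not a UniVerse command): prints the size in bytes of
# the largest archive defined in an archconfig-style file, calculated as
# block_size (field 2) multiplied by log_length (field 4).
largest_archive_bytes() {
    awk -F '\t' '
        NF == 4 { size = $2 * $4; if (size > max) max = size }
        END     { printf "%.0f\n", max }
    ' "$1"
}
```

For an archive defined with a block size of 4096 and a log length of 102400, the result is 4096 x 102400 = 419430400 bytes (400 MB). You could compare this figure with the free space reported by a command such as df for the file system that holds TMP_ARCH_SPACE.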

Creating a MEDIACONF environment variable

If you create your media configuration file somewhere other than the default location, set the MEDIACONF environment variable to the full path of the file.

Syntax for C shell in UNIX:

setenv MEDIACONF /directory/mediaconfig

Syntax for Bourne or Korn shells in UNIX:

MEDIACONF=/directory/mediaconfig;export MEDIACONF

If you are using the MEDIACONF environment variable to identify the location of your media configuration file, that environment variable must be set correctly whenever you run uv -admin -mediarec. In UNIX, consider setting the environment variable in a startup script.

Managing archive backup

This section provides information about the importance of keeping your backups and archive files synchronized. It also provides in-depth descriptions of the processes for configuring automated archive backup and for backing up archives manually.

Synchronizing backups with archive files

You must keep your archives synchronized with your backups so that the current archive set starts as of the most recent backup.

If you stop the database before you perform backups, execute the uvcntl_install command after you have completed and verified your backup. The command clears out the archive files and re-initializes counters so that when you start the database, you will begin writing to your first archive file.

If you executed the SUSPEND.FILES ON or uv -admin -L command prior to performing your backup, you do not need to execute the uvcntl_install command. The after-image logs are written to the archive files, and the time at which the after-image logs were flushed, along with the next logical sequence number (LSN), are written to the uvsm.log file. This information is necessary if you need to run the uv -admin -mediarec command using the archive files created after the backup commenced.

Backing up archives automatically

With automated archive backup, the need for intervention should be rare.

Tip: If you are using automatic archive backup, writing to tape can be a slow process. Consider using automatic backup to copy your archives to disk files rather than to tape. You can off-load the disk files to tape at your convenience.

When automated backup has been enabled, a daemon called uvar_backupd invokes a user-customized backup script in the background whenever an archive file fills. No user input is required unless your script fails for some reason, or an archive takes too long to back up. The uvar_backupd daemon writes certain messages to uvsm.log. The database also writes messages that require intervention to the window where you executed uv -admin -start (if it is available). The automated backup and restore scripts write messages to files called uvarch_backup.out and uvarch_restore.out respectively, in the same directory as the scripts. Use these to monitor and verify your backups and restores.

Procedure

1. To enable automated archive backup, change the value of the ARCHIVE_BACKUP parameter in the uvconfig file to 1.

2. Set up the backup and restore scripts. The database provides default scripts to back up your archive files to tape. (UNIX scripts use the dd command.) The default paths for these scripts are:
   ▪ uvhome/bin/uvarch_backup
   ▪ uvhome/bin/uvarch_restore
   If you use these default scripts, modify them to set the archive backup device to the actual device name you are using. In UNIX, the archive backup device is $TAPEDEV in each script and should be set to the actual UNIX device name.
   You can create your own backup and restore scripts if you prefer. Be sure the scripts are compatible, so that archive backups can be read by the restore script. Use the standard exit code scheme: code 0 indicates a successful exit, and a non-zero code indicates a failure.

Note: If the path and file names for your archive and restore scripts/routines are different from the default, identify them with environment variables or add them as parameters in the uvconfig file.

Note: If you are using environment variables to identify the location of your uvarch_backup and uvarch_restore scripts, those environment variables must be correctly defined whenever you run the uv -admin -mediarec command. In UNIX, consider setting the environment variables in a startup script.

3. If you are backing up to tape, set up the archive tape device. Rocket recommends that you make a dedicated device available for fast and dependable backups. The sample scripts do not open the device exclusively, so if another application uses the device, archive files may be overwritten. If you are backing up to disk, Rocket recommends periodically saving the backed up files from disk to tape or to another long-term archive medium.
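If you write your own backup script to copy archives to disk, its core could look something like the following sketch. This is not the supplied uvarch_backup script. The function name offload_to_disk and its three positional parameters (archive path, destination directory, and LSN) are hypothetical, because this guide does not describe how uvar_backupd passes information to the script. The sketch assumes a file system archive that can be read from its first byte; a raw disk archive with a non-zero start block would need additional dd options. It follows the exit code scheme described in step 2.

```shell
#!/bin/sh
# offload_to_disk ARCHIVE DEST_DIR LSN
# Hypothetical helper (not the supplied uvarch_backup script): copies one
# archive file to DEST_DIR, labels the copy with its LSN, and verifies it.
# Returns 0 on success and non-zero on any failure.
offload_to_disk() {
    src=$1; dest_dir=$2; lsn=$3
    dest="$dest_dir/$(basename "$src").lsn$lsn"
    # Refuse to overwrite an earlier backup that carries the same label.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    # Copy with dd, as the default UNIX scripts do; 4096 is the block size
    # that this guide recommends for archive files.
    dd if="$src" of="$dest" bs=4096 2>/dev/null || return 1
    # Verify the copy byte for byte before reporting success.
    cmp -s "$src" "$dest" || { echo "verification failed for $dest" >&2; return 1; }
}
```

Refusing to overwrite an existing copy and verifying with cmp are design choices made for this sketch. They guard against the overwriting risk mentioned in step 3, at the cost of reading the archive a second time.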

Messages

Messages for automated backup display in uvsm.log, and in the window from which you executed uv -admin -start (if available). You should check them from time to time; if your backup script fails for any reason, you need to intervene. All messages are displayed in uvsm.log. Messages that require you to take action also go in the uv -admin -start window. In UNIX, use the UNIX tail -f command if you are monitoring the uvsm.log file. This command allows you to see messages as they are written to the log. When the automated backup is running without problem or delay, the database only writes messages to the uvsm.log file. As each file fills, a message is written to the uvsm.log as shown in the following examples:

The archive file uvhome/rfslog/archive/arch0 is full. The Logical Sequence Number (LSN) of this archive is –- 0

When the uvar_backupd process begins to off-load the file, the database writes the following message to the uvsm.log file:

starting to offload archive uvhome/rfslog/archive/arch0 (LSN = 0)
ar_backupd, Mon Oct 23 16:33:07 2017: file (LSN = 0) off-loaded.

At this point, your archive backup script is running while the database writes to the next archive file in your archive configuration table. The system monitors the backup process, and writes messages like the following example to the uvsm.log file at intervals until the process completes:

The archive file uvhome/rfslog/archive/arch1 is full. The Logical Sequence Number (LSN) of this archive is –- 1

When the first backup completes, uvar_backupd writes a message to the uvsm.log file, then checks to see if the next file is ready to offload. When the next file is ready, the backup begins, as shown in the following example:

starting to offload archive uvhome/rfslog/archive/arch1 (LSN = 1)
ar_backupd, Mon Oct 23 16:35:57 2017: file (LSN = 1) off-loaded.

Slow backups

If backing up an archive file takes more than 10 minutes, the database writes a message to the uvsm.log, and to the uv -admin -start window. The message looks like the following example:

uvar_backupd, Wed Oct 25 14:26:15 2017: waiting for the offloading of file (LSN = 0) to complete ...
uvar_backupd, Wed Oct 25 14:26:17 2017: The uvhome/bin/uvarch_backup script (pid 16395) off-loading the file uvhome/rfslog/archive/arch0 (LSN 0) has taken more than 10 minutes without completing. Check the status of the script and take appropriate corrective action. The output of the script can be found in the file uvhome/bin/uvarch_backup.out.

At this point, complete the following three steps:

1. End the archive backup script process.
2. Back up the archive file manually, and label it with the correct LSN.
3. Create a file named DONE. The following example shows how to create the DONE file in UNIX:

$ touch uvhome/DONE

After you back up the archive file manually and create the DONE file, uvar_backupd begins backing up the next archive file that is full.

If your archive backup script off-loads files more slowly than the database fills them, the database may need to write to an archive that has not yet been backed up. In that case, the database generates messages similar to the following examples:

uvar_backupd, Wed Oct 25 14:17:05 2017: off-loading of archive uvhome/rfslog/archive/arch0 (lsn 0) must be completed before system can progress.

This first message displays at the uv -admin -start window and in the uvsm.log file. The following message appears in the uvsm.log file:

ARCH: waiting for the file (uvhome/rfslog/archive/arch0) to be off-loaded ...

At the point where these messages display, database processing will wait until the file is off-loaded. If the backup completes, processing will automatically resume.

Note: While the database processing is waiting, many database commands (even including uv -admin -stop) are blocked. It will appear as if the system is hung.

If you see these messages, and the off-load does not complete in a few minutes, perform the following steps:

1. Check the status of the backup script/process. If the process is running normally, proceed to step 2. Otherwise, go to step 3.

2. Check the output file uvarch_backup.out. If it indicates the backup has failed, proceed to step 4. If it does not indicate a problem, you can either wait longer for the backup to complete or go to step 4.

3. Identify and resolve external problems. If it is possible to unblock the process, do so. The database should resume normal processing. If you cannot unblock the process, go to step 4.

4. Log on as root. Use the kill command to kill the backup script process. Kill any child processes of the backup script process. Make sure no copies of the backup script are running. Refer to your host operating system documentation for information about checking process status, unblocking a process, and ending/killing a process. If you end/kill the backup script/process, the database writes additional instructions to the uvsm.log, the uv -admin -start window, and the console. The instructions look like the following example:

The archive file uvhome/rfslog/archive/arch0 can not be off loaded. You may want to make sure that this file has been saved. Also, label it Logical Sequence Number (LSN) – 0 Please create uvhome/DONE file (as root) when done...

5. Copy or back up the file.

6. Create the DONE file. Following is an example of how to create the DONE file in UNIX:

$ touch uvhome/DONE

Once you save the archive file and create the DONE file, the database processing should resume normally. If the backup the system is waiting for takes more than ten minutes to complete, the database will display the messages described earlier for slow backup, prompting you to off-load the file and create a DONE file.

Failed backup

If your archive backup script fails to complete, you will see messages similar to the following example:

The archive file uvhome/rfslog/archive/arch0 can not be off loaded. You may want to make sure that this file has been saved. Also, label it Logical Sequence Number (LSN) – 0 Please create uvhome/DONE file (as root) when done...

These messages display in the uvsm.log file, at the uv -admin -start window, and at the console. Complete the following steps to resolve the problem:

1. Check the output file from the script, uvarch_backup.out. Identify and resolve the problems that caused the script to fail.
2. Manually back up the file that could not be off-loaded, and label it with the correct LSN.
3. Create the DONE file.

Once you have corrected the problem, saved the archive file, and created the DONE file, database processing should continue normally.

Note: If a backup script process fails or you kill it, there is a possibility that the process partly backed up the archive file before failing. If you need to restore that archive file during a media recovery, be sure to use the copy you backed up manually.
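Because the messages that call for intervention are all written to uvsm.log, a periodic scan of that file can serve as a backstop when the uv -admin -start window is not available. The following sketch searches for three phrases taken from the example messages in this section. The function name needs_attention is hypothetical, and the sketch assumes each phrase appears on a single line of the log, which may not hold if the database wraps long messages.

```shell
#!/bin/sh
# needs_attention LOGFILE
# Hypothetical helper (not a UniVerse command): prints, with line numbers,
# any uvsm.log lines that match messages described in this section as
# requiring intervention. Returns 0 if at least one line matches.
needs_attention() {
    grep -n \
        -e 'has taken more than 10 minutes' \
        -e 'must be completed before system can progress' \
        -e 'can not be off loaded' \
        "$1"
}
```

A scheduled job could call the function on uvhome/uvsm.log and notify an administrator when it returns 0.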

Starting and stopping the database

Before you stop the database with the uv -admin -stop command, make sure all archive-to-tape copy operations are complete. This ensures that all filled archives have been backed up before you stop the database, which preserves the ordering of your archives and helps ensure a smooth start when you next execute uv -admin -start. Use the UNIX ps command to make sure there are no copies of your backup script running, and check the uvsm.log file to be sure all filled archives have been off-loaded.

If you execute uv -admin -stop while your archive backup script is still running, the system will not identify the file being backed up as “off-loaded.” The next time the database needs to write to that file, you may see some puzzling messages, and you will have to manually back up the file.

Warning: If your system crashes with some or all archives full (but not off-loaded), you may experience delays or system hangs when you start the database. You should still be able to recover, but you must perform additional manual steps. Keep careful note of each step you perform to make sure you have preserved all your archives. Contact your VAR or Rocket Software technical support if you need assistance.
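The ps check described above can be wrapped in a small guard, sketched here. The function name backup_running is hypothetical, and the sketch assumes your backup script keeps the default name uvarch_backup; adjust the pattern if you use your own script. The function reads a process listing on standard input so that it can be combined with whichever ps options your platform supports.

```shell
#!/bin/sh
# backup_running
# Hypothetical helper (not a UniVerse command): reads a process listing on
# standard input and returns 0 if any line mentions the backup script.
# The pattern is written as [u]varch_backup so that the grep process itself
# never matches if it appears in the listing.
backup_running() {
    grep '[u]varch_backup' >/dev/null
}

# Example use before stopping the database:
# if ps -ef | backup_running; then
#     echo "archive backup still running; wait before uv -admin -stop" >&2
# fi
```

This covers only the process check. You still need to confirm in uvsm.log that all filled archives have been off-loaded.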

Executing the SUSPEND.FILES ON or uv -admin -L command

When you execute the SUSPEND.FILES ON or uv -admin -L command to block updates to your system, generally to perform a backup, the database forces a checkpoint, flushes the after-image logs to the archive files, and marks the next available logical sequence number (LSN) in the archive file for use after the backup. The database displays this information on the screen where you execute SUSPEND.FILES ON or uv -admin -L, and writes it to uvhome/uvsm.log.

After you perform a system backup, the archives created prior to the backup are no longer needed. If you need to run the uv -admin -mediarec command after the system backup, it is important to know the time of the checkpoint after you execute SUSPEND.FILES ON or uv -admin -L, and which LSN the database will use when you execute SUSPEND.FILES OFF or uv -admin -U. The following example shows the output from the uv -admin -L command:

# uv -admin -L
CheckPoint time before ForceCP: Wed Nov 1 16:50:34 2017
.CheckPoint time after ForceCP: Wed Nov 1 16:52:09 2017
.CP has been forced successfully.
Forcearch completed, the next LSN is 10.

There is no need to stop the database to issue the uvcntl_install command if you use SUSPEND.FILES ON or uv -admin -L in conjunction with your system backups. For more information about the SUSPEND.FILES ON or uv -admin -L commands, see the UniVerse manuals, Administering UniVerse on Windows and UNIX Platforms and User Reference.
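Because the next LSN matters for a later media recovery, you may want your backup procedure to record it alongside the backup. The following sketch extracts the number from text in the format of the example output above. The function name next_lsn is hypothetical, and the sketch assumes the final line of the output always has the form shown in the example.

```shell
#!/bin/sh
# next_lsn
# Hypothetical helper (not a UniVerse command): reads the output of
# uv -admin -L on standard input and prints the LSN from a line of the form
# "Forcearch completed, the next LSN is N."
next_lsn() {
    sed -n 's/.*the next LSN is \([0-9][0-9]*\)\..*/\1/p'
}

# Example use, assuming the command writes its report to standard output:
# uv -admin -L | tee suspend.out | next_lsn
```

The same information is written to uvhome/uvsm.log, so the function can also be applied to a saved copy of that file.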

Backing up archives manually

If you do not use automated archive backup, you must back up your archive files manually whenever the log set fills.

As each file fills, the database writes a message to uvsm.log. Messages requiring intervention also go to the uv -admin -start window. During initial setup, decide how to configure your system so that you are sure to see the messages. If you do not start the database from a terminal (for example, if the database starts automatically when you boot your system) and your configuration does not include a console, you have to monitor uvsm.log on a regular basis to receive the messages. If all the archives fill and you do not back them up, database processing eventually stops.

The message written when a file fills looks like the following example:

archive,Tue Oct 24 13:43:57 2017: The archive file uvhome/rfslog/archive/arch0 is full. The Logical Sequence Number (LSN) of this archive is –- 0

If you are monitoring the uvsm.log file, you can back up the file as soon as you see the message or wait until all the files fill. When all the files fill, the database writes a message to the uvsm.log file, the console, and the uv -admin -start window. This message looks like the following example:

The current set of archive files is FULL. You may want to make sure that these files have been saved. Also, label them from Logical Sequence Number (LSN) 0 thru 1 in the order they appear in uvhome/archconfig file. Please create uvhome/DONE file when done...

When you see this message, you need to back the files up promptly, because processing stops when all archives are full. Once you complete the backup and create the DONE file, processing resumes.

Note: The message is the same regardless of whether you backed up the files as they filled. Whether you back them up one at a time or after the set fills, you need to create the DONE file before the database can begin reusing the archives.

The following example shows how to create the DONE file:

# touch uvhome/DONE

You can back up the files to tape, disk, or any reliable storage. Because writing to tape can be a slow process, you may want to consider copying your archive files to another location on disk and creating the DONE file, then off-loading the files to tape at your convenience.
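The copy-to-disk approach for a full set of archives can be sketched as follows. The function name backup_full_set and its parameters are hypothetical. The sketch assumes file system archives (one file per archconfig line, with no spaces in the path) and labels each copy with consecutive LSNs in archconfig order, starting from the first LSN given in the message. It creates the DONE file only after every copy succeeds.

```shell
#!/bin/sh
# backup_full_set ARCHCONFIG DEST_DIR FIRST_LSN DONE_FILE
# Hypothetical helper (not a UniVerse command): copies every archive listed
# in ARCHCONFIG to DEST_DIR, labelling the copies with consecutive LSNs in
# the order the archives appear, then creates DONE_FILE.
backup_full_set() {
    archconfig=$1; dest_dir=$2; lsn=$3; done_file=$4
    # The first field of each line is the archive file name. Default word
    # splitting is used, so paths that contain spaces are not handled.
    while read -r file rest; do
        [ -n "$file" ] || continue
        cp -p "$file" "$dest_dir/$(basename "$file").lsn$lsn" || return 1
        lsn=$((lsn + 1))
    done < "$archconfig"
    # Signal the database only after every copy has succeeded.
    touch "$done_file"
}

# Example use for the message shown above (LSN 0 thru 1); the destination
# directory is a placeholder:
# backup_full_set uvhome/archconfig /backup/archives 0 uvhome/DONE
```

Raw disk configurations that place several logical archives in one device are outside the scope of this sketch, because a plain cp cannot separate them by start block.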

Chapter 4: Recovering from a system failure

Some failures interrupt processing between the time a piece of data is entered into a system and the time it is recorded in the database, causing the data image that is in shared memory not to match what is in the database. These failures (called system failures or system crashes in this document) interrupt processing without damaging files, and the database uses logging to recover.

About this task

Before-image and after-image logging protects data by recording information about changes made to your files. If your system crashes, the database uses the before-image logs to restore your files to the state they were in at the last completed checkpoint. Then the database applies the after-image logs to restore itself up to the last completed update.

Note: If you have enabled transaction processing, the database applies after-image logs through the last complete transaction. No partial transactions are written.

When you restart the database after a system failure, the system monitor (uvsm) detects that a crash occurred. The database automatically does the following:

▪ Identifies the log set that was active at the time of the crash.
▪ Applies all before-image blocks in the current before-image log.
▪ Reviews the current after-image log set and applies changes as appropriate.
▪ Reviews the file-level log and performs those operations that can be recovered automatically. The database writes a message to the uvsm.log directing you to a file called FileInfo in uvhome, which lists those file-level operations that cannot be recovered automatically.
▪ Writes the after-image log files to the archive files.

Procedure

1. Preserve uvsm.log. While the database is down, make a copy of uvhome/uvsm.log. Look at the log for information related to the crash.

2. Execute the uv -admin -start command with no options.

Note: If your log files are very large or have overflowed, or if you have automatic archive backup turned on, crash recovery may take several minutes.

3. Verify the recovery. After you start the database, check the current uvhome/uvsm.log file. The following example shows what the uvsm.log looks like when you do not need to handle file-level operations:

----- SM (39137) is started at Oct 26 2017 10:57:06 -----

UniVerse Environment : uvhome

Thu Oct 26 10:57:06 SM: Restart_Flag = 1.
Replication is off.
Undo logfile[2] .....
The logfile[2] has been scanned.
Undo logfile[2] has been undone.
Undo logfile[3] .....
The logfile[3] has been scanned.
Undo logfile[3] has been undone.
Total undo log records : 33330
Old blocks processed : 33330
SM: U_Ginfo->sb_mode=3
Starting to restart ......
restart: Report of Log-file Status.

   Type           Checkpoint-Number   Status
0  after-image    0                   0
1  after-image    0                   4
2  before-image   0                   4
3  before-image   0                   4
4  after-image    0                   4
5  after-image    0                   4
6  before-image   0                   4
7  before-image   0                   4
Link back all new blocks ....
Link new blocks successful !
Start the redo processing ......
Total redo log records : 663416
Update operations : 663416

Finish the redo processing. All undo/redo recovery processing has been finished!!

Redo logfiles finished.

Step3: Check the file level commands...

File level commands checking finished.

Restart is successful!!!

****!!! Restart Finished !!!****
Checking log files .....
The system is running normally.
SM: U_Ginfo->sb_mode=3

If your uvsm.log looks like this first example, go to step 5. The next example shows what uvsm.log looks like if you need to repeat file-level operations:

----- SM (43550) is started at Oct 26 2017 14:37:23 -----

UniVerse Environment : uvhome

Thu Oct 26 14:37:23 SM: Restart_Flag = 1.
Replication is off.
Undo logfile[2] .....
The logfile[2] has been scanned.
Undo logfile[2] has been undone.
Undo logfile[3] .....
The logfile[3] has been scanned.
Undo logfile[3] has been undone.
Total undo log records : 122779
Old blocks processed : 113476
New blocks processed : 9303
SM: U_Ginfo->sb_mode=3
SM: U_Ginfo->sb_mode=3
Starting to restart ......
restart: Report of Log-file Status.

   Type          Checkpoint-Number  Status
0  after-image   2                  0
1  after-image   2                  0
2  before-image  2                  0
3  before-image  2                  0
4  after-image   1                  4
5  after-image   1                  0
6  before-image  1                  0
7  before-image  1                  0
Link back all new blocks ....
Link new blocks successful !
Start the redo processing ....

Finish the redo processing.
All undo/redo recovery processing has been finished!!

Redo logfiles finished.
Step3: Check the file level commands...
1 sessions were doing file level commands when system was crashed, please check 'uvhome/FileInfo' for detailed information.

File level commands checking finished.

Restart is successful!!!

****!!! Restart Finished !!!****
Checking log files .....
The system is running normally.
SM: U_Ginfo->sb_mode=3

If your uvsm.log looks like this, you need to complete step 4.

Note: A before-image log is sometimes called an undo log, and an after-image log is sometimes called a redo log. You will notice that convention in these examples, and in other messages in the uvsm.log.
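The difference between the two examples above comes down to one message. The following Python sketch shows one way to check a uvsm.log for it. It is an illustration only: the function name restart_summary is hypothetical, and the matched strings are taken from the sample output in this chapter, so they may differ on other releases.

```python
import re

# Message text taken from the sample uvsm.log output in this chapter.
FILE_OPS_RE = re.compile(r"(\d+) sessions were doing file level commands")
START_MARKER = "----- SM ("


def restart_summary(log_text: str) -> dict:
    """Summarize the most recent restart recorded in uvsm.log text."""
    # Only look at the text written since the last SM start banner.
    start = log_text.rfind(START_MARKER)
    latest = log_text[start:] if start != -1 else log_text
    match = FILE_OPS_RE.search(latest)
    sessions = int(match.group(1)) if match else 0
    restarted = "Restart is successful" in latest
    if not restarted:
        next_step = None  # recovery did not report success; investigate first
    elif sessions:
        next_step = 4     # repeat the file-level operations in FileInfo
    else:
        next_step = 5     # resume normal processing
    return {"restarted": restarted, "sessions": sessions, "next_step": next_step}
```

The next_step value refers to the numbered steps of this procedure. A result of None means the log did not report a successful restart and needs to be read by hand.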

4. Repeat file-level operations. If your uvsm.log directs you to check the uvhome/FileInfo file, you may need to perform additional steps before letting your users access the database. The following example shows a FileInfo file from a crash recovery:

LCT:1, account-path:uvhome, had 1 unfinished file operation when the system was crashed.
code=31 (RESIZE) dictflg=0, directfname=0.
owner=root key=TEST
content=RESIZE TEST 3 88887 2 revision=3
Before redo the command, you may need to delete the resize :

Repeat the operations listed in FileInfo, in the order they appear in the file.

Note: When you perform each file-level operation, you need to be in the database account that contains the VOC entry for the affected file. When you have completed file-level recovery, check file permissions in your database accounts to be sure users have correct access to the data.

5. Resume normal processing. Recovery should be complete.
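When FileInfo lists many operations, it can help to extract them in order before repeating them. The sketch below is an assumption-laden illustration: the function name pending_operations is hypothetical, and the exact line layout of FileInfo is not specified in this guide, so the parser accepts the content= field either on its own line or followed by a revision= field.

```python
import re

# Field names (account-path, account path, content, revision) come from the
# FileInfo and uvsm.log samples in this guide; the line layout is assumed.
ACCOUNT_RE = re.compile(r"account[- ]path[:=]\s*([^,\s]+)")
CONTENT_RE = re.compile(r"content=(.+?)(?:\s+revision=\d+)?\s*$")


def pending_operations(fileinfo_text: str) -> list:
    """Return (account_path, command) pairs in the order they appear."""
    operations = []
    account = None
    for line in fileinfo_text.splitlines():
        account_match = ACCOUNT_RE.search(line)
        if account_match:
            account = account_match.group(1)
        content_match = CONTENT_RE.search(line)
        if content_match:
            operations.append((account, content_match.group(1)))
    return operations
```

The returned order matches the file, which is the order in which the operations must be repeated. Each command still has to be run from the account that contains the VOC entry for the affected file.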

Chapter 5: Recovering from a media failure

Media failures can cause a file or files to become unreadable unless restored from a backup. For these failures, the archiving feature protects files by maintaining a complete record of updates since your last backup. You can restore your files from backup and then use crash recovery and the uv -admin -mediarec command to apply archives, bringing your database to the state it was in when the last archive was written.

Before you execute the uv -admin -mediarec command, it is important to understand the type of failure you had, and how that failure has affected your data, logs, and archives. While it is impossible to guarantee complete recovery in the case of a multiple-disk failure, the scenarios in the following sections describe the actions you should take to bring your data to as consistent a state as possible.

Note: It is important to keep your archives synchronized with your backups so that your current archive set starts as of the most recent backup. For more information, see Synchronizing backups with archive files, on page 23.
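The scenarios in this chapter differ mainly in what was lost and, as a result, how far recovery can reach. The following Python sketch collects the recovery points stated in the scenario sections of this chapter into a lookup table. The names Scenario and recovery_point are hypothetical, and the table is a reading aid, not a product feature.

```python
from typing import NamedTuple, Optional


class Scenario(NamedTuple):
    data_lost: bool
    logs_lost: bool
    archives_lost: bool


# Recovery points as stated in the scenario sections of this chapter.
RECOVERY_POINT = {
    Scenario(True, False, False): "last completed transaction",
    Scenario(False, True, False): "last archived checkpoint, possibly the last committed transaction",
    Scenario(False, False, True): "last completed checkpoint",
    Scenario(True, True, False): "last successfully archived checkpoint",
    Scenario(True, False, True): "last complete checkpoint on the last archive file backed up",
}


def recovery_point(data_lost: bool, logs_lost: bool, archives_lost: bool) -> Optional[str]:
    """Return the stated recovery point, or None if the chapter gives none."""
    return RECOVERY_POINT.get(Scenario(data_lost, logs_lost, archives_lost))
```

The chapter states no recovery point for the case where logs and archives are lost but data is unaffected, so the lookup returns None there. The loss of the disk containing uvhome is treated separately and is not modeled in this table.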

Data lost, logs and archives unaffected

The following steps assume that you have lost your data, but you have not lost the uvhome directory, logs, or archives. You should be able to recover to the last completed transaction.
1. Check and correct external problems. Identify and resolve hardware and software problems that are external to the database.
2. Check uvsm.log. Note any unusual conditions that may have contributed to the crash. Make a copy of uvsm.log in case you need to refer back to it.
3. Make a copy of the uvrfs.control.file. The database saves all of the archive sequence numbers in the uvrfs.control.file, which is located in the uvhome directory. Assuming the file is intact, make a copy of it to protect against accidentally overwriting it during your restore. You should have the current uvrfs.control.file to restore your database through automatic recovery. If the uvrfs.control.file is damaged or destroyed, you can still recover your database, but you will need to know the logical sequence numbers since the last backup and execute the uv -admin -mediarec command.
4. Start the database by executing the uv -admin -start command with no options. Make sure the database started successfully. Depending on the failure, uv -admin -start may have performed automatic crash recovery. By performing crash recovery, you will update the current set of archive files on disk with the latest changes to your database. This will allow the uv -admin -mediarec command to recover to the last completed transaction.
5. Stop the database by executing the uv -admin -stop command so that you can proceed with restoring the system.
6. Restore the system from the last full backup. Restoring your system re-creates the state your database was in when you created the backup, putting the data in a consistent state. Check to make sure you have the correct uvrfs.control.file. You want the one you saved in step 3. The database uses absolute paths in recovery. You need to restore the file system exactly as it was at backup. If you are using a different physical device, make sure you configure the file system so the absolute paths remain the same.


Note: Your ability to recover from a media failure depends on complete, verified full backups.

7. Check the mediaconfig file, located in the uvhome directory. This file contains pointers to the areas where the uv -admin -mediarec command creates working files. Check those areas and clear disk space. Make a note of how much space is available.
8. Execute the uv -admin -mediarec command.

Note: Be aware of the following:
▪ To execute the uv -admin -mediarec command, you must be logged in as root and the database must not be running.
▪ If you executed the uvcntl_install command after your last full backup, the uv -admin -mediarec command displays the number of the archive file to upload.
▪ If you paused the database by executing the SUSPEND.FILES ON or uv -admin -L command prior to performing your backup, you must execute uv -admin -mediarec with the -T option to provide the logical sequence number after the backup. You may also use the -s option to provide the checkpoint time after the forced checkpoint. The -T option is recommended.

When you execute the SUSPEND.FILES ON or uv -admin -L command, the database displays information on the terminal screen and writes it to the uvsm.log. The following example shows the output from the uv -admin -L command:

# uv -admin -L
CheckPoint time before ForceCP: Wed Nov 1 16:50:34 2017
.CheckPoint time after ForceCP: Wed Nov 1 16:52:09 2017
.CP has been forced successfully.
Forcearch completed, the next LSN is 10.

The last twenty uvsm.log files are appended in the uvsm.log file located in uvhome/saved_logs. If the command output is not in the uvsm.log file located in uvhome, check the saved uvsm.log file.
▪ If you used the automated archiving feature to save your archive files, use the uvarch_restore script, located in uvhome/bin, to restore them. The script will upload the files as they are needed. If that script fails, the screen will prompt you to load the archive files by logical sequence number.
▪ If you have automated archive backup turned on, and you had to manually back up one or more archive files, consider doing all the restores manually to be absolutely sure you do not restore a partial archive. The database detects if a partial archive is restored, and the uv -admin -mediarec command may fail.
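The logical sequence number and the checkpoint time needed for the -T and -s options both appear in the uv -admin -L output shown above. The following Python sketch pulls them out of saved output or of a uvsm.log. The function name parse_forced_checkpoint is hypothetical, and the patterns are based only on the sample output in this guide.

```python
import re
from datetime import datetime

# Patterns follow the sample uv -admin -L output shown in this guide.
LSN_RE = re.compile(r"the next LSN is (\d+)")
AFTER_RE = re.compile(
    r"CheckPoint time after ForceCP:\s*"
    r"(\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4})"
)


def parse_forced_checkpoint(output: str):
    """Return (next_lsn, checkpoint_time_after) from forced-checkpoint output."""
    lsn_match = LSN_RE.search(output)
    time_match = AFTER_RE.search(output)
    if not lsn_match or not time_match:
        raise ValueError("forced-checkpoint output not recognized")
    # Collapse repeated spaces so single-digit days parse consistently.
    stamp = " ".join(time_match.group(1).split())
    after = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return int(lsn_match.group(1)), after
```

For the sample output, the function returns the LSN 10 and the time 16:52:09 on November 1, 2017. How those values are passed to uv -admin -mediarec is described by the -T and -s options above; the sketch deliberately does not build a command line.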

The following example shows the first mediarec response:

# bin/uv -admin -mediarec -T0

For media recovery, you would be required to have space for two temporary files, one to hold the largest archive file and another to hold the largest CP size. Please more information, please read documentation about media recovery procedure and re-start media recovery.

Max CP Size (in bytes): 316416
Max Arch File Size (in bytes): 419430400

Also, if you're planning to use the tape(s) created by archive process, please setup restore script uvhome/bin/arch_restore properly (tape device) and load the first archive tape.

Do you want to continue (y/n) [n]?
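The two sizes in this prompt can be compared against the free space in the working areas before you answer y. The following Python sketch is one way to do that. The function names are hypothetical, and checking a single directory against the sum of both sizes assumes that both working files land in the same area, which depends on your mediaconfig file.

```python
import re
import shutil

# Labels follow the sample uv -admin -mediarec prompt shown in this guide.
SIZE_RE = re.compile(r"Max (CP Size|Arch File Size) \(in bytes\):\s*(\d+)")


def working_file_sizes(prompt_text: str):
    """Return (max_cp_size, max_arch_file_size) in bytes."""
    sizes = {label: int(value) for label, value in SIZE_RE.findall(prompt_text)}
    return sizes["CP Size"], sizes["Arch File Size"]


def has_room(directory: str, needed_bytes: int) -> bool:
    """True if the file system holding directory has enough free space."""
    return shutil.disk_usage(directory).free >= needed_bytes
```

For the sample prompt, the two sizes add up to 419746816 bytes, so has_room(area, 419746816) would be the check for a single shared working area, where area is a directory taken from your mediaconfig file.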

If you do not have enough space, enter n when you are asked if you want to continue. The uv -admin -mediarec command exits. Resolve the space problem and re-enter the command.

The uv -admin -mediarec command prompts you to load archive files by logical sequence number. The following example shows what the screen looks like when the recovery process is complete. In this example, no file-level recovery is needed:

****!!! Media Recovery Finished !!!****
SM stopped successfully.

The next example shows what the uvsm.log file looks like when file-level recovery is involved:

# cat uvhome/uvsm.log
media_file_recovery_repstyle() : op=CREATE.FILE, cwd=/home/jsmith/testarch, started:
Creating file "BB" as Type 3, Modulo 7, Separation 2.
Creating file "D_BB" as Type 3, Modulo 1, Separation 2.
Added "@ID", the default record for RetrieVe, to "D_BB".
media_file_recovery_repstyle() : op=CREATE.FILE, done ret=1......
media_file_recovery_repstyle() : op=CNAME, cwd=/home/jsmith/testarch, started:
Changed operating system file name from "BB" to "BBBB".
Changed operating system file name from "D_BB" to "D_BBBB".
Changed "BB" to "BBBB" in your VOC file.
media_file_recovery_repstyle() : op=CNAME, done ret=1.
redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=24 (CREATEINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=CREATE.INDEX TEST F1

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=25 (BUILDINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=BUILD.INDEX TEST F1

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=24 (CREATEINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=CREATE.INDEX TEST F2

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=25 (BUILDINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=BUILD.INDEX TEST F2

......
Total number of checkpoints : 2
Total redo log records : 1869
Update operations : 1850
Delete operations : 6
Create.file : 1
Cname.file : 1
Index related operations : 4
Unfinished operations : 4

Please check uvhome/FileInfo for un-recovered file level operations.

****!!! Media Recovery Finished !!!****
SM stopped successfully.

If your screen looks like the last example, copy or print the FileInfo file before proceeding. You need this file for step 11. The uv -admin -mediarec command attempts automatic recovery of all file-level operations except index operations.
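The summary counters at the end of the media recovery output indicate whether manual file-level recovery is still required. The following Python sketch reads them from uvsm.log text. It assumes each counter is written on its own line in the form "label : number", as the sample suggests, and the function names are hypothetical.

```python
import re

# Matches summary lines such as "Unfinished operations : 4".
# One counter per line is an assumption based on the sample output.
COUNTER_RE = re.compile(r"^\s*([A-Za-z][A-Za-z. ]*?)\s*:\s*(\d+)\s*$", re.MULTILINE)


def recovery_counters(log_text: str) -> dict:
    """Return the media recovery summary counters as a label-to-int mapping."""
    return {label: int(value) for label, value in COUNTER_RE.findall(log_text)}


def needs_manual_recovery(log_text: str) -> bool:
    """True if the summary reports unfinished operations."""
    return recovery_counters(log_text).get("Unfinished operations", 0) > 0
```

A True result corresponds to the second example above, where FileInfo must be copied or printed before you continue.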

After the uv -admin -mediarec command applies all of the archives that you restored from backup, it applies the current archive set on disk. Some time can elapse between the last prompt for you to restore a file and the successful completion of the command. This is normal.
9. Change to the directories noted in your mediaconfig file, and remove the last set of working files.
10. Start the database by using the uv -admin -start command with no options.
11. Manually complete file-level recovery. Use the FileInfo file you copied or printed in step 8 to complete file-level operations from the uv -admin -mediarec command. Complete the steps in order.

CREATE.INDEX TEST F1
BUILD.INDEX TEST F1
CREATE.INDEX TEST F2
BUILD.INDEX TEST F2

To complete each file-level operation, you need to be in the database account where the VOC entry for the affected file resides. Depending on how your backup utility works and how your restore was done, you may need to reset file permissions in your database accounts so that users have proper access to your data.
12. Stop the database with the uv -admin -stop command.
13. Before you start the database, Rocket highly recommends that you perform a full system backup.
14. Execute the uvcntl_install command to reinitialize logging and archiving.
15. If you have automated archive backup turned on and you are backing up to tape, mount a new tape on your archive backup device.
16. Start the database using uv -admin -start with no options. Users should be able to log on and access the data.

Data and archive files unaffected, logs lost

You should be able to recover to the last archived checkpoint; you may be able to recover to the last committed transaction.


The graceful shutdown design detects media failures and other error conditions involving the log files. The database flushes changed records in the system buffer to your database and stops the database. Complete the following steps for recovery.

1. Check and correct external problems. Identify and resolve hardware and software problems that are external to the database.
2. Check uvsm.log. Note any unusual conditions that may have contributed to the crash. Make a copy of uvsm.log in case you need to refer back to it.
3. Execute the uvcntl_install command to reinitialize logging and archiving.
4. Start the database using uv -admin -start with no options. Users should be able to log on and access the data.

Data and log files unaffected, archives lost

You should be able to recover to the last completed checkpoint. The graceful shutdown design detects media failures and other error conditions involving the archive files. The database flushes changed records in the system buffer to your database and stops the database. Complete the following steps for recovery.

1. Check and correct external problems. Identify and resolve hardware and software problems that are external to the database.
2. Check uvsm.log. Note any unusual conditions that may have contributed to the crash. Make a copy of uvsm.log in case you need to refer back to it.
3. Start the database by using the uv -admin -start command with no options.
4. Stop the database with the uv -admin -stop command.
5. Perform and verify a full backup of your system.
6. Execute the uvcntl_install command to reinitialize logging and archiving.
7. If you have automated archive backup turned on and you are backing up to tape, mount a new tape on your archive backup device.
8. Start the database using uv -admin -start with no options. Users should be able to log on and access the data.

Data and logs lost, archives unaffected

In this situation, you can only recover to the last successfully archived checkpoint.
1. Check and correct external problems. Identify and resolve hardware and software problems that are external to the database.
2. Check uvsm.log. Note any unusual conditions that may have contributed to the crash. Make a copy of uvsm.log in case you need to refer back to it.
3. Make a copy of the uvrfs.control.file. The database saves all of the archive sequence numbers in the uvrfs.control.file, which is located in the uvhome directory. Assuming the file is intact, make a copy of it to protect against accidentally overwriting it during your restore. You should have the current uvrfs.control.file to restore your database through automatic recovery. If the uvrfs.control.file is damaged or destroyed, you can still recover your database, but you will need to know the logical sequence numbers since the last backup and execute the uv -admin -mediarec command.
4. Restore the system from the last full backup. Restoring your system re-creates the state your database was in when you created the backup, putting the data in a consistent state. Check to make sure you have the correct uvrfs.control.file. You want the one you saved in step 3. The database uses absolute paths in recovery. You need to restore the file system exactly as it was at backup. If you are using a different physical device, make sure you configure the file system so the absolute paths remain the same.

Note: Your ability to recover from a media failure depends on complete, verified full backups.

5. Check the mediaconfig file, located in the uvhome directory. This file contains pointers to the areas where the uv -admin -mediarec command creates working files. Check those areas and clear disk space. Make a note of how much space is available.
6. Execute the uv -admin -mediarec command.

Note: Be aware of the following:
▪ To execute the uv -admin -mediarec command, you must be logged in as root and the database must not be running.
▪ If you executed the uvcntl_install command after your last full backup, the uv -admin -mediarec command displays the number of the archive file to upload.
▪ If you paused the database by executing the SUSPEND.FILES ON or uv -admin -L command prior to performing your backup, you must execute uv -admin -mediarec with the -T option to provide the logical sequence number after the backup. You may also use the -s option to provide the checkpoint time after the forced checkpoint. The -T option is recommended.

When you execute the SUSPEND.FILES ON or uv -admin -L command, the database displays information on the terminal screen and writes it to the uvsm.log. The following example shows the output from the uv -admin -L command:

# uv -admin -L
CheckPoint time before ForceCP: Wed Nov 1 16:50:34 2017
.CheckPoint time after ForceCP: Wed Nov 1 16:52:09 2017
.CP has been forced successfully.
Forcearch completed, the next LSN is 10.

The last twenty uvsm.log files are appended in the uvsm.log file located in uvhome/saved_logs. If the command output is not in the uvsm.log file located in uvhome, check the saved uvsm.log file.
▪ If you used the automated archiving feature to save your archive files, use the uvarch_restore script, located in uvhome/bin, to restore them. The script will upload the files as they are needed. If that script fails, the screen will prompt you to load the archive files by logical sequence number.
▪ If you have automated archive backup turned on, and you had to manually back up one or more archive files, consider doing all the restores manually to be absolutely sure you do not restore a partial archive. The database detects if a partial archive is restored, and the uv -admin -mediarec command may fail.

The following example shows the first mediarec response:

# bin/uv -admin -mediarec -T0

For media recovery, you would be required to have space for two temporary files, one to hold the largest archive file and another to hold the largest CP size. Please more information, please read documentation about media recovery procedure and re-start media recovery.

Max CP Size (in bytes): 316416
Max Arch File Size (in bytes): 419430400

Also, if you're planning to use the tape(s) created by archive process, please setup restore script uvhome/bin/arch_restore properly (tape device) and load the first archive tape.

Do you want to continue (y/n) [n]?

If you do not have enough space, enter n when you are asked if you want to continue. The uv -admin -mediarec command exits. Resolve the space problem and re-enter the command.

The uv -admin -mediarec command prompts you to load archive files by logical sequence number. The following example shows what the screen looks like when the recovery process is complete. In this example, no file-level recovery is needed:

****!!! Media Recovery Finished !!!****
SM stopped successfully.

The next example shows what the uvsm.log file looks like when file-level recovery is involved:

# cat uvhome/uvsm.log
media_file_recovery_repstyle() : op=CREATE.FILE, cwd=/home/jsmith/testarch, started:
Creating file "BB" as Type 3, Modulo 7, Separation 2.
Creating file "D_BB" as Type 3, Modulo 1, Separation 2.
Added "@ID", the default record for RetrieVe, to "D_BB".
media_file_recovery_repstyle() : op=CREATE.FILE, done ret=1......
media_file_recovery_repstyle() : op=CNAME, cwd=/home/jsmith/testarch, started:
Changed operating system file name from "BB" to "BBBB".
Changed operating system file name from "D_BB" to "D_BBBB".
Changed "BB" to "BBBB" in your VOC file.
media_file_recovery_repstyle() : op=CNAME, done ret=1.
redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=24 (CREATEINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=CREATE.INDEX TEST F1

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=25 (BUILDINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=BUILD.INDEX TEST F1

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=24 (CREATEINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=CREATE.INDEX TEST F2

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=25 (BUILDINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=BUILD.INDEX TEST F2

......
Total number of checkpoints : 2
Total redo log records : 1869
Update operations : 1850
Delete operations : 6
Create.file : 1
Cname.file : 1
Index related operations : 4
Unfinished operations : 4

Please check uvhome/FileInfo for un-recovered file level operations.

****!!! Media Recovery Finished !!!****
SM stopped successfully.

If your screen looks like the last example, copy or print the FileInfo file before proceeding. You need this file for step 9. The uv -admin -mediarec command attempts automatic recovery of all file-level operations except index operations.

After the uv -admin -mediarec command applies all of the archives that you restored from backup, it applies the current archive set on disk. Some time can elapse between the last prompt for you to restore a file and the successful completion of the command. This is normal.
7. Change to the directories noted in your mediaconfig file, and remove the last set of working files.
8. Start the database by using the uv -admin -start command with no options.
9. Manually complete file-level recovery. Use the FileInfo file you copied or printed in step 6 to complete file-level operations from the uv -admin -mediarec command. Complete the steps in order.

CREATE.INDEX TEST F1
BUILD.INDEX TEST F1
CREATE.INDEX TEST F2
BUILD.INDEX TEST F2

To complete each file-level operation, you need to be in the database account where the VOC entry for the affected file resides. Depending on how your backup utility works and how your restore was done, you may need to reset file permissions in your database accounts so that users have proper access to your data.
10. Stop the database with the uv -admin -stop command.
11. Before you start the database, Rocket highly recommends that you perform a full system backup.
12. Execute the uvcntl_install command to reinitialize logging and archiving.
13. If you have automated archive backup turned on and you are backing up to tape, mount a new tape on your archive backup device.
14. Start the database using uv -admin -start with no options. Users should be able to log on and access the data.


Data and archives lost, logs unaffected

Because the current set of log files is of no use, you can only recover to the last complete checkpoint on the last archive file you backed up.
1. Check and correct external problems. Identify and resolve hardware and software problems that are external to the database.
2. Check uvsm.log. Note any unusual conditions that may have contributed to the crash. Make a copy of uvsm.log in case you need to refer back to it.
3. Restore the system from the last full backup. Restoring your system re-creates the state your database was in when you created the backup, putting the data in a consistent state. Check to make sure you have the correct uvrfs.control.file. If you made a copy of the file before the restore, use that copy. The database uses absolute paths in recovery. You need to restore the file system exactly as it was at backup. If you are using a different physical device, make sure you configure the file system so the absolute paths remain the same.

Note: Your ability to recover from a media failure depends on complete, verified full backups.

4. Check the mediaconfig file, located in the uvhome directory. This file contains pointers to the areas where the uv -admin -mediarec command creates working files. Check those areas and clear disk space. Make a note of how much space is available.
5. Execute the uv -admin -mediarec command.


Note: Be aware of the following:
▪ To execute the uv -admin -mediarec command, you must be logged in as root and the database must not be running.
▪ If you executed the uvcntl_install command after your last full backup, the uv -admin -mediarec command displays the number of the archive file to upload.
▪ If you paused the database by executing the SUSPEND.FILES ON or uv -admin -L command prior to performing your backup, you must execute uv -admin -mediarec with the -T option to provide the logical sequence number after the backup. You may also use the -s option to provide the checkpoint time after the forced checkpoint. The -T option is recommended.

When you execute the SUSPEND.FILES ON or uv -admin -L command, the database displays information on the terminal screen and writes it to the uvsm.log. The following example shows the output from the uv -admin -L command:

# uv -admin -L
CheckPoint time before ForceCP: Wed Nov 1 16:50:34 2017
.CheckPoint time after ForceCP: Wed Nov 1 16:52:09 2017
.CP has been forced successfully.
Forcearch completed, the next LSN is 10.

The last twenty uvsm.log files are appended in the uvsm.log file located in uvhome/saved_logs. If the command output is not in the uvsm.log file located in uvhome, check the saved uvsm.log file.
▪ If you used the automated archiving feature to save your archive files, use the uvarch_restore script, located in uvhome/bin, to restore them. The script will upload the files as they are needed. If that script fails, the screen will prompt you to load the archive files by logical sequence number.
▪ If you have automated archive backup turned on, and you had to manually back up one or more archive files, consider doing all the restores manually to be absolutely sure you do not restore a partial archive. The database detects if a partial archive is restored, and the uv -admin -mediarec command may fail.

The following example shows the first mediarec response:

# bin/uv -admin -mediarec -T0

For media recovery, you would be required to have space for two temporary files, one to hold the largest archive file and another to hold the largest CP size. Please more information, please read documentation about media recovery procedure and re-start media recovery.

Max CP Size (in bytes): 316416
Max Arch File Size (in bytes): 419430400

Also, if you're planning to use the tape(s) created by archive process, please setup restore script uvhome/bin/arch_restore properly (tape device) and load the first archive tape.

Do you want to continue (y/n) [n]?

If you do not have enough space, enter n when you are asked if you want to continue. The uv -admin -mediarec command exits. Resolve the space problem and re-enter the command.


The uv -admin -mediarec command prompts you to load archive files by logical sequence number. The following example shows what the screen looks like when the recovery process is complete. In this example, no file-level recovery is needed:

****!!! Media Recovery Finished !!!****
SM stopped successfully.

The next example shows what the uvsm.log file looks like when file-level recovery is involved:

# cat uvhome/uvsm.log
media_file_recovery_repstyle() : op=CREATE.FILE, cwd=/home/jsmith/testarch, started:
Creating file "BB" as Type 3, Modulo 7, Separation 2.
Creating file "D_BB" as Type 3, Modulo 1, Separation 2.
Added "@ID", the default record for RetrieVe, to "D_BB".
media_file_recovery_repstyle() : op=CREATE.FILE, done ret=1......
media_file_recovery_repstyle() : op=CNAME, cwd=/home/jsmith/testarch, started:
Changed operating system file name from "BB" to "BBBB".
Changed operating system file name from "D_BB" to "D_BBBB".
Changed "BB" to "BBBB" in your VOC file.
media_file_recovery_repstyle() : op=CNAME, done ret=1.
redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=24 (CREATEINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=CREATE.INDEX TEST F1

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=25 (BUILDINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=BUILD.INDEX TEST F1

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=24 (CREATEINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=CREATE.INDEX TEST F2

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch
code=25 (BUILDINDEX) dictflg=0, directfname=0.
owner=jsmith key=TEST
content=BUILD.INDEX TEST F2

......
Total number of checkpoints : 2
Total redo log records : 1869
Update operations : 1850
Delete operations : 6
Create.file : 1
Cname.file : 1
Index related operations : 4
Unfinished operations : 4

Please check uvhome/FileInfo for un-recovered file level operations.

****!!! Media Recovery Finished !!!****
SM stopped successfully.

If your screen looks like the last example, copy or print the FileInfo file before proceeding. You need this file for step 8. The uv -admin -mediarec command attempts automatic recovery of all file-level operations except index operations.

After the uv -admin -mediarec command applies all of the archives that you restored from backup, it applies the current archive set on disk. Some time can elapse between the last prompt for you to restore a file and the successful completion of the command. This is normal.
6. Change to the directories noted in your mediaconfig file, and remove the last set of working files.
7. Start the database by using the uv -admin -start command with no options.
8. Manually complete file-level recovery. Use the FileInfo file you copied or printed in step 5 to complete file-level operations from the uv -admin -mediarec command. Complete the steps in order.

CREATE.INDEX TEST F1
BUILD.INDEX TEST F1
CREATE.INDEX TEST F2
BUILD.INDEX TEST F2

To complete each file-level operation, you need to be in the database account where the VOC entry for the affected file resides. Depending on how your backup utility works and how your restore was done, you may need to reset file permissions in your database accounts so that users have proper access to your data.
9. Stop the database with the uv -admin -stop command.
10. Perform a full system backup before you start the database. This step is required.
11. Execute the uvcntl_install command to reinitialize logging and archiving.
12. If you have automated archive backup turned on and you are backing up to tape, mount a new tape on your archive backup device.
13. Start the database using uv -admin -start with no options. Users should be able to log on and access the data.

Logs and archives lost, data unaffected

The graceful shutdown design detects media failures and other error conditions involving the log files and archive files. The database flushes changed records in the system buffer to your database and stops the database. Complete the following steps for recovery.

1. Execute the uvcntl_install command to reinitialize logging and archiving.
2. Run the fixtool utility in each data directory to ensure that your data files are not corrupted.
3. If the fixtool utility detected errors, execute the uvfixfile command to repair the affected files. For information about fixtool and uvfixfile, see the UniVerse User Reference.
4. Perform a full system backup before you start the database. This step is required.
5. If you have automated archive backup turned on and you are backing up to tape, mount a new tape on your archive backup device.
6. Start the database using uv -admin -start with no options. Users should be able to log on and access the data.

Disk containing uvhome directory lost, archives unaffected

You should be able to recover to the last archived checkpoint.
1. Restore the uvhome directory from the last backup tape.
2. Ask users to exit the database, then stop the database with the uv -admin -stop command.
3. Execute the uvcntl_install command to re-create the system.status, restart.fileend, and restart.newblk files, and to reinitialize the log and archive files.
4. Restore the system from the last full backup. Restoring your system re-creates the state your database was in when you created the backup, putting the data in a consistent state. Check to make sure you have the correct uvrfs.control.file. You want the one you saved in step 3. The database uses absolute paths in recovery. You need to restore the file system exactly as it was at backup. If you are using a different physical device, make sure you configure the file system so the absolute paths remain the same.

Note: Your ability to recover from a media failure depends on complete, verified full backups.

5. Check the mediaconfig file, located in the uvhome directory. This file contains pointers to the areas where the uv -admin -mediarec command creates working files. Check those areas and clear disk space. Make a note of how much space is available.
6. Execute the uv -admin -mediarec command.

Note: Be aware of the following:
▪ To execute the uv -admin -mediarec command, you must be logged in as root and the database must not be running.
▪ If you executed the uvcntl_install command after your last full backup, the uv -admin -mediarec command displays the number of the archive file to upload.
▪ If you paused the database by executing the SUSPEND.FILES ON or uv -admin -L command prior to performing your backup, you must execute uv -admin -mediarec with the -T option to provide the logical sequence number after the backup. You may also use the -s option to provide the checkpoint time after the forced checkpoint. The -T option is recommended.

When you execute the SUSPEND.FILES ON or uv -admin -L command, the database displays information on the terminal screen and writes it to the uvsm.log. The following example shows the output from the uv -admin -L command:

# uv -admin -L
CheckPoint time before ForceCP: Wed Nov 1 16:50:34 2017
.CheckPoint time after ForceCP: Wed Nov 1 16:52:09 2017
.CP has been forced successfully.
Forcearch completed, the next LSN is 10.

The last twenty uvsm.log files are appended in the uvsm.log file located in uvhome/saved_logs. If the command output is not in the uvsm.log file located in uvhome, check the saved uvsm.log file.
▪ If you used the automated archiving feature to save your archive files, use the uvarch_restore script, located in uvhome/bin, to restore them. The script will upload the files as they are needed. If that script fails, the screen will prompt you to load the archive files by logical sequence number.
▪ If you have automated archive backup turned on, and you had to manually back up one or more archive files, consider doing all the restores manually to be absolutely sure you do not restore a partial archive. The database detects if a partial archive is restored, and the uv -admin -mediarec command may fail.
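The values needed for the -T and -s options appear in the uv -admin -L output shown above. The following Python sketch pulls them out of captured output or a uvsm.log excerpt. It is an illustration only: the function names are hypothetical, and it assumes that the "next LSN" value reported by the command is the number to pass to -T, written in the attached form (for example -T0) used in the manual's example.

```python
import re

# Timestamp layout as it appears in the sample output, e.g. "Wed Nov 1 16:52:09 2017".
_TIME = r"\w{3}\s+\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s+\d{4}"


def parse_force_checkpoint(output):
    """Return the checkpoint time after the forced checkpoint and the next LSN.

    `output` is text captured from `uv -admin -L` or copied from uvsm.log.
    Raises ValueError if either value cannot be found.
    """
    after = re.search(r"CheckPoint time after ForceCP:\s*(" + _TIME + ")", output)
    lsn = re.search(r"the next LSN is (\d+)", output)
    if after is None or lsn is None:
        raise ValueError("forced-checkpoint details not found in output")
    return {"checkpoint_after": after.group(1), "next_lsn": int(lsn.group(1))}


def mediarec_command(next_lsn):
    """Build the argument list for media recovery.

    Assumption: the reported next LSN is the value for -T.
    The command is only built here, never executed.
    """
    return ["uv", "-admin", "-mediarec", "-T%d" % next_lsn]


if __name__ == "__main__":
    sample = (
        "CheckPoint time before ForceCP: Wed Nov 1 16:50:34 2017\n"
        ".CheckPoint time after ForceCP: Wed Nov 1 16:52:09 2017\n"
        ".CP has been forced successfully.\n"
        "Forcearch completed, the next LSN is 10.\n"
    )
    info = parse_force_checkpoint(sample)
    print(info)
    print(" ".join(mediarec_command(info["next_lsn"])))
```

Recording these values at backup time, rather than searching the saved logs during a recovery, keeps the recovery procedure shorter.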

The following example shows the first mediarec response:

# bin/uv -admin -mediarec -T0

For media recovery, you would be required to have space for two temporary files, one to hold the largest archive file and another to hold the largest CP size. Please more information, please read documentation about media recovery procedure and re-start media recovery.

Max CP Size (in bytes): 316416
Max Arch File Size (in bytes): 419430400

Also, if you're planning to use the tape(s) created by archive process, please setup restore script uvhome/bin/arch_restore properly (tape device) and load the first archive tape.

Do you want to continue (y/n) [n]?

If you do not have enough space, enter n when you are asked if you want to continue. The uv -admin -mediarec command exits. Resolve the space problem and re-enter the command.
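The space check can be done before answering the prompt. The sketch below is a hedged illustration: it reads the two sizes from the banner text and compares their sum with the free space in one working directory. The function names are hypothetical, and the sketch assumes both temporary files go to the same file system. If your mediaconfig file points to separate areas, check each area against the corresponding size instead.

```python
import re
import shutil


def required_bytes(banner):
    """Sum the largest checkpoint size and largest archive file size in the banner."""
    cp = re.search(r"Max CP Size \(in bytes\):\s*(\d+)", banner)
    arch = re.search(r"Max Arch File Size \(in bytes\):\s*(\d+)", banner)
    if cp is None or arch is None:
        raise ValueError("size lines not found in mediarec banner")
    return int(cp.group(1)) + int(arch.group(1))


def has_enough_space(work_dir, banner):
    """True if work_dir has at least the required free space.

    Assumption: both temporary files are created under work_dir.
    """
    return shutil.disk_usage(work_dir).free >= required_bytes(banner)


if __name__ == "__main__":
    banner = (
        "Max CP Size (in bytes): 316416\n"
        "Max Arch File Size (in bytes): 419430400\n"
    )
    print(required_bytes(banner), "bytes needed")
    print("enough space in current directory:", has_enough_space(".", banner))
```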


The uv -admin -mediarec command prompts you to load archive files by logical sequence number. The following example shows what the screen looks like when the recovery process is complete. In this example, no file-level recovery is needed:

****!!! Media Recovery Finished !!!****
SM stopped successfully.

The next example shows what the uvsm.log file looks like when file-level recovery is involved:

# cat uvhome/uvsm.log
media_file_recovery_repstyle() : op=CREATE.FILE, cwd=/home/jsmith/testarch, started:
Creating file "BB" as Type 3, Modulo 7, Separation 2.
Creating file "D_BB" as Type 3, Modulo 1, Separation 2.
Added "@ID", the default record for RetrieVe, to "D_BB".
media_file_recovery_repstyle() : op=CREATE.FILE, done ret=1......
media_file_recovery_repstyle() : op=CNAME, cwd=/home/jsmith/testarch, started:
Changed operating system file name from "BB" to "BBBB".
Changed operating system file name from "D_BB" to "D_BBBB".
Changed "BB" to "BBBB" in your VOC file.
media_file_recovery_repstyle() : op=CNAME, done ret=1.

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch code=24 (CREATEINDEX) dictflg=0, directfname=0. owner=jsmith key=TEST content=CREATE.INDEX TEST F1

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch code=25 (BUILDINDEX) dictflg=0, directfname=0. owner=jsmith key=TEST content=BUILD.INDEX TEST F1

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch code=24 (CREATEINDEX) dictflg=0, directfname=0. owner=jsmith key=TEST content=CREATE.INDEX TEST F2

redo_record(): Warning: account level operation is ignored, display only.
account path=/home/jsmith/testarch code=25 (BUILDINDEX) dictflg=0, directfname=0. owner=jsmith key=TEST content=BUILD.INDEX TEST F2

......
Total number of checkpoints : 2
Total redo log records : 1869
Update operations : 1850
Delete operations : 6
Create.file : 1
Cname.file : 1
Index related operations : 4
Unfinished operations : 4

Please check uvhome/FileInfo for un-recovered file level operations.

****!!! Media Recovery Finished !!!****
SM stopped successfully.

If your screen looks like the last example, copy or print the FileInfo file before proceeding. You need this file when you manually complete file-level recovery in step 9. The uv -admin -mediarec command attempts automatic recovery of all file-level operations except index operations.

After the uv -admin -mediarec command applies all of the archives that you restored from backup, it applies the current archive set on disk. Some time can elapse between the last prompt for you to restore a file and the successful completion of the command. This is normal.
7. Change to the directories noted in your mediaconfig file, and remove the last set of working files.
8. Start the database by using the uv -admin -start command with no options.
9. Manually complete file-level recovery. Use the FileInfo file you copied or printed in step 6 to complete the file-level operations that the uv -admin -mediarec command did not recover. Complete the operations in the order listed, for example:

CREATE.INDEX TEST F1
BUILD.INDEX TEST F1
CREATE.INDEX TEST F2
BUILD.INDEX TEST F2

To complete each file-level operation, you need to be in the database account where the VOC entry for the affected file resides. Depending on how your backup utility works and how your restore was done, you may need to reset file permissions in your database accounts so that users have proper access to your data.
10. Stop the database with the uv -admin -stop command.
11. Before you start the database, Rocket highly recommends that you perform a full system backup.
12. Execute the uvcntl_install command to reinitialize logging and archiving.
13. If you have automated archive backup turned on and you are backing up to tape, mount a new tape on your archive backup device.
14. Start the database using uv -admin -start with no options. Users should be able to log on and access the data.

Chapter 6: Monitoring and tuning RFS

The configuration and use of RFS depends on your hardware and software environments, and the type of software you are running. RFS includes a utility called uvsysmon that lets you monitor the activity of the RFS system and determine how your system could be tuned effectively for performance.

The uvsysmon utility

The uvsysmon utility monitors the performance of RFS. Although uvsysmon makes no direct recommendations about tuning your system, you can use the uvsysmon utility display to help you make decisions about the database configuration parameters.

To use uvsysmon, enter the command at the UNIX prompt.

Syntax

uvsysmon [-b | -m] [-o filename] [-t nn] [-s screens]

Parameters

The following table describes each parameter of the syntax.

Parameter     Action
-b            Displays detailed information about the Block Index table (BIG) in shared memory. You cannot use the -m parameter with the -b parameter.
-m            Displays detailed information about user requests. You cannot use the -b parameter with the -m parameter.
-o filename   Directs uvsysmon output to filename.
-t nn         Re-displays the data every nn seconds.
-s screens    Specifies how many screens to display before exiting.

By default, uvsysmon re-displays every three seconds. You can specify a different sampling interval with the -t option.
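If you start uvsysmon from a script, for example to collect a fixed number of samples into a file, the syntax rules above can be enforced before the command runs. The following Python sketch only builds the argument list. The function name and its keyword arguments are hypothetical, and the output path in the example is a placeholder.

```python
def uvsysmon_args(big=False, requests=False, output=None, interval=None, screens=None):
    """Build a uvsysmon argument list from the documented options.

    big      -> -b (BIG details)
    requests -> -m (user request details); cannot be combined with -b
    output   -> -o filename
    interval -> -t nn (seconds between re-displays)
    screens  -> -s screens (number of screens before exiting)
    """
    if big and requests:
        raise ValueError("-b and -m cannot be used together")
    args = ["uvsysmon"]
    if big:
        args.append("-b")
    if requests:
        args.append("-m")
    if output is not None:
        args += ["-o", str(output)]
    if interval is not None:
        args += ["-t", str(interval)]
    if screens is not None:
        args += ["-s", str(screens)]
    return args


if __name__ == "__main__":
    # Ten screens, five seconds apart, written to a placeholder file.
    print(" ".join(uvsysmon_args(output="/tmp/uvsysmon.out", interval=5, screens=10)))
```

The resulting list can be passed to subprocess.run on a system where uvsysmon is on the PATH.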

Example

The following example shows the uvsysmon display. The fields are described in the sections following the example.

====== BLOCK INDEX GROUP (BIG) STATISTICS ====== Sun Feb 12 16:03:39 2017
PinRead    :238     TmRead :4     Dirty:20     Hits   :2868
PinWrite   :2634    TmWrite:0     Neat :25     HitRate:99.66%
PinWaitQ   :0       CmRead :0     Total:45
PinWaitRate:0.00%   CmWrite:0

====== LATCHING STATISTICS ======                        === TM STATUS ===
Type----WaitQ---Latches-WaitRate-PollCall-PollRate       Tm# :2   Req#:576
Big :   0       5606    0.00%    0        0.00%          ActTm:1
Aft :   0       10      0.00%    0        0.00%
Aimg:   0       544     0.00%    0        0.00%          === SHM INFO ====
Bimg:   0       20      0.00%    0        0.00%          ShmPV:1  Total:21

====== LOG FILE STATISTICS ======
TmBimgFlush:0   Wait00.2   LogCkSuccess:4   BimgRawBlks:0
TmAimgFlush:0   Wait00.0   LogCkFail   :2   AimgRawBlks:2
CmBimgFlush:0   WaitQ2:0   LogOvrflos  :0   TotRaw     :2
CmAimgFlush:0   WaitQ3:2   LogSwitchd  :8
LogID-Total_Length
0     1     0       ====== RECORD INFO ======        === TRANS INFO ===
1     1     0       RecRead  : 24    AvgRead : 15    Committed: 1
2     1     0       RecWrite : 535   AvgWrite: 17    Aborted  : 0
3     1     0       RecDelete: 0
TotLength:0

uvsysmon fields and values

The following tables describe the fields on the uvsysmon display. In many cases, you can use these fields to help you determine the best settings for tunable parameters in the uvconfig file. RFS configuration parameters, on page 72 describes parameters that are specific to RFS. For a full list of parameters, see Administering UniVerse on Windows and UNIX Platforms.

BIG statistics section

This table describes the fields in the Block Index Group (BIG) Statistics section of the uvsysmon screen:

Field name    Description
PinRead       The number of database blocks locked for reading during the sampling interval.
PinWrite      The number of database blocks locked for writing during the sampling interval.
PinWaitQ      The number of blocks waiting to be locked for writing or reading during the sampling interval.
PinWaitRate   A calculation: PinWaitQ / (PinWrite + PinRead).
TmRead        The number of blocks read from disk to system buffer by all uvsh/user processes during the sampling interval.
TmWrite       The number of blocks written to disk by all uvsh/user processes during the sampling interval.
CmRead        The number of blocks read from disk into system buffer by the uvcm/uvsyncd daemon during the sampling interval.
CmWrite       The number of blocks written to disk by the uvcm/uvsyncd daemon during the sampling interval.
Dirty         The blocks of data that have been written to in the system buffer, but that have yet to be written to disk.
Neat          The unchanged blocks of data in the system buffer.
Total         A calculation: Dirty + Neat.
Hits          The number of blocks found in the system buffer (or cache) when read and/or written during the sampling interval.
HitRate       A calculation: Hits / (PinWrite + PinRead).
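The two calculated fields can be reproduced from the raw counters, which is useful when you post-process uvsysmon output saved with -o. The sketch below applies the two formulas from the table and expresses the results as percentages, as the display does. The function names are hypothetical, and returning 0.0 when no blocks were pinned is a choice made for this sketch, not documented behavior of the utility.

```python
def pin_wait_rate(pin_wait_q, pin_read, pin_write):
    """PinWaitQ / (PinWrite + PinRead), as a percentage."""
    pinned = pin_read + pin_write
    if pinned == 0:
        return 0.0  # sketch convention: no pins in the interval
    return 100.0 * pin_wait_q / pinned


def hit_rate(hits, pin_read, pin_write):
    """Hits / (PinWrite + PinRead), as a percentage."""
    pinned = pin_read + pin_write
    if pinned == 0:
        return 0.0  # sketch convention: no pins in the interval
    return 100.0 * hits / pinned


if __name__ == "__main__":
    # Hypothetical counters for one sampling interval.
    print("PinWaitRate: %.2f%%" % pin_wait_rate(3, 400, 600))
    print("HitRate:     %.2f%%" % hit_rate(950, 400, 600))
```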


Latching statistics section

This table describes the fields in the Latching Statistics section of the uvsysmon screen. For each of the types shown in the table, the Latching Statistics section shows the wait queue (WaitQ), the number of latches (Latches), the waiting rate (WaitRate), the number of poll calls (PollCall), and the polling rate (PollRate).

Field name   Description
Big          How many times the system accesses the BIG in the system buffer. If the WaitRate is higher than 5 percent or the PollRate is higher than 2 percent, increase the number of block index groups by altering the N_BIG parameter.
Aft          How many times the active file table in the system buffer is locked or unlocked. This should remain at zero approximately 90 percent of the time. If it does not, increase N_AFT_SECTION.
Aimg         How many times the database locks or unlocks the after-image log buffer. If the WaitRate is higher than 5 percent or the PollRate is higher than 2 percent, increase the number of after-image log files in the log configuration table. You can also adjust the minimum number of blocks needed to flush the after-image buffer to the after-image log file by changing the AIMG_MIN_BLKS parameter.
             If the WaitRate or PollRate for Aimg is low, reduce the number of after-image logs, or add a disk and distribute the log files between disks to improve system performance.
             If the GRPCMT_TIME parameter is zero, there is no group commit in effect. Each write operation waits until the corresponding after-image record is written to disk, which hampers system performance.
Bimg         How many times the system locks or unlocks the before-image log buffer. If the WaitRate is higher than 5 percent or the PollRate is higher than 2 percent, increase the number of before-image log files in the log configuration table. You can also adjust the minimum number of blocks needed to flush the before-image buffer to the before-image log file by changing the BIMG_MIN_BLKS parameter.

Tip: If the WaitRate or PollRate for Bimg is low, reduce the number of before-image logs, or add a disk and distribute the log files between disks to improve system performance.

tm status section

This table describes the fields in the tm status section of the uvsysmon screen:

Field name   Description
Tm#          The number of uvsh/user processes present in the system.
Req#         The number of requests the database sent to the uvsh/user processes during the sampling interval.
ActTm        The number of uvsh/user processes active in the system during the sampling interval.

SHM info section

This table describes the fields in the SHM Info section of the uvsysmon screen:

Field name   Description
ShmPV        Number of system semaphore locks requesting shared memory during the sampling interval.
Total        Number of system semaphore locks requesting shared memory accumulated since uvsysmon started.

Log file statistics section

This table describes the fields in the Log File Statistics section of the uvsysmon screen:

Field name     Description
TmBimgFlush    The number of times that uvsh/user processes flush to before-image logs during the sampling interval.
TmAimgFlush    The number of times that uvsh/user processes flush to after-image logs during the sampling interval.
CmBimgFlush    The number of times that uvcm/uvsyncd daemons flush to before-image logs during the sampling interval.
CmAimgFlush    The number of times that uvcm/uvsyncd daemons flush to after-image logs during the sampling interval.
WaitQ0         The number of uvsh/user processes waiting in the queue during the sampling interval.
WaitQ1         The number of uvsh/user processes waiting in the queue during the sampling interval.
WaitQ2         The number of uvsh/user processes waiting in the queue during the sampling interval.
WaitQ3         The number of uvsh/user processes waiting in the queue during the sampling interval.
LogCkSuccess   The number of log files that passed checking during the sampling interval.
LogCkFail      The number of log files that failed checking during the sampling interval.
LogOvrflos     The number of log file overflow events accumulated since the system started. A log overflow event means a log file has reached 80% full.
LogSwitchd     The number of log file switching events accumulated since the system started (number of checkpoints).
BimgRawBlks    The number of blocks written by the before-image log process/daemon during the sampling interval.
AimgRawBlks    The number of blocks written by the uvaimglog daemon/processes during the sampling interval.
TotRaw         The sum of AimgRawBlks and BimgRawBlks values.

Lower portion of log file statistics section

The lower portion of log file statistics section at the lower left corner of the uvsysmon screen provides information about individual log files. These are displayed by uvsysmon in groups of four in the current active log set. So, if you have eight log files, the system displays them in groups of four starting with the first log file (0, 1, 2, 3). Then, depending on when you have set your checkpoint, the system switches to the second set of four log files (4, 5, 6, 7).


Note: This scenario is based on a simplistic system. In a more complex case, if you have two sets, one with four after-image logs and three before-image logs, you will see only the first four in the current set. This is due to the way the log configuration reads the files. Therefore, you may view the four after-image log files for the first set and then the four after-image log files in the second set. You will not be able to view the before-image log files in this case.

These log files have characteristics described in the following table:

Column heading   Description
LogID            The identifying label for the log file.
Total            The number of blocks written to the log file since the last checkpoint interval. You set the checkpoint interval with the CHKPNT_TIME parameter. See RFS configuration parameters, on page 72.
Length           The number of blocks written to the log file during the sampling interval.

Record info section

This table describes the fields in the Record Info section of the uvsysmon screen:

Field name   Description
RecRead      Number of records read during sampling interval.
RecWrite     Number of records written during sampling interval.
RecDelete    Number of records deleted during sampling interval.
AvgRead      Average number of bytes per record read during sampling interval.
AvgWrite     Average number of bytes per record written during sampling interval.

Trans info section

This table describes the fields in the Trans Info (transaction information) section of the uvsysmon screen:

Field name   Description
Committed    Number of transactions committed to disk during sampling interval.
Aborted      Number of transactions aborted during sampling interval.

Performance tips

The following information can help you tune your system for performance. The more frequently you write to disk, the slower the performance. You need to tailor RFS for the type of business you have. For example, accounting applications might need more frequent writes than real estate applications. View your system using uvsysmon, then use the following guidelines:
▪ If possible, use raw disk files as the after-image and before-image log files. Rocket recommends that you store these files on a separate physical device from your data. Ideally, each individual after-image or before-image log should be placed on its own disk to reduce overhead.
▪ Look at the HitRate of the system buffer. If it is less than 90 percent, increase the number of system buffer pages by increasing the N_PUT parameter in the uvconfig file.
▪ Look at the latching-related information, which shows contention in the various parts of the latching operation.
  ▫ If the WaitRate on Big is higher than 5 percent, or the PollRate on Big is higher than 2 percent, increase the number of block index groups by increasing the N_BIG parameter. Make sure that N_BIG is a prime number.
  ▫ If the WaitQ and the WaitRate on Aft are not zero, increase the N_AFT_SECTION parameter to reduce the contention caused by processes trying to access the same AFT section simultaneously.
  ▫ If the WaitRate on Aimg is higher than 5 percent or the PollRate on Aimg is higher than 2 percent, increase the number of after-image log files. Then change the parameters N_AIMG and AIMG_MIN_BLKS to a larger corresponding number.
  ▫ If the WaitRate on Bimg is higher than 5 percent or the PollRate on Bimg is higher than 2 percent, increase the number of before-image log files. Then change the parameters N_BIMG and BIMG_MIN_BLKS to a larger corresponding number.
▪ Look at the === SHM INFO === field to see whether shared memory buffers are being obtained and freed normally. If the ShmPV is frequently nonzero, tune the parameters AVG_TUPLE_LEN and MIN_MEMORY_TEMP to larger values.
▪ Look at the log file related information to check the overflow state of the log file and verify that the space for the before-image/after-image log buffer is adequate.
  ▫ If the LogOvrflos is not zero, one or more of the log files is in overflow. Check the LogID file to see which log file is currently in an overflow state. Enlarge the overflowed log file (or reduce the checkpoint time if the log file space is not enough). Also, increase the size of the after-image/before-image log files and modify the logconfig table accordingly.
▪ Increase the priority of uvcm and uvarchive. If possible, set them to real-time priority.
▪ If most of the records in a file are less than 1000 bytes, make your file block size 1K.
▪ The CHKPNT_TIME should be 300 (seconds) or larger. If the checkpoint arrives and the Total in the log file related information is significantly smaller than the size of the log file, reduce the size of the log file.
▪ GRPCMT_TIME allows you to decrease the I/O to the after-image log files by having them written at intervals, rather than with each record. If GRPCMT_TIME is 0, the after-image log files are written with every record, and the constant I/O degrades system performance. Any integer greater than zero increases the number of records per physical write.
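The latching thresholds in the guidelines above can be applied mechanically to values read from the uvsysmon display. The following Python sketch is one hedged way to do that. The function name and the wording of the suggestions are illustrative, and the sketch reads the Bimg guideline as applying to before-image log files and the WaitQ/WaitRate guideline as applying to the Aft row, which are assumptions.

```python
WAIT_RATE_LIMIT = 5.0  # percent, from the guidelines
POLL_RATE_LIMIT = 2.0  # percent, from the guidelines

SUGGESTIONS = {
    "Big": "increase N_BIG (keep it a prime number)",
    "Aimg": "add after-image log files, then raise N_AIMG and AIMG_MIN_BLKS",
    "Bimg": "add before-image log files, then raise N_BIMG and BIMG_MIN_BLKS",
}


def latch_advice(latch_type, wait_q, wait_rate, poll_rate):
    """Return a tuning suggestion for one latching row, or None.

    wait_rate and poll_rate are percentages as shown by uvsysmon.
    """
    if latch_type == "Aft":
        # Guideline: act when both WaitQ and WaitRate are nonzero.
        if wait_q != 0 and wait_rate != 0:
            return "increase N_AFT_SECTION"
        return None
    if latch_type not in SUGGESTIONS:
        raise ValueError("unknown latch type: %r" % latch_type)
    if wait_rate > WAIT_RATE_LIMIT or poll_rate > POLL_RATE_LIMIT:
        return SUGGESTIONS[latch_type]
    return None


if __name__ == "__main__":
    # Hypothetical readings: (type, WaitQ, WaitRate %, PollRate %).
    for row in [("Big", 0, 6.2, 0.4), ("Aft", 0, 0.0, 0.0), ("Bimg", 1, 1.0, 2.5)]:
        print(row[0], "->", latch_advice(*row))
```

A single reading above a threshold is only a hint. Sampling over several intervals before changing a parameter avoids tuning for a transient spike.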

Tuning the N_PUT and N_BIG configuration parameters

The N_PUT parameter describes the total number of pages in the system buffer, and the page size is defined by the SB_PAGE_SZ uvconfig parameter. The N_BIG parameter acts as an index to N_PUT. If you increase N_PUT, you should also increase N_BIG. If N_BIG is too large, the number of semaphore operations will increase since each BIG has a semaphore control. This may also increase page swapping. If N_BIG is too small, there may be a lot of contention between database processes. The value of N_BIG should be the closest prime number to NUSERS * 5. It must be smaller than N_PUT.

Note: N_BIG must be a prime number. If it is not, you may experience poor performance.
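The rule for N_BIG (the prime closest to NUSERS * 5, and smaller than N_PUT) is easy to get wrong by hand. The sketch below searches outward from NUSERS * 5 for the nearest prime below N_PUT. The function names are hypothetical. When two primes are equally close, this sketch happens to prefer the smaller one, which is a choice the manual does not specify.

```python
def is_prime(n):
    """Trial-division primality test, adequate for configuration-sized values."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def suggest_n_big(nusers, n_put):
    """Return the prime closest to nusers * 5 that is smaller than n_put."""
    target = nusers * 5
    for distance in range(0, max(target, n_put) + 1):
        # Check the lower candidate first, then the upper one.
        for candidate in (target - distance, target + distance):
            if 2 <= candidate < n_put and is_prime(candidate):
                return candidate
    raise ValueError("no prime smaller than N_PUT is available")


if __name__ == "__main__":
    # Hypothetical values: 20 users, 1000 system buffer pages.
    print(suggest_n_big(20, 1000))
```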


Adjusting the log files

If uvsysmon is reporting log switches more frequently than checkpoint intervals, you may want to increase the size of your log files. If the log files switch before a checkpoint occurs, the log files have reached the 80 percent full mark before a checkpoint has taken place. It is better to increase the size of the log files than it is to have a large number of logs. Although there is no exact formula for determining the size of the log files, a starting place is to multiply the number of records expected for update during one checkpoint interval by the largest record size. Divide this number by the block size you have chosen in the log configuration table, and you have an estimation for the number of blocks needed for your log files. Increase the log length parameter in the logconfig file and run uvcntl_install. See Logging, on page 11.

Note: When changing the size of your log files, change the log length parameter and not the block size in the logconfig file located in the uvhome directory. The block size cannot exceed 16,384. In UNIX, you should always use the UNIX file system block size. If you cannot determine the UNIX file system block size on your system, use 4096.
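The starting estimate described above (records updated per checkpoint interval, multiplied by the largest record size, divided by the log block size) can be written as a small calculation. In this sketch the function name is hypothetical, the default block size of 4096 follows the fallback value in the note, and rounding up to a whole block is a choice made for the sketch.

```python
def estimate_log_blocks(records_per_checkpoint, largest_record_bytes, block_size=4096):
    """Estimate the number of blocks for the log files.

    block_size is the block size from the log configuration table and
    cannot exceed 16384.
    """
    if not 0 < block_size <= 16384:
        raise ValueError("block size must be between 1 and 16384")
    total_bytes = records_per_checkpoint * largest_record_bytes
    # Integer ceiling division: a partial block still needs a whole block.
    return -(-total_bytes // block_size)


if __name__ == "__main__":
    # Hypothetical workload: 10,000 updates per checkpoint, 2 KB largest record.
    print(estimate_log_blocks(10000, 2048))
```

Treat the result as a first value for the log length parameter, then watch LogSwitchd and LogOvrflos in uvsysmon to see whether it needs to grow.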

Adjusting the archive files

One archive file should be at least as large as one full set of log files. This is to ensure that one checkpoint does not span multiple archive files. If the archive files are too small, you will have to off-load them frequently as they fill. If the archive files are too large, the time to off-load the files may be unacceptable. Although the default for the number of archive files is 2, you may want to consider having more archive files.

Note: When all of the archive files are full, database processing pauses until the archive files have been off-loaded.

UniVerse Active File Table (AFT)

The database uses tables in memory to keep track of RFS files. The active file table (AFT) contains an entry for every unique file and tracks the number of files that are open system-wide. The size of the AFT is controlled by the database configuration parameter N_AFT, which has a default value of 5000. Dynamic files consume two slots and each file index consumes one slot. The AFT can become quite large when applications have many files open at once. You can configure the size of the table by adjusting the database configuration parameters. To view the values of these parameters, use the smat -t command. For detailed instructions on changing these parameters, refer to Modifying uvconfig parameters. To view the contents of the AFT, you can use the uvunload -listall command. This command will display each file currently loaded in the table, the number of times the file is opened, and the slot number where the file is located in the table.

Note: Use the uvsysmon utility to tune the AFT for performance purposes. The output section of the utility, called LATCHING STATISTICS, includes a WaitRate parameter for the AFT. If this parameter approaches 100%, the N_AFT parameter should be increased. If you see a sudden increase in WaitRate, this tuning can improve performance even if the value of WaitRate never exceeds 10%.
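The slot rules above give a rough way to check whether N_AFT is large enough for a known workload. The sketch below counts two slots per dynamic file and one per index, as stated. The function name is hypothetical, and the one-slot cost for every other file is an assumption drawn from the statement that the AFT holds an entry for every unique file.

```python
def aft_slots_needed(other_files, dynamic_files, indexes):
    """Estimate AFT slots used by a set of open files.

    other_files   -- unique non-dynamic files (assumed: one slot each)
    dynamic_files -- unique dynamic files (two slots each)
    indexes       -- file indexes (one slot each)
    """
    return other_files + 2 * dynamic_files + indexes


if __name__ == "__main__":
    # Hypothetical workload compared with the default N_AFT of 5000.
    needed = aft_slots_needed(other_files=1200, dynamic_files=900, indexes=400)
    print(needed, "slots needed; fits default N_AFT:", needed <= 5000)
```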


AFT sections

The system-wide AFT is divided into sections. The number of sections is defined by the N_AFT_SECTION uvconfig parameter. Dividing the AFT into multiple sections reduces wait times because different uvsh/user processes can search in different sections at the same time. It also reduces the number of entries each process needs to search through. AFT entries are hashed across the sections, similar to the way records in a database hashed file are hashed into groups. Each AFT section can be searched by only one uvsh/user process at a time because the section is locked during the search. The default value of the N_AFT_SECTION parameter is 17. To increase the number of sections, change the parameter in the uvconfig file. The new value must be greater than 0, less than N_AFT/2, and a prime number. It is recommended that you increase this number in small increments and review the results using the uvsysmon utility.

AFT hash buckets

Each AFT section contains a set of hash buckets and each AFT entry is hashed into one of the hash buckets. The two levels of hashing make the search process faster because a given uvsh/user process does not have to read all of the entries in the table (or section) to find a particular entry.

The number of hash buckets can be configured with the N_AFT_SECTION_BUCKET uvconfig parameter. The default value for this parameter is 23. This parameter must be a prime number and greater than the value of the N_AFT_SECTION parameter. The average number of files to be loaded into a hash bucket in the AFT can be calculated using the following formula:

(N_AFT/N_AFT_SECTION)/N_AFT_SECTION_BUCKET

If the default uvconfig settings are used, this formula, (5000/17)/23, results in approximately 13 files per hash bucket. When an AFT section is full, files are removed after the following conditions are met:
▪ The system buffer does not contain modified file pages.
▪ A checkpoint is complete and all pages in the system buffer are marked as clean or unmodified.

If the N_AFT parameter is increased, consider increasing the values of the N_AFT_SECTION and N_AFT_SECTION_BUCKET parameters to keep the average number of files in a hash bucket to a relatively low number for performance purposes. The following conceptual diagram illustrates how an AFT and AFT sections could be configured based on the N_AFT_SECTION and N_AFT_SECTION_BUCKET parameters:
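Separately from the diagram, the bucket arithmetic and the constraints on the two section parameters can be checked with a short calculation before you edit uvconfig. The Python sketch below is illustrative only. The function names are hypothetical, and the checks restate the rules given in the text (N_AFT_SECTION prime, greater than 0 and less than N_AFT/2; N_AFT_SECTION_BUCKET prime and greater than N_AFT_SECTION).

```python
def is_prime(n):
    """Trial-division primality test for small configuration values."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def avg_files_per_bucket(n_aft, n_aft_section, n_aft_section_bucket):
    """(N_AFT / N_AFT_SECTION) / N_AFT_SECTION_BUCKET."""
    return (n_aft / n_aft_section) / n_aft_section_bucket


def check_aft_settings(n_aft, n_aft_section, n_aft_section_bucket):
    """Return a list of rule violations; an empty list means none were found."""
    problems = []
    if not 0 < n_aft_section < n_aft / 2:
        problems.append("N_AFT_SECTION must be > 0 and < N_AFT/2")
    if not is_prime(n_aft_section):
        problems.append("N_AFT_SECTION must be a prime number")
    if not is_prime(n_aft_section_bucket):
        problems.append("N_AFT_SECTION_BUCKET must be a prime number")
    if n_aft_section_bucket <= n_aft_section:
        problems.append("N_AFT_SECTION_BUCKET must be greater than N_AFT_SECTION")
    return problems


if __name__ == "__main__":
    # Default settings from the text: about 13 files per bucket.
    print(round(avg_files_per_bucket(5000, 17, 23), 2))
    print(check_aft_settings(5000, 17, 23))
```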


UniVerse session hash buckets

Each UniVerse session has an active file table divided into hash buckets, just like the system-wide AFT. This table can be used to improve AFT table searching when a UniVerse session has a file opened in the system buffer. You can change the number of hash buckets per session by increasing the value of the SESSION_AFT_BUCKET parameter in the uvconfig file. This number must be greater than 0 and should be a prime number.

UniVerse session open file limit

Starting at UniVerse v12.1.1, each session is limited in the number of logical files the application can open. When this limit is reached, attempts to open additional files will fail. This hard limit is based on the number of logical files that can be opened by the application and is unaffected by the rotating file pool. The default value of the number of logical files open per session is based on a multiple of the current MFILES setting, which can be set using the SESSION_NFILES uvconfig parameter. Refer to Modifying uvconfig parameters for additional information.

Sizing shared memory segments

The use of shared memory segments is new in UniVerse 12.1. The following sections describe how to determine the required shared memory segment sizes for the system buffer, the system control, and the global lock manager. These memory segments must be sized appropriately regardless of whether RFS is enabled or disabled.

Calculating the system buffer's shared memory segment size

One shared memory segment is used for the system buffers. In previous versions of UniVerse, each user process loaded the data they needed into their own private memory segment. In UniVerse 12.1, the data is loaded into a shared memory buffer that is accessible by all users. This allows another user who needs the same data to read it from the buffer rather than from disk, alleviating the need to store the data multiple times in memory.

The configuration of the size of the system buffer's shared memory segment is vital for performance. If the segment is too small, there will not be enough space to store the necessary data, and UniVerse will spend a lot of time reading and writing data back and forth to disk. Conversely, if the segment is too big, there will not be enough room for the operating system to allocate memory for all other processes, and it will swap pages unnecessarily. This memory segment should be sized based on the amount of data that needs to be in memory on a regular basis for all processes to function.

To calculate the system buffer's shared memory segment size, you must configure two parameters in the uvconfig file, which is located in uvhome. The calculation is:

N_PUT * SB_PAGE_SZ

For descriptions of these parameters see N_PUT, on page 78 and SB_PAGE_SZ, on page 79.
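As a worked illustration of the calculation above, the sketch below multiplies the two parameters. The function name and the example values are hypothetical. The result is in whatever unit SB_PAGE_SZ is expressed in, so check the parameter description before comparing the result with physical memory.

```python
def system_buffer_size(n_put, sb_page_sz):
    """N_PUT * SB_PAGE_SZ, in the same unit as SB_PAGE_SZ."""
    if n_put <= 0 or sb_page_sz <= 0:
        raise ValueError("N_PUT and SB_PAGE_SZ must be positive")
    return n_put * sb_page_sz


if __name__ == "__main__":
    # Hypothetical values, not defaults taken from the manual.
    print(system_buffer_size(n_put=8192, sb_page_sz=16))
```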

Calculating the system control shared memory segment size

The system control segment, sysctl, is another shared memory segment that was added in UniVerse 12.1. This segment maintains the active file table in addition to the before-image and after-image logs for RFS. Even if you do not enable RFS, you must still ensure that the sysctl segment is properly sized for your active file table.

Configuration without RFS

If you have not enabled RFS, you can determine the system control segment size by configuring the N_AFT uvconfig parameter. N_AFT specifies the maximum number of files that can be opened at the same time. Each active file entry takes slightly more than ½ KB.

Without RFS enabled, the sysctl segment size calculation is:

N_AFT * .54

For more information about the N_AFT parameter, see N_AFT, on page 61.

Configuration with RFS

With RFS enabled, the sysctl segment also includes space for the before-image and after-image transactions:

(N_BIMG * BIMG_BUFSZ / 1024) + (N_AIMG * AIMG_BUFSZ / 1024)

The N_BIMG and N_AIMG parameters should be adjusted to account for the maximum number and size of transactions that occur within the amount of time between snapshots. For more information about these parameters, see N_BIMG, on page 78, BIMG_BUFSZ, on page 75, N_AIMG, on page 77, and AIMG_BUFSZ, on page 73.
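The two sysctl calculations can be combined in a short Python sketch. It assumes that the result is in kilobytes (each active file entry is about 0.54 KB, and the buffer sizes are given in bytes and divided by 1024) and that, with RFS enabled, the log buffer space is added to the active file table space. Both points are readings of the text above rather than statements from the product, and the sample values are hypothetical.

```python
def sysctl_size_kb(n_aft, n_bimg=0, bimg_bufsz=0, n_aimg=0, aimg_bufsz=0):
    """Estimate the sysctl segment size in KB.

    Leave the log arguments at 0 to model a system without RFS.
    Assumption: with RFS, the log term is added to the AFT term.
    """
    aft_kb = n_aft * 0.54  # each active file entry is slightly more than 1/2 KB
    log_kb = (n_bimg * bimg_bufsz / 1024) + (n_aimg * aimg_bufsz / 1024)
    return aft_kb + log_kb


# Hypothetical values for illustration only.
print(round(sysctl_size_kb(n_aft=1000), 2))
print(round(sysctl_size_kb(1000, n_bimg=2, bimg_bufsz=10_485_760,
                           n_aimg=2, aimg_bufsz=10_485_760), 2))
```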

Calculating the global lock manager shared memory segment size

The final shared memory segment that needs to be considered is the global lock manager segment, glm. As a sizing estimate, a default 4MB segment should handle about 10,000 locks. Using the GLM_MEM_SEGSZ uvconfig parameter, the calculation of this segment is GLM_MEM_SEGSZ / 1024. When this segment fills up, another segment of the same size is created. This process continues until 16 segments are created. For more information about GLM_MEM_SEGSZ, see GLM_MEM_SEGSZ, on page 75.

Tuning system parameters for the uvdb user

Starting in UniVerse 12.1, each user connected to the database has an additional process that helps facilitate runtime operations. This process runs as the uvdb user, and tunable system parameters must be configured to handle it. To check the maximum number of processes allowed per user on AIX, enter the following command:

lsattr -El sys0 -a maxuproc

To change this value, enter the following command:

chdev -l sys0 -a maxuproc=newvalue
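The check described above can be scripted. The following Python sketch parses the output of the lsattr command and compares the value with the number of sessions you expect. It assumes the usual whitespace-separated lsattr output in which the second field is the value, and it assumes that the uvdb user needs at least one process per connected session; the function names, the headroom argument, and the sample output line are illustrative.

```python
def parse_maxuproc(lsattr_output: str) -> int:
    """Extract the maxuproc value from `lsattr -El sys0 -a maxuproc` output."""
    for line in lsattr_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "maxuproc":
            return int(fields[1])
    raise ValueError("maxuproc not found in lsattr output")


def maxuproc_is_sufficient(maxuproc: int, expected_sessions: int, headroom: int = 0) -> bool:
    """Assumption: one uvdb-owned process per session, plus optional headroom."""
    return maxuproc >= expected_sessions + headroom


# Illustrative output line; the exact text on your system may differ.
sample_output = "maxuproc 128 Maximum number of PROCESSES allowed per user True"
limit = parse_maxuproc(sample_output)
print(limit, maxuproc_is_sufficient(limit, expected_sessions=200))
```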

Chapter 7: Troubleshooting RFS

This chapter contains sample error messages that you may encounter when running RFS, along with possible reasons for them.

Failure of UniVerse to start

A variety of circumstances may prevent the database from starting. When you invoke the uv -admin -start command and the database fails to start, you may see messages similar to the following examples:

# uv -admin -start
Starting UniVerse...
Couldn't start UVSMM. Please check uvhome/uvsmm.errlog and uvhome/uvsmm.log
UniVerse is NOT started.

# cat uvsmm.errlog
Checking "uvhome/acct_licn.def" ......
No account-based license definition found, using default configuration.

Logs are too small

Error messages are written to the uvhome/uvsmm.errlog file when the file-level log is too small, and when the bimglog and aimglog files are too small.

The file-level log is too small

The size of the file-level log file must be at least NUSERS + 1. If this is not the case, an error message similar to the following is written to the uvhome/uvsmm.errlog file:

N_AFT_BUCKET should be a prime multiple of N_AFT_SECTION (7); using 119.
Warning: Your AFT MLF buckets are large; this slows opens of multilevel files, and may slow down crash and media recovery.
ERROR in function U_s_setupsca(), file s_setsca.c:

file log too small — called by process 62969
File log size should be at least 257 blocks.
Please change the file log size and run uvcntl_install, then startup system.

To correct this, increase the value of the file-level log size in the logconfig file, run uvcntl_install, and start the database.

Note: The database overwrites the uvsm.log file every time you execute uv -admin -start. If you execute this command more than once, check the uvsm.log file located in the uvhome/saved_logs directory, which includes the last twenty uvsm.log files.
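The minimum file-level log size stated in this section is simple to check before running uvcntl_install. The Python sketch below applies the NUSERS + 1 rule; the function names are illustrative. The 257 blocks in the sample error message would correspond to a NUSERS value of 256, which is an inference from the rule rather than something the message states.

```python
def min_file_log_blocks(nusers: int) -> int:
    """Smallest file-level log size allowed by the NUSERS + 1 rule."""
    return nusers + 1


def file_log_ok(configured_blocks: int, nusers: int) -> bool:
    """True when the configured file-level log is large enough."""
    return configured_blocks >= min_file_log_blocks(nusers)


print(min_file_log_blocks(256))          # 257
print(file_log_ok(200, nusers=256))      # False
```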

Bimglog and aimglog files are too small

If the bimglog and aimglog files defined in the logconfig file are not large enough for your application and load, a checkpoint will occur when the log files are 80 percent full, regardless of the checkpoint time defined in uvconfig. A message will appear in uvhome/uvsm.log similar to the following:

# cat uvsm.log
Checking log files .....
----- SM (19629) is started at Dec 18 2017 14:56:07 -----

UniVerse Environment : uvhome

Mon Dec 18 14:56:07 SM: Restart_Flag = 0.
CM, Mon Dec 18 14:57:14 2017:WARNING: The log file [uvhome/rfslog/aimgbimg/b_0000] is too small to contain the log records.Please enlarge the log files.
CM, Mon Dec 18 14:57:15 2017:WARNING: The log file [uvhome/rfslog/aimgbimg/b_0002] is too small to contain the log records.Please enlarge the log files.
CM, Mon Dec 18 14:57:16 2017:WARNING: The log file [uvhome/rfslog/aimgbimg/b_0000] is too small to contain the log records.Please enlarge the log files.
CM, Mon Dec 18 14:57:18 2017:WARNING: The log file [uvhome/rfslog/aimgbimg/b_0002]

The database should not encounter any error conditions if the log files are too small, but checkpoints will occur frequently, causing system performance to degrade.

Stop the database and increase the size of the log files in the logconfig file. Run uvcntl_install and start the database using the uv -admin -start command.

Note: Rocket recommends that you increase the size of the log files by increasing the log length parameter in the file, rather than increasing the block size. In UNIX, make sure the block size corresponds to the UNIX file system block size, or 4096 if you do not know the UNIX file system block size.
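To see which log files trigger the warning most often, you can count the warnings in a copy of uvsm.log. The following Python sketch is one way to do that; the regular expression is based only on the warning text shown in the excerpt above, so adjust it if the wording on your system differs.

```python
import re
from collections import Counter

# Pattern based on the warning text in the uvsm.log excerpt above.
WARNING_RE = re.compile(r"WARNING: The log file \[([^\]]+)\] is too small")


def count_small_log_warnings(log_text: str) -> Counter:
    """Count 'too small' warnings per log file path."""
    return Counter(WARNING_RE.findall(log_text))


sample = (
    "CM, Mon Dec 18 14:57:14 2017:WARNING: The log file "
    "[uvhome/rfslog/aimgbimg/b_0000] is too small to contain the log records.\n"
    "CM, Mon Dec 18 14:57:15 2017:WARNING: The log file "
    "[uvhome/rfslog/aimgbimg/b_0002] is too small to contain the log records.\n"
    "CM, Mon Dec 18 14:57:16 2017:WARNING: The log file "
    "[uvhome/rfslog/aimgbimg/b_0000] is too small to contain the log records.\n"
)
for path, hits in count_small_log_warnings(sample).most_common():
    print(path, hits)
```

A steadily growing count between restarts suggests that the log length should be increased.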

UniVerse daemon killed

If one of the daemons is killed during processing, the database will shut down. The database writes a message similar to the following example to the uvhome/uvsm.log:

Mon Dec 18 11:38:50 SM: Restart_Flag = 0.
Mon Dec 18 11:39:29 SM checked: cm (pid = 17712):
Mon Dec 18 11:39:29 Stopped because of
Mon Dec 18 11:39:29 Kill
Mon Dec 18 11:39:29
Mon Dec 18 11:39:29 ----- System Crashed at Dec 18 2017 11:39:29 -----
Mon Dec 18 11:39:29 All possible CM & TMs & AIMGLOGs & BIMGLOGs killed
Dumping the system buffer to "uvhome/rfs.dump/rfs.dump"...... Done.
RM has already been stopped.

Must be Superuser to stop uvcleanupd Daemon
Stopping UniVerse with force option...
RM has been stopped.
Uvcleanupd has been stopped.
Unirpcd has been stopped.
Mon Dec 18 11:39:34 ----- SM (force) Shutdown at Dec 18 2017 11:39:34 -----
SM stopped successfully.
UVSMM has been stopped.
UniVerse has been brought down.
#

Execute the showuv command to make sure all the remaining daemons are stopped. If they are not, stop the database with the uv -admin -stop -force command to force the remaining daemons to shut down. Check the error logs in uvhome/saved_logs to see if the database detected any circumstances which may have caused the process to be killed, and resolve any errors. Then restart the database.

Archive files are full

If you are running archiving as part of RFS and the database appears to hang, it may be that the archive files are full and the database is waiting for the full archive to be off-loaded. A message appears on the system console, in the window where the database was started, and in the uvhome/uvsm.log, indicating that the archive file is full and must be off-loaded. Contact your system administrator to copy the archive to the appropriate place.

Note: Rocket recommends that you use the uvarch_backup script to automatically off-load full archive files to tape or to disk to ensure that the system does not appear to hang during normal processing. One archive file must be at least as large as one full set of log files so that one checkpoint does not cross multiple archive files.
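The sizing rule in the note above can be checked with a few lines of Python. This sketch assumes that "one full set of log files" means the combined size of every file in a single log set; the function name and the byte values are hypothetical.

```python
def archive_large_enough(archive_bytes: int, log_set_file_bytes: list) -> bool:
    """True when one archive file can hold one full set of log files.

    Assumption: a full log set is the sum of the sizes of its files.
    """
    return archive_bytes >= sum(log_set_file_bytes)


# Hypothetical sizes in bytes: a 400 MB archive and four 50 MB log files.
print(archive_large_enough(419_430_400, [52_428_800] * 4))   # True
```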

Parameter limits exceeded

A variety of conditions can cause a UniBasic program to report a run-time error or execute an ELSE clause in an open file statement. If the UniBasic program is performing an operation against a database file, you may have exceeded a parameter defined in uvhome/uvconfig.

MAX_OPEN_FILE

The MAX_OPEN_FILE parameter is used by UniBasic as the maximum number of open files allowed per process for all types of hashed files, including recoverable and nonrecoverable dynamic and static files. If this limit is exceeded, the database displays a run-time error message (too many open files at line nn).

N_AFT

The N_AFT parameter is the system-wide limit on the number of different database files and their indexes that can be opened at one time. This is the number of slots in the system buffer's AFT (active file table). A dynamic file takes up two entries. Each index of a data file takes up one entry. If more than one user opens the same file, it is counted only once.

Note: If you exceed this limit, a UniBasic program will execute the ELSE clause in an open file statement.
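The counting rules for N_AFT can be turned into a rough estimate of how many AFT slots a workload needs. The Python sketch below applies the rules stated above and additionally assumes that a static file takes one entry, which the text implies but does not state. The OpenFile class and the file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenFile:
    path: str
    dynamic: bool = False   # a dynamic file takes two AFT entries
    index_count: int = 0    # each index takes one AFT entry


def aft_entries(open_files) -> int:
    """Estimate AFT slots; a file opened by several users is counted once."""
    unique = {f.path: f for f in open_files}
    # Assumption: a static (non-dynamic) file takes one entry.
    return sum((2 if f.dynamic else 1) + f.index_count for f in unique.values())


files = [
    OpenFile("ORDERS", dynamic=True, index_count=2),   # opened by user 1
    OpenFile("ORDERS", dynamic=True, index_count=2),   # same file, user 2
    OpenFile("PARTS"),
]
print(aft_entries(files))   # 5
```

Comparing the estimate with N_AFT shows how much headroom remains before open statements start taking their ELSE clause.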

Appendix A: RFS commands and daemons

This appendix describes the commands and the daemons that are associated with RFS.

Note: Throughout this document, uvhome is used to indicate the location where UniVerse is installed.

RFS commands

This section describes the commands that are used by RFS.

uvcntl_install

The uvcntl_install command is a database system-level command that initializes log files and archive files and re-initializes the uvrfs.control.file, system.status, restart.fileend, and restart.newblk files. These files are located in the uvhome directory.

Syntax

uvcntl_install [-forcerestart]

Parameter

The following table describes the forcerestart parameter:

Parameter        Description
-forcerestart    Prompts if you want to continue restarting the database and attempts to open the system.status file. If the database cannot open this file, it tries to create a new one.

If the status in the system.status file reports the system is already in system recovery mode, the database returns a message similar to: System is already in crash recovery status (status). You might want to remove the system.status file and rerun uvcntl_install -forcerestart.

If the status in the system.status file reports an unrecognized code, the database returns a message similar to: System is in unknown status (status), will be forced to recovery mode.

Example

The following example shows output from the uvcntl_install command:

# uvcntl_install
uvcntl_install utility resets UniVerse System after a full database backup (Image Copy). This means, all log (and archive) files will also be initialized for re-use.
Do you want to continue?(y/n) [n] y
Installing Logs (and Archives) after uvcntl_install
......

Caveats

Be aware of the following before you execute the uvcntl_install command:
▪ You must log on as root/administrator to execute uvcntl_install.
▪ The database must not be running.
▪ You should run the command only after you have stopped the database and created and verified a full backup to keep your backups and logs synchronized.
▪ Do not run uvcntl_install if you used the SUSPEND.FILES ON or uv -admin -L command prior to creating your backup.
▪ If you choose to install the database with RFS, the installation process automatically runs uvcntl_install.
▪ Do not use uvcntl_install immediately after a crash, because the command overwrites the log and archive files, which prohibits you from recovering from a crash. Use the command only after all required recovery is complete.

Note: Relative paths are not allowed in the logconfig or archconfig files. If uvcntl_install detects a relative path, it fails and does not initialize the logs and archives. The database displays an error message indicating that relative paths are not allowed. If this occurs, you can edit the configuration files and re-execute uvcntl_install.

More information on logging and archiving is provided in Logging, on page 11 and in Archiving, on page 19.

uvrfs.control.file

When you install the database, the system automatically creates the binary data file uvrfs.control.file, which contains information including locations for log and archive files (if archiving is turned on). This file also tracks the logical sequence number the database assigns to archive files as they fill.

If you stop the database to perform a full backup, you should execute uvcntl_install to reinitialize uvrfs.control.file. If you use the SUSPEND.FILES ON or the uv -admin -L command prior to performing a full backup, you do not need to reinitialize the uvrfs.control.file by executing uvcntl_install.

Warning: The uvrfs.control.file is vital to the successful operation of RFS. Do not delete or edit this file.

The uvrfs.control.file is located in uvhome.

system.status file

The system.status file registers the current status of RFS and cannot be removed.

Warning: If this file is deleted, the database continues to run, but you cannot start the database if it stops for any reason until you either restore the system.status file from tape or execute uvcntl_install to re-create the file. In either case, you can only recover to the last completed archive.

The system.status file is located in the uvhome directory.

restart.newblk file

The restart.newblk file is used internally by the database when performing crash recovery, and cannot be removed.

Warning: If this file is deleted, crash recovery will fail when you start the database.

The restart.newblk file is located in the uvhome directory.

restart.fileend file

The restart.fileend file is used internally by the database when performing crash recovery, and cannot be removed.

Warning: If this file is deleted, crash recovery fails when you start the database.

The restart.fileend file is located in the uvhome directory.

uvforcecp

The system-level uvforcecp command allows you to force a checkpoint for your system. This allows you to create a forced checkpoint to register your system at a given point in time.

You can execute the uvforcecp command directly from the uvhome/bin directory.

Syntax

uvforcecp

uv -admin -mediarec

The uv -admin -mediarec command restores changes to database files by applying archives since the last backup.

Conditions

To use the uv -admin -mediarec command, your system must meet the following conditions:
▪ You must have a full backup of your system.
▪ You must have had archiving turned on since that backup.
▪ You must have saved your archive sets when they filled.
▪ The uvhome directory must exist.

Syntax

uv -admin -mediarec [-s [MM:DD:YY:]HH:MM[:SS]] [-e [MM:DD:YY:]HH:MM[:SS]] [-f path/filename] [-T start_LSN[,end_LSN]]

Parameters

The following table describes each parameter of the syntax:

Parameter    Description
-s           Specifies the recovery start time. If you do not use the -s option, the whole archive set (from the last backup to current) is recovered.
-e           Specifies the recovery end time. If you do not use the -e option, the whole archive set (from the last backup to current) is recovered.
-f           Specifies a file that contains a list of files (one path and file name per line) to recover. If you do not use the -f option, the uv -admin -mediarec command recovers all files.
-T           Specifies the starting logical sequence number (LSN) and the ending LSN for media recovery. If you only specify the starting LSN, the uv -admin -mediarec command will prompt for the next sequential LSN.
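The options above can be assembled programmatically, which helps avoid formatting mistakes in the time and LSN arguments. The following Python sketch builds an argument list from the documented syntax; it does not run the command. The function name is hypothetical, and the time check assumes two-digit fields in the [MM:DD:YY:]HH:MM[:SS] pattern.

```python
import re

# Assumption: every field in [MM:DD:YY:]HH:MM[:SS] is exactly two digits.
TIME_RE = re.compile(r"(\d{2}:\d{2}:\d{2}:)?\d{2}:\d{2}(:\d{2})?")


def build_mediarec_args(start=None, end=None, file_list=None,
                        start_lsn=None, end_lsn=None):
    """Return the argument list for uv -admin -mediarec (not executed here)."""
    args = ["uv", "-admin", "-mediarec"]
    for flag, value in (("-s", start), ("-e", end)):
        if value is not None:
            if not TIME_RE.fullmatch(value):
                raise ValueError(f"bad time for {flag}: {value!r}")
            args += [flag, value]
    if file_list is not None:
        args += ["-f", file_list]
    if start_lsn is not None:
        lsn = str(start_lsn) if end_lsn is None else f"{start_lsn},{end_lsn}"
        args += ["-T", lsn]
    elif end_lsn is not None:
        raise ValueError("end_lsn requires start_lsn")
    return args


print(build_mediarec_args(start="09:30", start_lsn=0))
```

The resulting list could be passed to a process launcher once you have confirmed the values; media recovery still prompts for confirmation as shown in the example below.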

Examples

In the following example, the uv -admin -mediarec command restores a database:

# bin/uv -admin -mediarec -T0

For media recovery, you would be required to have space for two temporary files, one to hold the largest archive file and another to hold the largest CP size. Please more information, please read documentation about media recovery procedure and re-start media recovery.

Max CP Size (in bytes): 316416 Max Arch File Size (in bytes): 419430400

Also, if you're planning to use the tape(s) created by archive process, please setup restore script uvhome/bin/arch_restore properly (tape device) and load the first archive tape.

Do you want to continue (y/n) [n]? y

Starting uvsmm... Starting uvsm...(option: -m -T 0)

Please check the mediarec results in uvhome/uvsm.log

Stopping uvsmm...

cat uvsm.log
rst_activate_key() called with empty implementation

For media recovery, you'll be asked to upload archive files one by one by sequence number into the uvhome/rfslog/mediarec_tmp/ARCH file.
media_file_recovery_repstyle() : op=CREATE.FILE, cwd=/home/jjones/testarch, started:
Creating file "BB" as Type 3, Modulo 7, Separation 2.
Creating file "D_BB" as Type 3, Modulo 1, Separation 2.
Added "@ID", the default record for RetrieVe, to "D_BB".
media_file_recovery_repstyle() : op=CREATE.FILE, done ret=1......
media_file_recovery_repstyle() : op=CNAME, cwd=/home/jjones/testarch, started:
Changed operating system file name from "BB" to "BBBB".
Changed operating system file name from "D_BB" to "D_BBBB".
Changed "BB" to "BBBB" in your VOC file.
media_file_recovery_repstyle() : op=CNAME, done ret=1.

......
media_file_recovery_repstyle() : op=DELETE.FILE, cwd=/home/jjones/testarch, started:
DELETEd file "D_TEST", Type 3, Modulo 1.
Field 3 in file definition record "TEST" has been set to nothing.
media_file_recovery_repstyle() : op=DELETE.FILE, done ret=1.
media_file_recovery_repstyle() : op=DELETE.FILE, cwd=/home/jjones/testarch, started:
DELETEd file "TEST", Type 7, Modulo 13.
DELETEd file definition record "TEST" in the VOC file.
media_file_recovery_repstyle() : op=DELETE.FILE, done ret=1.

Total number of checkpoints : 1
Total redo log records : 1849
Update operations : 1841
Create.file : 1
Delete.file : 2
Cname.file : 1

****!!! Media Recovery Finished !!!****
SM stopped successfully.

uv -admin -start

The UniVerse system-level command uv -admin -start starts the database background daemons, including the uvsm daemon, and re-initializes the uvhome/uvsm.log file. If RFS is enabled, this command also starts RFS and automatically recovers the database files if a system crash occurs.

Note: The last twenty uvsm.log files are appended in the uvsm.log file located in uvhome/saved_logs.

Syntax

uv -admin -start [-init] [-writethrough]

Parameters

The following table describes the options of the uv -admin -start command that affect RFS:

Option           Description
None             Starts all UniVerse processes in the correct order, checks to see whether a system crash occurred, and automatically performs crash recovery if needed.
-init            Starts the replication manager daemon (uvrepmanager) with the -i option. When RFS is enabled, this option bypasses the automated crash recovery sequence and initializes RFS status.
-writethrough    Starts the RFS system buffer in writethrough mode even if RFS is enabled. (RFS is enabled by having the RFS_MODE uvconfig parameter set to 1.) Normally, when RFS is disabled, the system buffer is in writethrough mode and when it is enabled, the writethrough mode is off. When writethrough mode is enabled, updates are written directly to disk rather than by the RFS daemons.

No messages are displayed on the screen if crash recovery is complete. Check uvhome/uvsm.log for information, especially if you notice anything unusual about the startup.

Note: Do not put uv -admin -start in your boot startup script, because you will not be able to control crash recovery operations.

Examples

In the following examples, the uv -admin -start command starts the database and RFS:

# bin/uv -admin -start
Starting UniVerse...
UVSMM is started.
Starting uvsm...
Unirpcd is started.
Uvcleanupd is started.
UniVerse has been brought up.

uvunload

The UniVerse system-level command uvunload unloads data files from the system buffer so you can perform OS-level commands, such as rm, mv, etc. The user must have root privileges (“root” in Unix or System Administrator in Windows) to execute this command.

Conditions

Before performing the specified action on the file, uvunload will open the file for exclusive access. If an exclusive open lock cannot be obtained, the command will exit without processing.

When more than one file is being processed by uvunload, the following conditions apply: ▪ When unloading multiple files from the system buffer, the default action is to stop processing when an exclusive open lock cannot be obtained on a file in the list. The use of the force option will result in that file being skipped and processing will continue with the next file in the list.

▪ When using the remove action, the force or atomic options can be specified. The default behavior is to stop when a file cannot be opened for exclusive access. The force option allows for that file to be skipped and processing continues with the next file. The atomic option will ensure either all files in the list are removed, or none are removed.

▪ For all other actions, uvunload behaves as if the atomic option was specified. The command will attempt to lock every file in the list before performing the action on any file. If an exclusive open lock cannot be obtained on any file in the list, the command will exit without processing any file.

Syntax

uvunload [--unloadall | --listall | --checkall | --remove | --move | --oscommand string] [--force | --atomic] [--recursive] [--silent | --verbose] […]

The following table describes each parameter of the syntax.

Parameter      Description
No Action      Unloads the specified files from the system buffer. If more than one file is specified and some files cannot be opened exclusively, the command will stop and additional files will not be processed. If only one directory is specified, the --recursive option is used, which results in more than one file being processed. For exceptions, use the --force option.
--unloadall    No file name is needed for this option, as it unloads all files in the system buffer. System Administrator privileges are required to complete this action.
--listall      No file name is needed for this option, as it lists all files in the system buffer using the AFT information.
--checkall     No file name is needed for this option. Checks the validity of all files in the system buffer. System Administrator privileges are required for this action. For every entry in the AFT, the file name is used to obtain the status of the file in the OS file system. The i-number and d-number from this status are compared to the i-number and d-number values stored in the AFT. If they match, the AFT entry is valid and the Exit Code is set to 0. Otherwise, the AFT entry is invalid and the Exit Code is set to 3.
--remove       Unloads the specified files from the system buffer and removes the files from the OS file system. Internally, the OS rm command is used. If more than one file is specified and some files cannot be exclusively opened, the command will stop and additional files will not be processed. If only one directory is specified, the --recursive option is used, which results in more than one file being processed. For exceptions, use the --force or --atomic options.
--move         Unloads the specified files from the system buffer and moves them in the OS file system. At least two file names are required. If there are more than two file names, the last file must be a directory. Internally, the OS mv command is used. If more than one file is specified, the move action will be executed only after all specified files are successfully unloaded.
--force        This option is used with the remove action or the default action. If this option is specified, the operation will skip any file that cannot be exclusively opened and continue to process the next file.
--atomic       This option is only used with the remove action. If this option is specified, the command ensures that either all of the specified files are removed or none of them are. This is achieved by locking all specified files before any file is removed. Note that on Windows, the process of physically removing a file might fail if the file is opened by a non-UniVerse process.
--oscommand    Unloads the specified files from the system buffer and executes the user-provided OS command string. If more than one file is specified, the OS command will only be executed after all specified files are successfully unloaded.
--recursive    Recursively applies the command to subdirectories.
--silent       Keeps silent. Will not report error messages or summary messages.
--verbose      Reports detailed messages. Without this option, only summary messages will be printed at the end of the command.
filename       The OS file name on which the operation will act.
@filelist      The name of an OS file containing a list of file names, one file name per line.

Command Exit Code:
▪ 0: Successful. If the command is performed on more than one file, this code indicates that the command completed successfully on all files. A failure on any file will result in a non-successful result code.
▪ 1: Failed because the process failed to obtain an exclusive open on the file.
▪ 2: Failed while running the OS command.
▪ 3: Failed due to other errors.
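A script that calls uvunload can translate these exit codes into readable messages. The Python sketch below is a minimal lookup based on the list above; the names are illustrative and the function does not invoke uvunload itself.

```python
# Meanings taken from the exit code list above.
UVUNLOAD_EXIT_CODES = {
    0: "Successful on all files",
    1: "Failed to obtain an exclusive open on the file",
    2: "Failed while running the OS command",
    3: "Failed due to other errors",
}


def describe_uvunload_exit(code: int) -> str:
    """Return a readable description for a uvunload exit code."""
    return UVUNLOAD_EXIT_CODES.get(code, f"Unexpected exit code {code}")


print(describe_uvunload_exit(1))
```

In a wrapper script, the value passed in would typically be the return code reported by the process launcher after uvunload finishes.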

Examples

The following command flushes the ORDERS file from the system buffer, saves the file as ORDERS.save, and releases the exclusive lock on the file:

uvunload --oscommand "cp ORDERS ORDERS.save" ORDERS

The following command unloads three files (CUSTOMER, ORDERS, and PARTS) from the system buffer:

$ cat mylist
CUSTOMER
ORDERS
PARTS
$ uvunload @mylist

RFS daemons

This section describes the daemons that are used by RFS.

uvsm

The uv -admin -start command invokes the uvsm (system monitor) daemon. uvsm checks the integrity of your database installation—in other words, uvsm examines log files upon startup for evidence of a crash, then starts the appropriate processes as needed to restore integrity. uvsm also monitors processing for certain error conditions, such as loss or unavailability of the disk containing before or after-image logs, and stops the database in as controlled a manner as possible if these conditions are detected. uvsm is the parent of the processes described in the following table:

Process         Description
restart         Crash recovery process. When you start the database, uvsm checks for a system crash. If it detects one, uvsm starts this process to determine the current log set and applies before and after image logs to your files. The uv -admin -mediarec command also uses this process, with different parameters, to restore from archives.
uvcm            Checkpoint manager. This process writes “dirty pages” (changed records) to disk, switches log sets between active and inactive status, and archives after-image logs. It performs these steps according to a user-defined checkpoint interval, or as required by the system.
uvarchive       Archive process. This process writes after-image log sets to archive files, marks log sets as archived and notifies the user if archives fill. You can save filled archives to tape automatically.
uvbimglog       The before-image log process. The uvsh/user process writes before-images of blocks to the shared memory buffer for this process, and uvbimglog then writes the blocks to the before-image log files on disk.
uvaimglog       The after-image log process. The uvsh/user process writes changes to records into the shared memory buffer for this process, and uvaimglog then writes the records to the after-image log files on disk.
uvar_backupd    The archive backup daemon. Exists only if you turn on automated backup of archive files.
sync            If you notice significant performance degradation during a checkpoint, you can start syncing daemons, which periodically flush updated pages from the system buffer to the log files.

uvsh/user

Each database session has an associated uvsh/user process, a transaction manager with root privileges.

The uvsh/user processes access recoverable files through the system buffer, which is organized into pages based on the SB_PAGE_SZ uvconfig parameter. They also perform all user access to recoverable files and log processes.

If you have enabled transaction processing, all transaction semantics (such as ABORT and COMMIT) are also handled by uvsh/user. Each uvsh/user process runs with root privileges, so users other than root cannot kill a uvsh/user. If a uvsh/user process dies, uvsm detects this and stops the database to preserve its integrity.

uvbimglog

The number of uvbimglog processes matches the number of before-image log files per log set. This matches the value of the configuration parameter N_BIMG.

uvaimglog

The number of uvaimglog processes matches the number of after-image log files per log set. This matches the value of the configuration parameter N_AIMG.

uvarchive

The uvarchive daemon only exists if you have archiving enabled. This process writes the after-image log sets to the archive files, marks the after-image log files as archived, and informs the user if the archives fill.

uvar_backupd

The uvar_backupd daemon only exists if you turn on automated backup for your archives. If you have automated backup turned on, uv -admin -start starts uvar_backupd, which then runs continuously, invoking your archive backup script whenever an archive file fills.

uvsyncd

If you notice significant performance degradation during a checkpoint, you can start sync daemons by setting the uvconfig parameters N_SYNC and SYNC_TIME. The uvsyncd daemons periodically flush updated pages from the system buffer to the log files, reducing the amount of time it takes to complete a checkpoint.

N_SYNC determines the number of uvsyncd daemons the database starts. SYNC_TIME defines, in seconds, the amount of time the uvsyncd daemons wait before scanning the system buffer for updated pages.

Appendix B: RFS configuration parameters

UniVerse RFS configuration parameters are stored in the uvconfig file, which is located in the UV account directory (referred to as uvhome throughout this guide). Many of the RFS configuration parameters have default settings that are likely to be appropriate for your system; however, you can modify them in the uvconfig file as needed.

Modifying uvconfig parameters

When you start the database, it reads the contents of the uvconfig file. The parameters direct the database whether to start with RFS running, and determine a number of system-wide configuration settings. Every time a user logs in, the user’s process reads the same uvconfig file. You can use a text editor to change the values of uvconfig configuration parameters.

1. Log on to your system as root.
2. Print a copy of the current uvconfig file to obtain a list of the current settings.
3. Make a backup copy of the uvconfig file. If you run into any problems, you can easily revert to this copy.
4. Review the current settings, and determine which parameters should be changed.
5. Edit the uvconfig file using a text editor. Each line must have the format NAME=value, where NAME is the parameter name and value is the value you want to use.
6. Make sure all users have logged off, and stop the database.
7. If you changed the ARCH_FLAG, N_AIMG, N_BIMG, or N_ARCH parameters in uvconfig, run the uvcntl_install command (located in uvhome/bin) to reinitialize archiving and logging. Running uvcntl_install at this point ensures that your logs, archives, and configuration parameters are synchronized. If you did not change these parameters, you do not need to run uvcntl_install.
8. Start the database with the uv -admin -start command with no options. The parameters you modified in uvconfig are now in effect.

For best results, you should plan to implement uvconfig changes after a full backup and after running uvcntl_install. If you are turning the ARCH_FLAG parameter on or off, or changing the number of log or archive files, and you do not back up your system, you may not be able to recover from a failure.
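Step 7 depends on knowing whether any of four parameters changed. The Python sketch below compares the backup copy from step 3 with the edited file and reports whether uvcntl_install is needed. It assumes the NAME=value line format described in step 5 and, as a further assumption, skips blank lines and lines beginning with #; the function names and sample values are hypothetical.

```python
# Parameters that require uvcntl_install when changed (from step 7).
REINIT_PARAMS = {"ARCH_FLAG", "N_AIMG", "N_BIMG", "N_ARCH"}


def parse_uvconfig(text: str) -> dict:
    """Parse NAME=value lines; blank and '#' lines are skipped (assumption)."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def needs_uvcntl_install(old_text: str, new_text: str) -> bool:
    """True when any logging or archiving parameter differs."""
    old, new = parse_uvconfig(old_text), parse_uvconfig(new_text)
    return any(old.get(name) != new.get(name) for name in REINIT_PARAMS)


# Hypothetical file contents for illustration only.
before = "ARCH_FLAG=0\nN_AIMG=2\nN_BIMG=2\nCHKPNT_TIME=300\n"
after = "ARCH_FLAG=1\nN_AIMG=2\nN_BIMG=2\nCHKPNT_TIME=300\n"
print(needs_uvcntl_install(before, after))   # True
```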

Warning: If your system has crashed, do not implement any uvconfig changes until you have completed recovery from the crash. If you start the database with new parameters before recovery is complete, you could prevent recovery for one or more files.
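The backup and edit steps of the procedure above can be scripted. The following Python sketch is illustrative only and is not part of the UniVerse distribution. It assumes the NAME=value line format described in step 5, and it assumes that blank lines and lines beginning with # carry no parameter. The function names and the .bak backup suffix are hypothetical.

```python
import shutil
from pathlib import Path

# Parameters that require uvcntl_install after a change (step 7 above).
REINIT_PARAMS = {"ARCH_FLAG", "N_AIMG", "N_BIMG", "N_ARCH"}


def read_params(path):
    """Return {NAME: value} for every NAME=value line in the file."""
    params = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' lines carry no parameter.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


def set_param(path, name, value):
    """Back up the file, then set NAME=value, appending it if absent."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")  # hypothetical backup name
    shutil.copy2(path, backup)
    out, found = [], False
    for line in path.read_text().splitlines():
        if "=" in line and line.split("=", 1)[0].strip() == name:
            out.append(f"{name}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{name}={value}")
    path.write_text("\n".join(out) + "\n")
    return backup


def needs_uvcntl_install(changed_names):
    """True if any changed parameter requires running uvcntl_install."""
    return bool(REINIT_PARAMS & set(changed_names))
```

A script like this covers only steps 3 and 5. Stopping the database, running uvcntl_install, and restarting remain manual steps.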

UniVerse RFS parameters

The uvconfig parameters that relate to RFS are listed below:

▪ AIMG_BUFSZ
▪ AIMG_FLUSH_BLKS
▪ AIMG_MIN_BLKS
▪ ARCH_FLAG
▪ ARCH_WRITE_SZ
▪ ARCHIVE_BACKUP
▪ BIMG_BUFSZ
▪ BIMG_FLUSH_BLKS
▪ BIMG_MIN_BLKS
▪ CHKPNT_TIME
▪ GLM_MEM_SEGSZ
▪ GRPCMT_TIME
▪ LOG_OVRFLO
▪ N_AFT
▪ N_AFT_SECTION
▪ N_AFT_SECTION_BUCKET
▪ N_AIMG
▪ N_ARCH
▪ N_BIG
▪ N_BIMG
▪ N_PUT
▪ N_SYNC
▪ NSEM_PSET
▪ RFS_DUMP_DIR
▪ RFS_DUMP_HISTORY
▪ RFS_MODE
▪ SB_PAGE_SZ
▪ SESSION_AFT_BUCKET
▪ SESSION_NFILES
▪ SYNC_TIME

AIMG_BUFSZ

The AIMG_BUFSZ parameter specifies the size of the after-image buffer, in bytes. The default is 10,485,760 bytes (10MB). AIMG_BUFSZ cannot exceed the log block size multiplied by the log length. The recommended value is the after-image block size multiplied by 100.

Note: If you are using raw disk for your log files and you need to change the AIMG_BUFSZ parameter, change the value in increments of 4,096 bytes. If your log files are regular UNIX files, make your changes in increments equal to the block size (in bytes) that your file system uses.

Warning: If the block size in your logconfig file exceeds 4096, you must also increase the AIMG_BUFSZ and BIMG_BUFSZ configuration parameters. These parameters must be a multiple of the block size defined in logconfig.
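The sizing rules for AIMG_BUFSZ (and, identically, BIMG_BUFSZ) can be checked before editing uvconfig. The Python sketch below is illustrative only. It assumes that the log length is expressed in blocks, which this appendix does not state explicitly, and the function names are hypothetical.

```python
def recommended_bufsz(block_size):
    """Recommended buffer size: the image block size multiplied by 100."""
    return block_size * 100


def check_bufsz(bufsz, block_size, log_length):
    """Return a list of rule violations for a proposed buffer size.

    block_size: log block size in bytes (from logconfig).
    log_length: log length, assumed here to be a count of blocks.
    """
    problems = []
    if bufsz % block_size != 0:
        problems.append("not a multiple of the log block size")
    if bufsz > block_size * log_length:
        problems.append("exceeds log block size multiplied by log length")
    return problems
```

For example, check_bufsz(10485760, 4096, 2560) returns an empty list, because the default of 10,485,760 bytes is exactly 2,560 blocks of 4,096 bytes.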

73 Appendix B: RFS configuration parameters

AIMG_FLUSH_BLKS

The AIMG_FLUSH_BLKS parameter specifies the number of blocks in the after-image buffer that the database flushes to the after-image log files at one time. The default setting is 10 blocks.

AIMG_MIN_BLKS

The AIMG_MIN_BLKS parameter specifies the minimum number of blocks required in the after-image buffer before the database flushes the blocks to the after-image log. The default setting is 20 blocks. The size of blocks is defined in the log configuration table.

ARCH_FLAG

The ARCH_FLAG parameter turns the archiving system on or off. Any positive integer turns archiving on. The default setting is 0, meaning off.

Note: For the archiving process to function, you must also complete the steps in Archiving, on page 19.

ARCH_WRITE_SZ

The ARCH_WRITE_SZ parameter specifies the size, in bytes, of blocks for the archive process to write from log files to archive files. The default setting is 0, meaning that after-image log files are written to archive files one block at a time. If this parameter is set to a nonzero value, it must be a multiple of the log/archive block size.

Note: Setting this parameter may improve performance. The performance improvement is platform-specific. Because writing log files to archive files uses memory, the larger you set this parameter, the more memory your system will use.

ARCHIVE_BACKUP

The ARCHIVE_BACKUP parameter turns the automatic backup of archive files on or off. If this parameter is set to 1, the database executes the uvarch_backup script located in uvhome/bin. The default setting is 0, meaning off.

Tip: If the ARCHIVE_BACKUP parameter is on and you want to automatically back up your archive files, make sure the uvarch_backup and uvarch_restore scripts are copying the files to the appropriate place and that the scripts are compatible.

Note: If you are using the archiving system and the ARCHIVE_BACKUP parameter is turned off, you will have to off-load the archive files manually when they fill up.

BIMG_BUFSZ

The BIMG_BUFSZ parameter specifies the size of the before-image buffer in bytes. The default setting is 10,485,760 bytes (10MB). BIMG_BUFSZ cannot exceed the log block size multiplied by the log length. The recommended value is the before-image block size multiplied by 100.

Note: If you are using raw disk for your log files and you need to change the BIMG_BUFSZ parameter, change the value in increments of 4,096 bytes. If your log files are regular UNIX files, make your changes in increments equal to the block size (in bytes) of your file system.

Warning: If the block size in your logconfig file exceeds 4096, you must also increase the AIMG_BUFSZ and BIMG_BUFSZ configuration parameters. These parameters must be a multiple of the block size defined in logconfig.

BIMG_FLUSH_BLKS

The BIMG_FLUSH_BLKS parameter specifies the number of blocks in the before-image buffer that are flushed to the before-image log files at one time. The default setting is 10 blocks.

BIMG_MIN_BLKS

The BIMG_MIN_BLKS parameter specifies the minimum number of blocks required in the before-image buffer before the system will flush the blocks to the before-image log. The default setting is 20 blocks. The size of blocks is defined in the log configuration table.

CHKPNT_TIME

The CHKPNT_TIME parameter specifies the checkpoint interval. RFS retains recently requested and recently written file blocks in the system buffer. The database flushes changes in the system buffer to disk each time the checkpoint interval elapses. The database also flushes the buffer if the buffer fills, regardless of timing. The default setting is 300 seconds.

Tip: Check uvhome/uvsm.log frequently when you first turn on logging. If the log files overflow, you will see messages in uvsm.log. In that case, increase the log file size or decrease the checkpoint interval. If you need to change this value, try changing it in 60-second increments. Do not set CHKPNT_TIME to less than 60 seconds.

GLM_MEM_SEGSZ

The GLM_MEM_SEGSZ parameter specifies the size of each shared memory segment required for the lock manager. The default setting is 10485760. The maximum number of segments is 16. Large application environments require a larger segment size. Each UniVerse process registers the lock names that it is locking in its per-process locking table, which is also organized as a hashed table.

GRPCMT_TIME

The GRPCMT_TIME parameter specifies the group commit interval. The database records each update operation in the system buffer and in an after-image log file. The group commit interval keeps system performance from being too greatly affected by reducing the after-image log input time. The default setting is 5.

Tip: The typical range of settings is from 1 to 5 seconds. Setting the GRPCMT_TIME parameter too low may hurt system performance. A setting of zero makes the database write to the after-image log file immediately (constant write), which slows performance. If system performance is poor, increase the setting, keeping in mind that increasing this parameter increases the risk of loss in the event of a system crash.

LOG_OVRFLO

The LOG_OVRFLO parameter specifies a directory (absolute path) for overflow from log files. If you install the database with the RFS option, the database prompts you for the location where you want to put your log files. (You may want to specify a directory on a separate file system for best system performance.) The database then creates a directory called overflow under that log directory, and sets the LOG_OVRFLO parameter accordingly. If you install the database without RFS, the database does not set a value for LOG_OVRFLO and, if you later decide to enable RFS, you must add the parameter to the uvconfig file.

Warning: Using this parameter is recommended. If you do not define LOG_OVRFLO correctly for your environment and your log files overflow unexpectedly, the database brings down your system to protect its integrity. A correlation exists between the size of your transaction and the overflow behavior of log files. Large transactions that exceed the capacity of your log files will cause overflow, but no formula yet exists for predicting such occurrences.

N_AFT

The N_AFT parameter specifies the size of the active file table (AFT), which contains an entry for every unique file and tracks the number of files that are open system-wide. This parameter has a default value of 5000. Dynamic files consume two slots and each file index consumes one slot. The AFT can become quite large when applications have many files open at once. You can configure the size of the table by adjusting the N_AFT_SECTION and N_AFT_SECTION_BUCKET configuration parameters.

N_AFT_SECTION

The N_AFT_SECTION parameter specifies the number of sections in the system-wide active file table (AFT). Dividing the AFT into multiple sections reduces wait times because different uvsh/user processes can search different sections at the same time. It also reduces the number of entries each process needs to search through. The default value of this parameter is 17. You can increase the number of sections using this parameter, but ensure that the new value is greater than 0, less than N_AFT/2, and a prime number.

It is recommended that you increase this number in small increments and review the results using the uvsysmon utility.

N_AFT_SECTION_BUCKET

The N_AFT_SECTION_BUCKET parameter specifies the number of hash buckets in an active file table (AFT). The default value for this parameter is 23. This parameter must be a prime number and greater than the value of the N_AFT_SECTION parameter. The average number of files to be loaded into a hash bucket in the AFT can be calculated using the following formula: (N_AFT/N_AFT_SECTION)/N_AFT_SECTION_BUCKET. If the default uvconfig settings are used, this formula, (5000/17)/23, results in approximately 13 files per hash bucket.
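The formula and the constraints on the AFT parameters can be evaluated together before changing uvconfig. The Python sketch below is illustrative only; the function names are hypothetical, and the checks simply restate the rules given for N_AFT_SECTION and N_AFT_SECTION_BUCKET in this appendix.

```python
def is_prime(n):
    """Trial-division primality test; adequate for small config values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def files_per_bucket(n_aft, n_section, n_bucket):
    """Average files per hash bucket: (N_AFT/N_AFT_SECTION)/N_AFT_SECTION_BUCKET."""
    return (n_aft / n_section) / n_bucket


def check_aft(n_aft, n_section, n_bucket):
    """Return a list of rule violations for the three AFT parameters."""
    problems = []
    if not (0 < n_section < n_aft / 2):
        problems.append("N_AFT_SECTION must be > 0 and < N_AFT/2")
    if not is_prime(n_section):
        problems.append("N_AFT_SECTION must be prime")
    if not is_prime(n_bucket):
        problems.append("N_AFT_SECTION_BUCKET must be prime")
    if n_bucket <= n_section:
        problems.append("N_AFT_SECTION_BUCKET must exceed N_AFT_SECTION")
    return problems
```

With the defaults, files_per_bucket(5000, 17, 23) evaluates to about 12.8, which matches the figure of approximately 13 files per hash bucket, and check_aft(5000, 17, 23) reports no violations.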

N_AIMG

The N_AIMG parameter specifies the number of after-image log files in each group of such files described in the log configuration table. The default setting is 2.

Tip: When you change this parameter, you must change the log configuration table and run uvcntl_install. You can use the uvsysmon utility to determine when to change N_AIMG. If the wait rate (WaitRate) is higher than 5 percent or the polling rate (PollRate) is higher than 2 percent, consider increasing the value of N_AIMG.

For information on the log configuration table, see Logging, on page 11.
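The uvsysmon thresholds in the tip above reduce to a simple test. The Python sketch below is illustrative only. It assumes you have already read the WaitRate and PollRate percentages from uvsysmon output by hand; it does not parse that output, and the function name is hypothetical. The same thresholds are given for N_BIMG.

```python
def should_consider_more_log_files(wait_rate_pct, poll_rate_pct):
    """True if WaitRate is above 5 percent or PollRate is above 2 percent."""
    return wait_rate_pct > 5.0 or poll_rate_pct > 2.0
```

A result of True only suggests that an increase is worth considering. Changing N_AIMG still requires updating the log configuration table and running uvcntl_install.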

N_ARCH

The N_ARCH parameter specifies the number of archive files defined in the archive configuration table. If you change this parameter in the uvconfig file, you must run uvcntl_install. The default setting is 2.

Tip: When you change this parameter, you must change the archive configuration table.

For information on changing the archive configuration table, see Archiving, on page 19.

N_BIG

The N_BIG parameter specifies the number of block index groups (BIGs). A block index group acts as an index to the pages in the system buffer, defined by N_PUT. If N_BIG is too large, the number of semaphore operations will increase since each BIG has a semaphore control, which may increase page swapping. If N_BIG is too small, there will be a lot of contention between different processes, which will negatively impact system performance. The default is 233.

Tip: The value of N_BIG must be smaller than N_PUT. The optimum value for N_BIG is highly application-dependent. As a starting point, you may set N_BIG to the prime number nearest NUSERS * 5. N_BIG must always be a prime number.
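The starting point described in the tip above can be computed as follows. The Python sketch is illustrative only; the function names are hypothetical. When two primes are equally close to NUSERS * 5, this sketch picks the lower one, which is an assumption and not a rule from this guide. It also steps down to a smaller prime if the candidate is not smaller than N_PUT.

```python
def is_prime(n):
    """Trial-division primality test; adequate for small config values."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def nearest_prime(target):
    """Prime closest to target; the lower prime wins a tie (assumption)."""
    for distance in range(0, target + 1):
        for candidate in (target - distance, target + distance):
            if is_prime(candidate):
                return candidate
    return 2  # reached only when target is below 2


def suggest_n_big(nusers, n_put):
    """Starting value for N_BIG: prime nearest NUSERS * 5, below N_PUT."""
    if n_put <= 2:
        raise ValueError("N_PUT is too small to allow a prime N_BIG below it")
    candidate = nearest_prime(nusers * 5)
    # Enforce N_BIG < N_PUT while keeping the value prime.
    while candidate >= n_put or not is_prime(candidate):
        candidate -= 1
    return candidate
```

Treat the result as a first guess only, because the optimum value for N_BIG is highly application-dependent.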

N_BIMG

The N_BIMG parameter specifies the number of before-image log files in each group of such files described in the log configuration table. The default setting is 2.

Tip: When you change this parameter, you must change the log configuration file and run uvcntl_install. You can use the uvsysmon utility to determine when to change N_BIMG. If the uvsysmon wait rate (WaitRate) is higher than 5 percent or the polling rate (PollRate) is higher than 2 percent, consider increasing the value of N_BIMG.

For information on the log configuration file, see Logging, on page 11.

N_PUT

The N_PUT parameter specifies the system buffer size in pages. Each page is 1024 bytes. If the system buffer size is too small, many files may be swapped in and out of the buffer. Increasing the system buffer size may improve performance; the optimum setting is highly application-dependent. The default setting is 8192 pages. When accessing files, the database first looks for each data file block in the system buffer. If the block is not there, the database reads the block from disk, and then puts it into the system buffer, swapping other pages to disk if the system buffer is full.

Tip: If you change N_PUT to a larger number, you should also increase N_BIG.

N_SYNC

The N_SYNC parameter specifies the number of sync daemons that you want running on your system. If you notice significant performance degradation during a checkpoint, you can increase the number of sync daemons. Sync daemons periodically flush updated pages from the system buffer to the log files, reducing the amount of time it takes to complete a checkpoint.

NSEM_PSET

The NSEM_PSET parameter specifies the number of semaphores per semaphore set. You should not need to change this parameter. The default setting is 8.

RFS_DUMP_DIR

The RFS_DUMP_DIR parameter specifies where the database stores the rfs.dump file when the uvs_stat -s command is executed. On UNIX, the default value is an empty string, in which case the database stores the rfs.dump file in the uvhome directory. If the database determines at startup that the defined path is invalid, it writes the rfs.dump file to the uvhome directory and prints a message to the uvsm.log file.

RFS_DUMP_HISTORY

The RFS_DUMP_HISTORY parameter specifies how many rfs.dump files to preserve when you execute the uvs_stat -s command. The default value of this parameter is 1. With this value, the database creates a single rfs.dump file in the location that you specify with the RFS_DUMP_DIR parameter.

If this value is set to a positive integer, for example 4, the database names the rfs.dump files rfs.dump1, rfs.dump2, rfs.dump3, and rfs.dump4. The uvs_stat -s command uses the first available rfs.dump file. If all the rfs.dump files are full, the uvs_stat -s command reuses the oldest rfs.dump file. If this value is set to 0, the database preserves all rfs.dump files and names them rfs.dump1, rfs.dump2, and so forth.

RFS_MODE

The RFS_MODE parameter specifies the mode in which RFS will run. When set to 0, which is the default setting, RFS runs in writethrough mode. When set to 1, RFS runs in full-protection mode.

SB_PAGE_SZ

The SB_PAGE_SZ parameter specifies the size of a page in the system buffer as a multiple of 1024 bytes. The default value is 1, meaning that each page in the system buffer is 1024 bytes. This value must be a power of 2 (1, 2, 4, 8, 16, and so on).

Note: As of UniVerse 12.1, shared memory segments are being used. To size the system buffer memory segment appropriately, you must multiply the SB_PAGE_SZ value by the N_PUT value.

For more information about the shared memory segment size, see Calculating the system buffer's shared memory segment size, on page 57.
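The multiplication described in the note above can be sketched in Python. This is illustrative only: the function name is hypothetical, and the result covers just the raw page area (SB_PAGE_SZ * 1024 * N_PUT bytes). Any additional overhead in the actual shared memory segment is not modeled here.

```python
def system_buffer_bytes(sb_page_sz, n_put):
    """Raw system buffer page area in bytes for the given settings."""
    # SB_PAGE_SZ must be a power of 2: exactly one bit set.
    if sb_page_sz < 1 or sb_page_sz & (sb_page_sz - 1):
        raise ValueError("SB_PAGE_SZ must be a power of 2")
    if n_put < 1:
        raise ValueError("N_PUT must be a positive number of pages")
    return sb_page_sz * 1024 * n_put
```

With the defaults (SB_PAGE_SZ of 1 and N_PUT of 8192), the page area is 8,388,608 bytes. Raising SB_PAGE_SZ to 4 with the same N_PUT gives 33,554,432 bytes.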

SESSION_AFT_BUCKET

The SESSION_AFT_BUCKET parameter specifies the number of hash buckets an active file table (AFT) is divided into for each UniVerse session. These buckets can be used to improve AFT table searching when a UniVerse session has a file opened in the system buffer. You can change the number of hash buckets per session by increasing the value of this parameter. This number must be greater than 0 and should be a prime number.

SESSION_NFILES

This parameter specifies the number of logical files the application can open per session. When this limit is reached, attempts to open additional files will fail. The default value is based on a multiple of the current MFILES setting.

You can change the number of logical files the application can open per session by increasing the value of this parameter.

SYNC_TIME

The SYNC_TIME parameter specifies the number of seconds that the uvsyncd daemons wait before scanning the system buffer for updated pages.
