
Practical Solutions for Pagination, Indention, and Mixed-Text Alignment in ODS RTF Word

Hui Song, PRA International, Horsham, PA

ABSTRACT

The ODS rich-text format (RTF) destination, together with PROC REPORT, makes it possible to produce high-quality, informative statistical reports that conform to the FDA CSR submission guidance. However, PROC REPORT lacks features that customized reports need, most notably control over pagination. Text wrapping within RTF table (or listing) cells makes pagination hard to get right. Sometimes, for ease of reading, a category label ending in ‘Continued’ must be repeated at the top of the following pages when the category spreads across multiple pages, which makes pagination harder still. In addition, because SAS® no longer has full control of the RTF layout when the ODS RTF destination is used, indenting and aligning mixed text (such as xx.xxx (xx.x %)) also becomes more challenging. This paper presents several practical solutions to these issues for producing good-quality reports in RTF format.

INTRODUCTION

In addition to conforming to the FDA guidance, the pharmaceutical community strives to produce high-quality, informative statistical reports for the FDA CSR submission package. The ODS rich-text format (RTF) destination makes this effort more practical. With the rich set of RTF formatting commands, together with PROC REPORT, clinical programmers can produce good-quality tables and listings. Depending on the clinical study, the statistical reports are often highly customized to the purpose of the statistical analysis.

A report needs to be paginated so that it is easy to read and understand. Sometimes we need to add extra lines of information to make the report easy to follow. For example, if the items below a category (such as preferred terms under a body organ class) spread across multiple pages, we should repeat the category (ending with ‘Continued’) on the following pages so that reviewers can follow it. At other times, a long text value (e.g., a comment) wraps within a table cell in RTF output and expands to multiple lines. Since PROC REPORT itself does not provide a direct pagination mechanism, these customization requirements make clean pagination even harder to achieve.

Furthermore, with the ODS RTF destination, SAS no longer has full control of the layout of the output. Text indention and mixed-text alignment (such as xx.xxx (xx.x %)) can become very time consuming when producing tables, figures, and listings (TFLs) as RTF Word documents. In this paper, we provide several practical solutions (including useful SAS tricks and RTF formatting commands) that address these issues and produce good-quality reports in RTF format. Sample code is provided that clinical programmers can easily revise and use in daily TFL production work.

BASICS FOR ODS RTF STYLE AND COMMANDS

A good, concise introduction to the ODS RTF destination can be found in the first reference (SAS, The RTF Destination). Here we give only a short summary. The RTF specification is a method of encoding formatted text and graphics. An RTF file consists of unformatted text, control words, control symbols, and groups.

A control word is a specially formatted command that RTF uses to mark control codes and information that the Word application uses to manage documents. A control word takes the form \LetterSequence: a backslash begins each control word, and the LetterSequence is made up of lowercase alphabetic characters (a-z). RTF is case sensitive. Some control words (such as bold and italic) have only two states. A control word with a parameter of 0 turns the property off. For example, \b turns on bold, whereas \b0 turns off bold. Table 1 lists some frequently used control words.

Table 1: RTF Control Words

Style            RTF Control Word
Italicize        \i
Underline        \ul
New line         \line
Subscript        \sub
Superscript      \super
Bold             \b
Left aligned     \ql
Right aligned    \qr
Centered         \qc

A control symbol consists of a backslash followed by a single, nonalphabetic character. For example, \~ represents a nonbreaking space. Control symbols take no delimiters. A group consists of text and control words or control symbols enclosed in braces ({}). Each group specifies the text affected by the group and the different attributes of that text. The RTF file can include groups for fonts, styles, color, pictures, footnotes, comments, headers and footers, as well as document-, section-, paragraph-, and character-formatting properties. Formatting specified within a group affects only the text within that group.
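The following is a minimal sketch of how control words, control symbols, and groups behave when raw RTF is placed in a data value and passed through to the RTF file. The dataset name rtf_demo, the output file name, and the sample text are made up for illustration; PROTECTSPECIALCHARS=OFF is the same style attribute used later in Table 7.

```sas
/* Hypothetical demo data: each value carries raw RTF markup */
data rtf_demo;
  length txt $200;
  /* braces form a group, so \b and \i apply only inside the group */
  txt = "Plain text, {\b bold text}, {\i italic text}, and plain again";
  output;
  /* \b ... \b0 switches a two-state property on and off without a group */
  txt = "Normal \b bold\b0  normal";
  output;
  /* \~ is the control symbol for a nonbreaking space */
  txt = "12.5\~mg";
  output;
run;

ods rtf file="rtf_demo.rtf";
proc report data=rtf_demo nowd;
  column txt;
  /* protectspecialchars=off lets the backslashes and braces reach Word unescaped */
  define txt / display "RTF control words in a cell"
               style(column)={protectspecialchars=off};
run;
ods rtf close;
```

Without PROTECTSPECIALCHARS=OFF, the backslashes and braces would be expected to print literally in the cell instead of being interpreted by the RTF reader.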

Building on the RTF specification, PROC TEMPLATE is used to create global styles for titles, footnotes, colors, fonts, margins, etc. In addition, one can embed RTF commands in SAS code to format the text by controlling the style, color, font, and so forth. This is often referred to as in-line formatting. A good paper on this topic is (Parsons 2007). To use in-line formatting, you first define an escape character with the ODS statement: ODS ESCAPECHAR='escape-character';. The caret (^) is a safe choice and is used in Table 2, which is excerpted from (Parsons 2007) and listed here for quick reference.

Table 2: In-Line Formatting Code

SAS Option               Function
^S={style-attributes}    Modifies the style of the current paragraph or cell. The syntax is
                         just like that of the style={} syntax in table templates and PROCs
                         REPORT and TABULATE, except that no style element is supported to
                         the left of the {}s.
^S={}                    A specialized case of the above; reverts the style to the style of
                         the paragraph.
^{super text}            Puts text into a superscript.
^{sub text}              Puts text into a subscript.
^{dagger}                Prints the "dagger" character.
^n                       Inserts a new line.
^w                       Marks a good place for an optional line break. Doesn't force a new
                         line, but "suggests" it as a good place if the line must be wrapped.
^_                       Inserts a non-breaking space.
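As a short illustration of the codes in Table 2, the sketch below uses them in titles, a footnote, and column headers. The title and footnote wording and the output file name are invented for the example; SASHELP.CLASS is used only because it ships with SAS.

```sas
ods escapechar='^';
ods rtf file="inline_demo.rtf";

/* ^{super a} attaches a superscript footnote marker to the title */
title1 "Summary of Adverse Events^{super a}";
/* ^S={...} changes the style; ^S={} reverts to the paragraph style */
title2 "^S={font_style=italic}Safety Population^S={}";
/* ^n forces a line break inside a single footnote */
footnote1 j=l "^{super a} Treatment-emergent events only.^nPercentages are based on N.";

proc report data=sashelp.class(obs=5) nowd;
  column name age;
  define name / display "Subject^nName";   /* two-line header */
  define age  / display "Age^_(years)";    /* non-breaking space keeps the unit attached */
run;

ods rtf close;
title; footnote;
```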

In the literature, there are several other good papers presenting tips and tricks for using RTF commands to control the format of the text in the output, such as (Shannon 2002) and (Rachabattula 2010), to name just two. In this paper, we focus on the areas where RTF formatting is more challenging because SAS loses part of the layout control of the output, including pagination, indention, and mixed-text alignment.

PAGINATION IN PROC REPORT

With the ODS RTF destination and PROC TEMPLATE, PROC REPORT is used to produce the tables, figures, and listings (TFLs). Pagination is done automatically by the RTF application (such as MS Word) when it renders and displays an RTF document. The application decides when and where a new page will begin. In this sense, SAS has lost part of the layout control of the RTF output. In fact, PROC REPORT itself has no direct option that provides correct pagination for customized reports within ODS. More specifically, when creating TFLs via ODS, the number of lines per page can be determined in advance. However, because of text wrapping, there is no direct option to calculate the number of lines each observation needs. Text wrapping can cause elements that should be displayed together on the same page to split over two or more pages. Moreover, we sometimes need to force a new page when a new group of information starts (e.g., a new AEBODSYS or USUBJID). Thus, some manual pagination control is needed to determine where the page breaks occur.

In this section, we present a simple technique for manually controlling the pagination in RTF output. The sample code can be easily revised and enhanced to fit different TFL presentation requirements. The basic idea is as follows (illustrated in Table 3). First, count the number of rows that each text-wrapping variable requires per observation, given the respective cell widths. Second, take the maximum number of rows needed by the observation and add it to the running line count (Step 1). Third, account for variables that add extra lines to the table, such as skip variables in PROC REPORT (Step 2). Fourth, compare the running line count with the page size to decide where the page breaks should fall (Step 3). Finally, create a pagination variable for PROC REPORT (Step 4). Additional explanation can be found in the comments in the code.

Note that, to save space, we sometimes put more than one statement on the same row. This practice is not encouraged, though.

Table 3: Pagination by Counting Lines Manually

data outDS;
  set inDS;
  by studyid usubjid SKIPVAR other_sort_variables;
  * adjust MAXN for the number of rows used by titles and footnotes;
  * WID1 and WID2 below are placeholder widths (characters per line);
  * check VAR1 and VAR2 in the output to find the real widths;
  retain LINECNT 0 MAXN 30 WID1 40 WID2 25;

  ** Step 1: some variables (e.g., VAR1 and VAR2) may span multiple rows, count here **;
  len1 = 1; len2 = 1;
  if VAR1 ne ' ' then len1 = ceil(length(VAR1) / WID1);
  if VAR2 ne ' ' then len2 = ceil(length(VAR2) / WID2);
  ROWLINES = max(len1, len2, 1);   *<--- more text-wrapping variables can be added here;

  ** Step 2: skipped lines counted here (e.g., SKIPVAR is a skip variable in PROC REPORT) **;
  if first.SKIPVAR then ROWLINES = ROWLINES + 1;

  ** Steps 3 and 4: break the page for special paging needs or when the row would not fit **;
  if first.usubjid or LINECNT + ROWLINES > MAXN then do;
    PAGEIT + 1; LINECNT = 0;
  end;
  * count the current row on the page where it is printed;
  LINECNT + ROWLINES;
run;
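As a quick check of the Step 1 arithmetic, the sketch below computes the number of wrapped lines for one value. The 95-character text and the width of 40 characters per line are assumptions chosen only for illustration.

```sas
data _null_;
  length var1 $200;
  var1 = repeat('x', 94);                 /* REPEAT returns n+1 copies: 95 characters */
  wid1 = 40;                              /* assumed characters per line in the cell  */
  len1 = ceil(length(var1) / wid1);       /* 95 / 40 = 2.375, rounded up to 3 lines   */
  put len1=;
run;
```

The log shows len1=3, so this observation would add three lines to LINECNT. Because Word wraps at word boundaries and most fonts are proportional, a character-based width is only an approximation, and it is safer to choose WID1 slightly smaller than the measured value.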


Within PROC REPORT, PAGEIT is defined as an ORDER or GROUP variable and drives the page breaks. It is named in a BREAK statement as shown below:

break after pageit / page;

The sample code in Table 3 can be easily adapted or enhanced for different table presentation requirements or rewritten into a macro.
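To show where the BREAK statement sits, here is a minimal sketch of a complete PROC REPORT step that consumes the PAGEIT variable from Table 3. The column labels, cell widths, and output file name are assumptions; NOPRINT is used so that the pagination variable controls the breaks without appearing as a column.

```sas
ods rtf file="listing_demo.rtf";

proc report data=outDS nowd split='#';
  column pageit usubjid var1 var2;
  /* pagination variable: orders the rows but is not printed */
  define pageit  / order noprint;
  define usubjid / order   "Subject";
  /* the cell widths should match the widths assumed for WID1 and WID2 */
  define var1    / display "Variable 1" style(column)={cellwidth=6cm};
  define var2    / display "Variable 2" style(column)={cellwidth=4cm};
  /* start a new page each time PAGEIT changes */
  break after pageit / page;
run;

ods rtf close;
```

PAGEIT is listed first in the COLUMN statement so that it is the outermost ordering variable and the page groups stay in the order computed in the data step.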

ADDING CONTINUED CATEGORIES FOR READABILITY

In this section, we present a scenario in which continued categories are added to the following pages for readability when a category spreads across pages. An example is the table shell in Table 4, which lists the relevant preferred terms within a standardized MedDRA query (SMQ), e.g., Hemorrhages.

Table 4: Sample Table Shell that Requires Adding Continued Categories for Readability

                                          Cohort 1     Cohort 2     Total
SMQ/Sub-SMQ                               (N = XXX)    (N = XXX)    (N = XXX)
   Preferred Term                         n (%)        n (%)        n (%)
C0                                        C1           C2           C3

(end of page X)
…
Sub-SMQ 1                                 xx (xx.x)    xx (xx.x)    xx (xx.x)
  Sub-SMQ 2                               xx (xx.x)    xx (xx.x)    xx (xx.x)
    Sub-SMQ 3                             xx (xx.x)    xx (xx.x)    xx (xx.x)
      Preferred Term 1 (row=1)            xx (xx.x)    xx (xx.x)    xx (xx.x)
      Preferred Term 2 (row=2)            xx (xx.x)    xx (xx.x)    xx (xx.x)
      Preferred Term 3 (row=3)            xx (xx.x)    xx (xx.x)    xx (xx.x)
      Preferred Term 4 (row=4)            xx (xx.x)    xx (xx.x)    xx (xx.x)

(start of page X+1 without continued)
      Preferred Term 5 (row=5)            xx (xx.x)    xx (xx.x)    xx (xx.x)
      Preferred Term 6 (row=6)            xx (xx.x)    xx (xx.x)    xx (xx.x)

(expected start of page X+1 with continued)
Sub-SMQ 1 (continued) (row=0.1)
  Sub-SMQ 2 (continued) (row=0.2)
    Sub-SMQ 3 (continued) (row=0.3)
      Preferred Term 5 (row=5)            xx (xx.x)    xx (xx.x)    xx (xx.x)
      Preferred Term 6 (row=6)            xx (xx.x)    xx (xx.x)    xx (xx.x)

The issue here is that a preferred term sits under three levels of Sub-SMQs. In the example scenario in Table 4, there are six preferred terms, but only four of them fit at the end of page X. The remaining two spill over to the next page, X+1. If we do not carry over the three levels of Sub-SMQs, it is hard for the reviewer to see quickly which Sub-SMQs the terms belong to. What we want is shown in the third section of Table 4, where the three levels of Sub-SMQs are carried over to page X+1, each with ‘(continued)’ appended. This greatly improves the readability of the table. To achieve this effect, we need to prepare the dataset by adding the continued categories to those pages whose first observation belongs to a Sub-SMQ group that started on a previous page. The tricky part is that we do not always need to carry over all three Sub-SMQs. For example, a page may start with a new Sub-SMQ 2. In that case, a Sub-SMQ 3 (continued) row should not be added. Similarly, if a page starts with a new Sub-SMQ 3 under Sub-SMQs that began earlier, only the higher levels need to be carried over, and if the page starts with a new Sub-SMQ 1, nothing needs to be carried over.

For the discussion of Table 5 below, we assume that the variables used for the four columns in Table 4 are C0, C1, C2, and C3, respectively, and that the input dataset has a page variable derived using the strategy in the previous section. Our idea is as follows. We check the first record of each new page to see whether it is a Sub-SMQ record or a preferred term. In the latter case, we add the Sub-SMQ categories as needed. In the example code in Table 5, we assume that the three levels of Sub-SMQ information are stored in the variables aesmq1, aesmq2, and aesmq3, respectively. You may also notice that, for indention, we use ‘^w’ instead of blank spaces, which do not work in RTF.

In Table 5, the continued-category rows are generated in the first data step (dataset cont). In the second data step, they are merged back into the original dataset, which must already be sorted by page, group, and row. Note that the added continued-category rows have smaller row numbers (as illustrated in Table 4), so they appear first at the top of the page in the RTF output. The technique described here can be adapted to any number of levels.

Because continued-category rows may be added to a page, we should reserve enough rows beforehand so that adding them does not push the affected page over onto a second page. For example, if a page can hold 30 lines, we should output at most 27 rows per page (in dataset table in Table 5) before this process.

Table 5: Adding Continued Categories When Needed

data cont;
  set table;
  by page group row;
  length c0_text $200;
  * C0 without its leading ^w indention codes. TRANWRD is used because;
  * COMPRESS(C0,"^w") would also delete every letter w inside the term;
  c0_text = strip(tranwrd(C0, "^w", " "));
  if first.page and page ne 1 and
     (c0_text ne aesmq1 or c0_text ne aesmq2 or c0_text ne aesmq3) then do;
    c1 = ""; c2 = ""; c3 = "";
    if not missing(aesmq1) and aesmq1 ne c0_text then do;
      row = 0.1; C0 = strip(aesmq1) || " (continued)"; output;
    end;
    if not missing(aesmq2) and aesmq2 ne c0_text then do;
      row = 0.2; C0 = "^w^w" || strip(aesmq2) || " (continued)"; output;
    end;
    if not missing(aesmq3) and aesmq3 ne c0_text then do;
      row = 0.3; C0 = "^w^w^w^w" || strip(aesmq3) || " (continued)"; output;
    end;
  end;
  drop c0_text;
run;

data table1;
  merge table cont;
  by page group row;
run;
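Since the technique can be adapted to any number of levels, one possible generalization is to loop over an array of the Sub-SMQ variables. This is only a sketch under the same assumptions as Table 5 (dataset table sorted by page, group, and row; label column C0; two ^w codes of indention per level); the helper variables _label, _indent, and _lvl are hypothetical names introduced here.

```sas
data cont;
  set table;
  by page group row;
  array smq{*} aesmq1-aesmq3;            /* extend this list for more levels */
  length _label _indent $200;
  if first.page and page ne 1 then do;
    /* label of the first row on the page, without leading ^w codes */
    _label = strip(tranwrd(C0, "^w", " "));
    c1 = ""; c2 = ""; c3 = "";
    _indent = "";
    do _lvl = 1 to dim(smq);
      /* carry a level over only if it exists and is not the row itself */
      if not missing(smq{_lvl}) and smq{_lvl} ne _label then do;
        row = _lvl / 10;                  /* 0.1, 0.2, ... sorts before row 1 */
        C0  = catx(" ", cats(_indent, smq{_lvl}), "(continued)");
        output;
      end;
      _indent = cats(_indent, "^w^w");    /* two more ^w codes per level */
    end;
  end;
  drop _label _indent _lvl;
run;
```

With more than nine levels, row = _lvl / 10 would reach 1 and collide with real rows, so a smaller step (for example _lvl / 100) would be needed in that case.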

INDENTION AND MIXED-TEXT ALIGNMENT

Since the ODS RTF destination takes over some of the layout control, formatting done in SAS (such as indention by blank spaces) and mixed-text alignment are no longer effective in RTF output. For example, if you use leading blank spaces to represent indention, the blanks do not show up in the RTF output, and the indention is lost. One way around this is to use the in-line formatting code ‘^w’ (each acts as one blank space) and add as many as needed to get the intended indention.
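A minimal sketch of this approach is shown below: it prefixes each label with two ^w codes per nesting level, following the convention used in Table 5. The dataset indent_demo, the variable level, the sample labels (taken from the shell in Table 4), and the output file name are assumptions made for the example.

```sas
ods escapechar='^';

data indent_demo;
  length level 8 term $60 c0 $200;
  infile datalines dlm='|';
  input level term $;
  /* level 1 is flush left; each deeper level gets two more ^w codes */
  if level = 1 then c0 = strip(term);
  else c0 = repeat("^w^w", level - 2) || strip(term);   /* REPEAT returns n+1 copies */
  datalines;
1|Sub-SMQ 1
2|Sub-SMQ 2
3|Sub-SMQ 3
4|Preferred Term 1
;
run;

ods rtf file="indent_demo.rtf";
proc report data=indent_demo nowd;
  column c0;
  define c0 / display "SMQ/Sub-SMQ" style(column)={just=l};
run;
ods rtf close;
```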

In the following, we mainly discuss how to use RTF commands to achieve these goals and produce high-quality reports with clean alignment and easy-to-read indention. We first need to concatenate the count and percentage in a data step (Table 6). To achieve good alignment, as one can see below, we add one, two, or three spaces between the count and the percentage, depending on whether the integer part of the percentage has three, two, or one digits, respectively.

Table 6: Prepare Mixed-Text in a Data Step

* FREQS is assumed to have been given a sufficient length earlier, e.g. length freqs $20;
if count=0 then freqs=put(count, 3.);          *<-- if count is zero;
else if row=0 then freqs=put(count, 3.);       *<-- week-row statistic: n;
else do;                                       *<-- concatenate count and percentage to get statistic: n (%);
  * 100 percent, three digits: one space;
  if count=tcount then freqs=put(count,3.)||" ("||strip(put((count/tcount)*100,5.))||")";
  * two digits before the decimal point: two spaces;
  else if count/tcount*100>=10 then freqs=put(count,3.)||"  ("||strip(put((count/tcount)*100,5.1))||")";
  * one digit before the decimal point: three spaces;
  else if count/tcount*100<10 then freqs=put(count,3.)||"   ("||strip(put((count/tcount)*100,5.1))||")";
end;
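To make the spacing rule concrete, the sketch below runs the same logic on a few hand-made records and writes each result to the log between brackets, so that the leading and embedded blanks are visible. The dataset name freq_demo and the input values are illustrative only, and pct is a helper variable introduced here.

```sas
data freq_demo;
  length freqs $20;
  input row count tcount;
  if count = 0 then freqs = put(count, 3.);
  else if row = 0 then freqs = put(count, 3.);
  else do;
    pct = count / tcount * 100;
    /* one, two, or three blanks, depending on the width of the percentage */
    if count = tcount then freqs = put(count, 3.) || " ("   || strip(put(pct, 5.))  || ")";
    else if pct >= 10 then freqs = put(count, 3.) || "  ("  || strip(put(pct, 5.1)) || ")";
    else                   freqs = put(count, 3.) || "   (" || strip(put(pct, 5.1)) || ")";
  end;
  /* $CHAR20. preserves the leading blanks of FREQS in the log */
  put "[" freqs $char20. "]";
  datalines;
0 12 12
1 4 12
1 1 19
1 12 12
;
run;
```

For these four records, FREQS holds " 12", "  4  (33.3)", "  1   (5.3)", and " 12 (100)", so the closing parenthesis ends in the same character position in every n (%) value.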

Second, in PROC REPORT, we use RTF commands to add the necessary leading space and to specify the alignment, as illustrated in Table 7. You can modify the decimal alignment by placing RTF code in the PRETEXT= attribute. The code below forces decimal alignment on unruly data by specifying a style override on the column. RTF syntax makes heavy use of the backslash (\), so the PROTECTSPECIALCHARS=OFF instruction is necessary to pass the backslash through correctly. PRETEXT places the RTF code in each cell of the column. The control word \tqdec tells the RTF reader to set a decimal tab, and \tx550 places that decimal tab 550 twips from the left cell border. (A twip is 1/20th of a point and is used in most RTF specifications.) Depending on the shape of the data and the width of the column, you may need to adjust the decimal offset from 550.

Table 7: Additional Formatting and Specify Alignment in PROC REPORT

define trt1 / display
       style(column)={just=l vjust=b cellwidth=2.9cm
                      protectspecialchars=off pretext="\tqdec\tx550 "}
       style(header)={just=c} "Drug#(N=&POP1.)";
define trt2 / display
       style(column)={just=l vjust=b cellwidth=2.9cm
                      protectspecialchars=off pretext="\tqdec\tx550 "}
       style(header)={just=c} "Placebo#(N=&POP2.)";
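For context, the DEFINE statements in Table 7 could sit in a complete step such as the sketch below. The input dataset final, its variables visit, response, and stat, and the output file name are assumptions; the population counts are taken from Table 8, and '#' is declared as the split character so that the headers break before "(N=...)".

```sas
/* Assumed population counts (from the header of Table 8) */
%let POP1 = 18;
%let POP2 = 19;

ods rtf file="table8_demo.rtf";

proc report data=final nowd split='#';
  column visit response stat trt1 trt2;
  define visit    / order   "Visit";
  define response / display "Response";
  define stat     / display "Statistic";
  /* decimal tab 550 twips from the left cell border, as in Table 7 */
  define trt1 / display
         style(column)={just=l vjust=b cellwidth=2.9cm
                        protectspecialchars=off pretext="\tqdec\tx550 "}
         style(header)={just=c} "Drug#(N=&POP1.)";
  define trt2 / display
         style(column)={just=l vjust=b cellwidth=2.9cm
                        protectspecialchars=off pretext="\tqdec\tx550 "}
         style(header)={just=c} "Placebo#(N=&POP2.)";
run;

ods rtf close;
```

Since a twip is 1/20 of a point and a point is 1/72 of an inch, 550 twips is 27.5 points, or roughly 0.97 cm, which places the decimal tab about one third of the way across the 2.9 cm cell.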

An example output can be seen in Table 8.

CONCLUSIONS

The ODS RTF destination, together with PROC REPORT, can be used in SAS to generate professional-looking, well-formatted statistical reports. However, SAS still lacks some features that customized reports need. In this paper, we presented practical solutions that help improve pagination and provide better mixed-text alignment in the RTF document. We also presented a detailed example of how to add continued categories at the top of pages to enhance the readability of a summary table. The work was done using SAS 9.1.3, but the proposed methods can be applied in other SAS versions. The intended audience is junior or senior SAS programmers who are interested in writing robust, efficient, and effective SAS programs to generate good-quality RTF output.

Table 8: Sample Output

Subscale - Relationship
  Visit                                          Drug          Placebo
    Response                     Statistic       (N=18)        (N=19)

CANADA - Child
  Week 1                         n               12            13
    Responder                    n (%)            4 (33.3)      7 (53.8)
    Non-Responder                n (%)            8 (66.7)      6 (46.2)

  Week 6 / Early Termination     n               14            16
    Responder                    n (%)            5 (35.7)      7 (43.8)
    Non-Responder                n (%)            9 (64.3)      9 (56.3)

USA - Parent
  Week 1                         n               12            13
    Responder                    n (%)            5 (41.7)      8 (61.5)
    Non-Responder                n (%)            7 (58.3)      5 (38.5)

  Week 6 / Early Termination     n               14            17
    Responder                    n (%)            4 (28.6)      7 (41.2)
    Non-Responder                n (%)           10 (71.4)     10 (58.8)

REFERENCES

SAS. “SAS Notes and Concepts for ODS: The RTF Destination.” Accessed July 6, 2013. http://support.sas.com/rnd/base/ods/templateFAQ/Template_rtf.

Shannon, David. 2002. “To ODS RTF and Beyond.” Paper presented at the SAS Users Group International 27 (SUGI 27), Orlando, Florida, April 14-17. http://www2.sas.com/proceedings/sugi27/p001-27.pdf

Parsons, Lori. 2007. “Enhancing RTF Output with RTF Control Words and In-Line Formatting.” Poster presented at the SAS Global Forum 2007, Orlando, Florida, April 16-19. www.nesug.org/Proceedings/nesug10/po/po40.pdf

Rachabattula, Sriharsha. 2010. “Using RTF codes in ODS RTF outputs.” Paper presented at the NorthEast SAS User Group (NESUG 2010), Baltimore, MD, September 11-14. www.nesug.org/Proceedings/nesug10/po/po40.pdf

ACKNOWLEDGMENTS

I would like to acknowledge the programming team at PRA for providing some of the code presented in this paper.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Hui Song
PRA International Inc.
630 Dresher Road
Horsham, PA 19044
Work Phone: 215-444-8583
Email: [email protected]
