1 Overview

1.1 Background

In clinical trials, a critical step is to submit trial results to regulatory agencies. Electronic Common Technical Document (eCTD) has become a worldwide regulatory submission standard format. For example, the United States Food and Drug Administration (US FDA) requires new drug applications and biologics license applications must be submitted using the eCTD format. The Clinical Data Interchange Standards Consortium (CDISC) provides a pilot project following ICH E3 guidance.

Within eCTD, clinical study reports (CSRs) are located at module 5. ICH E3 guidance provides a compilation of the structure and content of clinical study reports.

A typical CSR contains full details on the methods and results of an individual clinical study. In support of the statistical analysis, a large number of tables, listings, and figures are incorporated into the main text and appendices. In the CDISC pilot project, an example CSR is also provided. If you are interested in more examples of clinical study reports, you can go to the European Medicines Agency (EMA) clinical data website.

Building CSRs is teamwork between clinicians, medical writers, statisticians, statistical programmers, and other relevant specialists such as experts on biomarkers. Here, we focus on the work and deliverables completed by statisticians and statistical programmers. In an organization, they commonly work together to define, develop, validate and deliver tables, listings, and figures (TLFs) required for a CSR to summarize the efficacy and/or safety of the pharmaceutical product. Microsoft Word is widely used to prepare CSR in the pharmaceutical industry. Therefore, .rtf, .doc, .docx are commonly used formats in their deliverables.

In this chapter, our focus is to illustrate how to create tables, listings, and figures (TLFs) in RTF format that is commonly used in a CSR. The examples are in compliance with the FDA’s Portable Document Format (PDF) Specifications.

FDA’s PDF specification is a general reference. Each organization can define more specific TLF format requirements that can be different from the examples in this book.

1.2 Structure and content

In the rest of this chapter, we are following the ICH E3 guidance on the structure and content of clinical study reports.

In a CSR, most of TLFs are located in

  • Section 10: Study participants
  • Section 11: Efficacy evaluation
  • Section 12: Safety evaluation
  • Section 14: Tables, listings, and figures referrals but not included in the text
  • Section 16: Appendices

1.3 Datasets

We used publicly available CDISC pilot study data located in the CDISC Bitbucket repository.

For simplicity, we have downloaded all these datasets into the data-adam/ folder of this project and converted them from the .xpt format to the .sas7bdat format.

The dataset structure follows CDISC Analysis Data Model (ADaM).

1.4 Tools

In this part, we mainly use the R packages below to illustrate how to deliver TLFs in a CSR.

  • tidyverse: prepare datasets ready for reporting.
  • r2rtf: create RTF outputs

There are other R packages to create TLFs in ASCIII, RTF and Word format. For example, rtables, huxtable, pharmaRTF, gt, officer, flextable etc. Here we focus on r2rtf to illustrate the concept.
Reader is encourage to explore other R packages to find the proper tools to fit your purpose.

1.4.1 tidyverse

tidyverse is a collection of R packages to simplify the workflow to manipulate, visualize and analyze data in R. Those R packages share the tidy tools manifesto and are easy to use for interactive data analysis.

RStudio provided outstanding cheatsheets and tutorials for tidyverse.

There are also books to introduce tidyverse. We assume the reader have experience in using tidyverse in this book.

1.4.2 r2rtf

r2rtf is an R package to create production-ready tables and figures in RTF format. This R package is designed to

  • provide simple “verb” functions that correspond to each component of a table, to help you translate a data frame to a table in an RTF file;
  • enable pipes (%>%);
  • focus on the table format only. Data manipulation and analysis shall be handled by other R packages (e.g., tidyverse).

Before creating an RTF table, we need to

  • figure out the table layout;
  • split the layout into small tasks in the form of a computer program;
  • execute the program.

We provide a brief introduction of r2rtf and show how to transfer data frames into table, listing, and figures (TLFs).

Other extended examples and features are covered on the r2rtf package website.

To explore the basic RTF generation verbs in r2rtf, we will use the dataset r2rtf_adae saved in the r2rtf package. This dataset contains adverse events (AEs) information from a clinical trial.

We will begin by loading the packages:

library(dplyr) # Manipulate data
library(tidyr) # Manipulate data
library(r2rtf) # Reporting in RTF format

Below is the meaning of relevant variables. More information can be found on the help page of the dataset (?r2rtf_adae)

In this example, we consider three variables:

  • USUBJID: Unique Subject Identifier
  • TRTA: Actual Treatment
  • AEDECOD: Dictionary-Derived Term
r2rtf_adae %>%
  select(USUBJID, TRTA, AEDECOD) %>%
  head(4)
#>       USUBJID    TRTA                   AEDECOD
#> 1 01-701-1015 Placebo APPLICATION SITE ERYTHEMA
#> 2 01-701-1015 Placebo APPLICATION SITE PRURITUS
#> 3 01-701-1015 Placebo                 DIARRHOEA
#> 4 01-701-1023 Placebo                  ERYTHEMA

dplyr and tidyr packages within tidyverse are used for data manipulation to create a data frame that contains all the information we want to add in an RTF table.

tbl <- r2rtf_adae %>%
  count(TRTA, AEDECOD) %>%
  pivot_wider(names_from = TRTA, values_from = n, values_fill = 0)

tbl %>% head(4)
#> # A tibble: 4 × 4
#>   AEDECOD        Placebo `Xanomeline High Dose` `Xanomeline Low Dose`
#>   <chr>            <int>                  <int>                 <int>
#> 1 ABDOMINAL PAIN       1                      2                     3
#> 2 AGITATION            2                      1                     2
#> 3 ALOPECIA             1                      0                     0
#> 4 ANXIETY              2                      0                     4

Now we have a dataset tbl in preparing the final RTF table.

r2rtf aims to provide one function for each type of table layout. Commonly used verbs include:

All these verbs are designed to enable the usage of pipes (%>%). A full list of all functions can be found in the r2rtf package function reference manual.

A minimal example below illustrates how to combine verbs using pipes to create an RTF table.

  • rtf_body() is used to define table body layout.
  • rtf_encode() transfers table layout information into RTF syntax.
  • write_rtf() save RTF encoding into a file with file extension .rtf
head(tbl) %>%
  rtf_body() %>% # Step 1 Add table  attributes
  rtf_encode() %>% # Step 2 Convert attributes to RTF encode
  write_rtf("tlf/intro-ae1.rtf") # Step 3 Write to a .rtf file

If we want to adjust the width of each column to provide more space to the first column, this can be achieved by updating the col_rel_width argument in the rtf_body() function.

In this example, the input of col_rel_width is a vector with the same length for the number of columns. This argument defines the relative width of each column within a pre-defined total column width.

In this example, the defined relative width is 3:2:2:2. Only the ratio of col_rel_width is used. Therefore it is equivalent to use col_rel_width = c(6, 4, 4, 4) or col_rel_width = c(1.5, 1, 1, 1).

head(tbl) %>%
  rtf_body(col_rel_width = c(3, 2, 2, 2)) %>%
  # define relative width
  rtf_encode() %>%
  write_rtf("tlf/intro-ae2.rtf")

In the previous example, we found the issue of a misaligned column header. We can fix the issue by using the rtf_colheader() function.

In rtf_colheader(), the colheader argument is used to provide the content of the column header. We use "|" to separate the columns.

In the example below, "Adverse Events | Placebo | Xanomeline High Dose | Xanomeline Low Dose" define a column header with 4 columns.

head(tbl) %>%
  rtf_colheader(
    colheader = "Adverse Events | Placebo | Xanomeline High Dose | Xanomeline Low Dose",
    col_rel_width = c(3, 2, 2, 2)
  ) %>%
  rtf_body(col_rel_width = c(3, 2, 2, 2)) %>%
  rtf_encode() %>%
  write_rtf("tlf/intro-ae3.rtf")

In rtf_*() functions such as rtf_body(), rtf_footnote(), the text_justification argument is used to align text. Default is "c" for center justification. To vary text justification by column, use character vector with length of vector equal to number of columns displayed (e.g., c("c", "l", "r")).

All possible inputs can be found in the table below.

r2rtf:::justification()
#>   type      name rtf_code_text rtf_code_row
#> 1    l      left          \\ql       \\trql
#> 2    c    center          \\qc       \\trqc
#> 3    r     right          \\qr       \\trqr
#> 4    d   decimal          \\qj             
#> 5    j justified          \\qj

Below is an example to make the first column left-aligned and center-aligned for the rest.

head(tbl) %>%
  rtf_body(text_justification = c("l", "c", "c", "c")) %>%
  rtf_encode() %>%
  write_rtf("tlf/intro-ae5.rtf")

In rtf_*() functions such as rtf_body(), rtf_footnote(), etc., border_left, border_right, border_top, and border_bottom control cell borders.

If we want to remove the top border of "Adverse Events" in the header, we can change the default value "single" to "" in the border_top argument, as shown below.

r2rtf supports 26 different border types. The details can be found on the r2rtf package website.

In this example, we also demonstrate the possibility of adding multiple column headers.

head(tbl) %>%
  rtf_colheader(
    colheader = " | Treatment",
    col_rel_width = c(3, 6)
  ) %>%
  rtf_colheader(
    colheader = "Adverse Events | Placebo | Xanomeline High Dose | Xanomeline Low Dose",
    border_top = c("", "single", "single", "single"),
    col_rel_width = c(3, 2, 2, 2)
  ) %>%
  rtf_body(col_rel_width = c(3, 2, 2, 2)) %>%
  rtf_encode() %>%
  write_rtf("tlf/intro-ae7.rtf")

In the r2rtf R package get started page, there are more examples to illustrate how to customize

  • title, subtitle
  • footnote, data source
  • special character
  • etc.

Those features will be introduced when we first use them in the rest of the chapters.