1 Overview
1.1 Background
In clinical trials, a critical step is to submit trial results to regulatory agencies. The Electronic Common Technical Document (eCTD) has become a worldwide standard format for regulatory submissions. For example, the United States Food and Drug Administration (US FDA) requires that new drug applications and biologics license applications be submitted in the eCTD format. The Clinical Data Interchange Standards Consortium (CDISC) provides a pilot project following ICH E3 guidance.
Within eCTD, clinical study reports (CSRs) are located in Module 5. The ICH E3 guidance describes the structure and content of clinical study reports.
A typical CSR contains full details on the methods and results of an individual clinical study. In support of the statistical analysis, a large number of tables, listings, and figures are incorporated into the main text and appendices. In the CDISC pilot project, an example CSR is also provided. If you are interested in more examples of clinical study reports, you can go to the European Medicines Agency (EMA) clinical data website.
Building CSRs is teamwork between clinicians, medical writers, statisticians, statistical programmers,
and other relevant specialists such as experts on biomarkers.
Here, we focus on the work and deliverables completed by statisticians and statistical programmers.
In an organization, they commonly work together to
define, develop, validate and deliver tables, listings, and figures (TLFs) required for a CSR to
summarize the efficacy and/or safety of the pharmaceutical product.
Microsoft Word is widely used to prepare CSRs in the pharmaceutical industry.
Therefore, .rtf, .doc, and .docx are commonly used formats in their deliverables.
In this chapter, we illustrate how to create tables, listings, and figures (TLFs) in RTF format, which is commonly used in CSRs. The examples comply with the FDA’s Portable Document Format (PDF) Specifications.
FDA’s PDF specification is a general reference. Each organization can define more specific TLF format requirements that can be different from the examples in this book.
1.2 Structure and content
In the rest of this chapter, we follow the ICH E3 guidance on the structure and content of clinical study reports.
In a CSR, most TLFs are located in
- Section 10: Study participants
- Section 11: Efficacy evaluation
- Section 12: Safety evaluation
- Section 14: Tables, listings, and figures referred to but not included in the text
- Section 16: Appendices
1.3 Datasets
We use publicly available CDISC pilot study data located in the CDISC Bitbucket repository.
For simplicity, we have downloaded all these datasets into the data-adam/ folder of this project and converted them from the .xpt format to the .sas7bdat format.
The dataset structure follows CDISC Analysis Data Model (ADaM).
1.4 Tools
In this part, we mainly use the R packages below to illustrate how to deliver TLFs in a CSR.
There are other R packages for creating TLFs in ASCII, RTF, and Word formats,
for example, rtables, huxtable, pharmaRTF, gt, officer, and flextable.
Here we focus on r2rtf to illustrate the concept.
Readers are encouraged to explore other R packages to find the tools that best fit their purpose.
1.4.1 tidyverse
tidyverse is a collection of R packages that simplify the workflow to manipulate,
visualize, and analyze data in R.
Those R packages share the tidy tools manifesto
and are easy to use for interactive data analysis.
RStudio provides outstanding cheatsheets and tutorials for tidyverse.
There are also books that introduce tidyverse.
We assume readers of this book have experience in using tidyverse.
1.4.2 r2rtf
r2rtf is an R package to create production-ready tables and figures in RTF format.
This R package is designed to
- provide simple “verb” functions that correspond to each component of a table, to help you translate a data frame to a table in an RTF file;
- enable pipes (%>%);
- focus on the table format only.
Data manipulation and analysis shall be handled by other R packages (e.g., tidyverse).
Before creating an RTF table, we need to
- figure out the table layout;
- split the layout into small tasks in the form of a computer program;
- execute the program.
We provide a brief introduction to r2rtf and show how to convert
data frames into tables, listings, and figures (TLFs).
Other extended examples and features are covered on the r2rtf package website.
To explore the basic RTF generation verbs in r2rtf,
we will use the dataset r2rtf_adae included in the r2rtf package.
This dataset contains adverse event (AE) information from a clinical trial.
We will begin by loading the packages:
library(dplyr) # Manipulate data
library(tidyr) # Manipulate data
library(r2rtf) # Reporting in RTF format
Below is the meaning of relevant variables.
More information can be found on the help page of the dataset (?r2rtf_adae).
In this example, we consider three variables:
- USUBJID: Unique Subject Identifier
- TRTA: Actual Treatment
- AEDECOD: Dictionary-Derived Term
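The preview that follows is consistent with selecting these three variables and printing the first four rows. A minimal sketch, assuming the r2rtf_adae dataset shipped with r2rtf; the name preview is introduced here for illustration only:

```r
library(dplyr) # select() and the pipe
library(r2rtf) # provides the r2rtf_adae example dataset

# `preview` is an illustrative name, not part of the original text
preview <- r2rtf_adae %>%
  select(USUBJID, TRTA, AEDECOD) %>%
  head(4)

preview
```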
#> USUBJID TRTA AEDECOD
#> 1 01-701-1015 Placebo APPLICATION SITE ERYTHEMA
#> 2 01-701-1015 Placebo APPLICATION SITE PRURITUS
#> 3 01-701-1015 Placebo DIARRHOEA
#> 4 01-701-1023 Placebo ERYTHEMA
The dplyr and tidyr packages within tidyverse are used
for data manipulation to create a data frame
that contains all the information we want to add in an RTF table.
tbl <- r2rtf_adae %>%
  count(TRTA, AEDECOD) %>%
  pivot_wider(names_from = TRTA, values_from = n, values_fill = 0)

tbl %>% head(4)
#> # A tibble: 4 × 4
#> AEDECOD Placebo `Xanomeline High Dose` `Xanomeline Low Dose`
#> <chr> <int> <int> <int>
#> 1 ABDOMINAL PAIN 1 2 3
#> 2 AGITATION 2 1 2
#> 3 ALOPECIA 1 0 0
#> 4 ANXIETY 2 0 4
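To see what count() and pivot_wider() each contribute, the same two steps can be run on a tiny made-up data frame. This is only a sketch: the toy records, the treatment label "Drug", and the names toy and toy_tbl are invented for illustration and are not part of r2rtf_adae.

```r
library(dplyr)
library(tidyr)

# Invented toy data: one row per adverse event record
toy <- data.frame(
  TRTA = c("Placebo", "Placebo", "Drug", "Drug", "Drug"),
  AEDECOD = c("HEADACHE", "NAUSEA", "HEADACHE", "HEADACHE", "RASH")
)

toy_tbl <- toy %>%
  # One row per (treatment, term) pair with the number of records in `n`
  count(TRTA, AEDECOD) %>%
  # One column per treatment; pairs that never occur are filled with 0
  pivot_wider(names_from = TRTA, values_from = n, values_fill = 0)

toy_tbl
```

RASH is never reported under Placebo in the toy data, so values_fill = 0 is what puts a zero rather than NA in that cell.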
Now we have a dataset tbl that is ready for preparing the final RTF table.
r2rtf aims to provide one function for each type of table layout.
Commonly used verbs include:
- rtf_page(): RTF page information
- rtf_title(): RTF title information
- rtf_colheader(): RTF column header information
- rtf_body(): RTF table body information
- rtf_footnote(): RTF footnote information
- rtf_source(): RTF data source information
All these verbs are designed to enable the usage of pipes (%>%).
A full list of all functions can be found in the
r2rtf package function reference manual.
A minimal example below illustrates how to combine verbs using pipes to create an RTF table.
- rtf_body() is used to define the table body layout.
- rtf_encode() converts the table layout information into RTF syntax.
- write_rtf() saves the RTF encoding into a file with the file extension .rtf.
head(tbl) %>%
  rtf_body() %>% # Step 1 Add table attributes
  rtf_encode() %>% # Step 2 Convert attributes to RTF encoding
  write_rtf("tlf/intro-ae1.rtf") # Step 3 Write to a .rtf file
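The sketch below runs the same three steps end to end on the r2rtf_adae data, writing to a temporary folder instead of tlf/ so that it does not depend on the project layout. The out_file name and the intro-ae-demo.rtf file name are assumptions made for this illustration.

```r
library(dplyr)
library(tidyr)
library(r2rtf)

# Rebuild the summary data frame so the example is self-contained
tbl <- r2rtf_adae %>%
  count(TRTA, AEDECOD) %>%
  pivot_wider(names_from = TRTA, values_from = n, values_fill = 0)

# Illustrative output path inside the session's temporary directory
out_file <- file.path(tempdir(), "intro-ae-demo.rtf")

head(tbl) %>%
  rtf_body() %>%        # table attributes
  rtf_encode() %>%      # RTF encoding
  write_rtf(out_file)   # write to disk

# Confirm that a file was produced
file.exists(out_file)
```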
If we want to adjust the width of each column to
provide more space to the first column,
this can be achieved by updating the col_rel_width argument
in the rtf_body() function.
In this example, the input of col_rel_width is a vector
whose length equals the number of columns.
This argument defines the relative width of each column
within a pre-defined total column width.
In this example, the defined relative width is 3:2:2:2.
Only the ratio of col_rel_width is used.
Therefore, it is equivalent to use col_rel_width = c(6, 4, 4, 4)
or col_rel_width = c(1.5, 1, 1, 1).
head(tbl) %>%
  rtf_body(col_rel_width = c(3, 2, 2, 2)) %>% # Define relative width
  rtf_encode() %>%
  write_rtf("tlf/intro-ae2.rtf")
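The statement that only the ratio matters can be checked with plain arithmetic. The sketch below converts each vector to absolute widths by normalizing to a total width. The helper name width_share() and the total width of 6 are hypothetical values chosen for illustration; this mirrors the idea and is not necessarily the exact computation inside r2rtf.

```r
# Hypothetical helper: turn relative widths into absolute widths
width_share <- function(col_rel_width, total_width = 6) {
  total_width * col_rel_width / sum(col_rel_width)
}

w1 <- width_share(c(3, 2, 2, 2))
w2 <- width_share(c(6, 4, 4, 4))
w3 <- width_share(c(1.5, 1, 1, 1))

# The three specifications describe the same column widths
all.equal(w1, w2)
all.equal(w1, w3)
```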
In the previous example, the column header is misaligned with the table body.
We can fix the issue by using the rtf_colheader() function.
In rtf_colheader(), the colheader argument is used to provide the content of the column header.
We use "|" to separate the columns.
In the example below, "Adverse Events | Placebo | Xanomeline High Dose | Xanomeline Low Dose"
defines a column header with 4 columns.
head(tbl) %>%
  rtf_colheader(
    colheader = "Adverse Events | Placebo | Xanomeline High Dose | Xanomeline Low Dose",
    col_rel_width = c(3, 2, 2, 2)
  ) %>%
  rtf_body(col_rel_width = c(3, 2, 2, 2)) %>%
  rtf_encode() %>%
  write_rtf("tlf/intro-ae3.rtf")
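Because the header is a single string, it is easy to end up with a different number of labels than columns. A small base R check, sketched below, splits the string on "|" and compares the count with the width vector. The name header_labels and the use of strsplit() are illustrative choices for this check, not a description of how r2rtf parses the string.

```r
colheader <- "Adverse Events | Placebo | Xanomeline High Dose | Xanomeline Low Dose"
col_rel_width <- c(3, 2, 2, 2)

# Split on the literal "|" and trim the surrounding spaces
header_labels <- trimws(strsplit(colheader, "|", fixed = TRUE)[[1]])

header_labels

# TRUE when there is one label per column width
length(header_labels) == length(col_rel_width)
```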
In rtf_*() functions such as rtf_body() and rtf_footnote(),
the text_justification argument is used to align text.
The default is "c" for center justification.
To vary text justification by column, use a character vector whose length equals the
number of columns displayed (e.g., c("c", "l", "r")).
All possible inputs can be found in the table below.
r2rtf:::justification()
#> type name rtf_code_text rtf_code_row
#> 1 l left \\ql \\trql
#> 2 c center \\qc \\trqc
#> 3 r right \\qr \\trqr
#> 4 d decimal \\qj
#> 5 j justified \\qj
Below is an example that left-aligns the first column and center-aligns the rest.
head(tbl) %>%
  rtf_body(text_justification = c("l", "c", "c", "c")) %>%
  rtf_encode() %>%
  write_rtf("tlf/intro-ae5.rtf")
In rtf_*() functions such as rtf_body(), rtf_footnote(), etc.,
border_left, border_right, border_top, and border_bottom control cell borders.
If we want to remove the top border of "Adverse Events" in the header,
we can change the default value "single" to "" in the border_top argument, as shown below.
r2rtf supports 26 different border types. The details can be found on
the r2rtf package website.
In this example, we also demonstrate the possibility of adding multiple column headers.
head(tbl) %>%
  rtf_colheader(
    colheader = " | Treatment",
    col_rel_width = c(3, 6)
  ) %>%
  rtf_colheader(
    colheader = "Adverse Events | Placebo | Xanomeline High Dose | Xanomeline Low Dose",
    border_top = c("", "single", "single", "single"),
    col_rel_width = c(3, 2, 2, 2)
  ) %>%
  rtf_body(col_rel_width = c(3, 2, 2, 2)) %>%
  rtf_encode() %>%
  write_rtf("tlf/intro-ae7.rtf")
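The two header rows line up because their relative widths describe the same partition of the table: the first row uses 3:6 and the second uses 3:2:2:2, where 2 + 2 + 2 = 6. A sketch of that check in base R follows; the variable names are invented for illustration.

```r
top_width <- c(3, 6)          # " | Treatment"
bottom_width <- c(3, 2, 2, 2) # "Adverse Events | Placebo | ..."

# Which top-row cell spans each bottom-row column
spans <- c(1, 2, 2, 2)

# Sum the bottom-row widths within each top-row cell
summed <- as.numeric(tapply(bottom_width, spans, sum))

# TRUE when both rows split the table in the same proportions
all.equal(summed / sum(summed), top_width / sum(top_width))
```

If the proportions differed, for example c(3, 5) in the first row, the "Treatment" cell would no longer end at the same position as the last treatment column.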
The r2rtf package's get started page has more examples that illustrate how to customize
- title, subtitle
- footnote, data source
- special characters
- etc.
Those features will be introduced when we first use them in the remaining chapters.
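As a brief preview, the sketch below adds a title, a footnote, and a data source line using the rtf_title(), rtf_footnote(), and rtf_source() verbs listed earlier. The title, footnote, and source wording, as well as the temporary output path, are made up for this illustration.

```r
library(dplyr)
library(tidyr)
library(r2rtf)

# Rebuild the summary data frame so the example is self-contained
tbl <- r2rtf_adae %>%
  count(TRTA, AEDECOD) %>%
  pivot_wider(names_from = TRTA, values_from = n, values_fill = 0)

# Illustrative output path inside the session's temporary directory
out_file <- file.path(tempdir(), "intro-ae-title-demo.rtf")

head(tbl) %>%
  rtf_title("Number of Adverse Events by Treatment") %>% # illustrative title
  rtf_colheader(
    colheader = "Adverse Events | Placebo | Xanomeline High Dose | Xanomeline Low Dose",
    col_rel_width = c(3, 2, 2, 2)
  ) %>%
  rtf_body(
    col_rel_width = c(3, 2, 2, 2),
    text_justification = c("l", "c", "c", "c")
  ) %>%
  rtf_footnote("Counts are numbers of adverse event records.") %>% # illustrative footnote
  rtf_source("Source: r2rtf_adae example dataset") %>% # illustrative source line
  rtf_encode() %>%
  write_rtf(out_file)
```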