4 Baseline characteristics
Following ICH E3 guidance, we need to summarize critical demographic and baseline characteristics of the participants in Section 11.2, Demographic and Other Baseline Characteristics.
In this chapter, we illustrate how to create a simplified baseline characteristics table for a study.
There are many R packages that can efficiently summarize baseline information. The table1 R package is one of them.
As in previous chapters, we first read the adsl dataset that contains all the required information for the baseline characteristics table.
adsl <- read_sas("data-adam/adsl.sas7bdat")
For simplicity, we only analyze SEX, AGE, and RACE in this example using the table1 R package. More details of the table1 R package can be found in the package vignettes.
The table1 R package directly creates an HTML report.
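# Recode SEX with readable labels and convert RACE to title case
# (toTitleCase() comes from the tools package)
ana <- adsl %>%
  mutate(
    SEX = factor(SEX, c("F", "M"), c("Female", "Male")),
    RACE = toTitleCase(tolower(RACE))
  )

# Summarize SEX, AGE, and RACE by planned treatment group
tbl <- table1(~ SEX + AGE + RACE | TRT01P, data = ana)
tbl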
| | Placebo (N=86) | Xanomeline High Dose (N=84) | Xanomeline Low Dose (N=84) | Overall (N=254) |
|---|---|---|---|---|
| SEX | | | | |
| Female | 53 (61.6%) | 40 (47.6%) | 50 (59.5%) | 143 (56.3%) |
| Male | 33 (38.4%) | 44 (52.4%) | 34 (40.5%) | 111 (43.7%) |
| Age | | | | |
| Mean (SD) | 75.2 (8.59) | 74.4 (7.89) | 75.7 (8.29) | 75.1 (8.25) |
| Median [Min, Max] | 76.0 [52.0, 89.0] | 76.0 [56.0, 88.0] | 77.5 [51.0, 88.0] | 77.0 [51.0, 89.0] |
| RACE | | | | |
| Black or African American | 8 (9.3%) | 9 (10.7%) | 6 (7.1%) | 23 (9.1%) |
| White | 78 (90.7%) | 74 (88.1%) | 78 (92.9%) | 230 (90.6%) |
| American Indian or Alaska Native | 0 (0%) | 1 (1.2%) | 0 (0%) | 1 (0.4%) |
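To make the categorical cells concrete, the sketch below rebuilds "count (column percent)" strings with base R. It uses a small hypothetical data frame (`toy`, with an invented "Active" arm) rather than adsl, and it is not how table1 is implemented; the formatting may differ slightly from table1 (for example, it prints `0.0%` rather than `0%`).

```r
# Hypothetical toy data: five participants in two treatment groups
toy <- data.frame(
  TRT01P = c("Placebo", "Placebo", "Placebo", "Active", "Active"),
  SEX    = c("F", "M", "F", "M", "M")
)

# Counts per category (rows) and treatment group (columns)
n <- table(toy$SEX, toy$TRT01P)

# Column percentages: each treatment group sums to 100
pct <- prop.table(n, margin = 2) * 100

# Combine counts and percentages into "n (pct%)" strings
cell <- matrix(
  sprintf("%d (%.1f%%)", as.vector(n), as.vector(pct)),
  nrow = nrow(n),
  dimnames = dimnames(n)
)
cell
```

Each column uses its own group size as the denominator, which is why the percentages in a column of the baseline table add up to 100% within a variable.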
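The code below converts the output into a data frame that contains only ASCII characters, as recommended by regulatory agencies. It replaces the non-breaking spaces (Unicode code point 160) in the table1 output with regular spaces. tbl_base is used as input for r2rtf to create the final report.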
tbl_base <- tbl %>%
  as.data.frame() %>%
  as_tibble() %>%
  mutate(across(
    everything(),
    ~ str_replace_all(.x, intToUtf8(160), " ")
  ))

names(tbl_base) <- str_replace_all(names(tbl_base), intToUtf8(160), " ")
tbl_base
#> # A tibble: 11 × 5
#> ` ` Placebo `Xanomeline High Dose` `Xanomeline Low Dose` Overall
#> <chr> <chr> <chr> <chr> <chr>
#> 1 "" "(N=86)" "(N=84)" "(N=84)" "(N=254)"
#> 2 "SEX" "" "" "" ""
#> 3 " Female" "53 (61.6%)" "40 (47.6%)" "50 (59.5%)" "143 (56…
#> 4 " Male" "33 (38.4%)" "44 (52.4%)" "34 (40.5%)" "111 (43…
#> # ℹ 7 more rows
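One way to confirm that such a cleanup worked is to test every cell for non-ASCII code points. The sketch below uses base R only; `demo` is a hypothetical two-row stand-in for the table1 output and `is_ascii()` is a helper defined here for illustration, not a function from table1 or r2rtf.

```r
# Non-breaking space, the character being replaced
nbsp <- intToUtf8(160)

# Hypothetical stand-in for the table1 output, padded with non-breaking spaces
demo <- data.frame(
  label   = c("SEX", paste0(nbsp, nbsp, "Female")),
  Placebo = c("", paste0("53", nbsp, "(61.6%)")),
  stringsAsFactors = FALSE
)

# Illustrative helper: TRUE when every code point in every string is below 128
is_ascii <- function(x) {
  all(vapply(x, function(s) all(utf8ToInt(s) < 128L), logical(1)))
}

# Replace non-breaking spaces with regular spaces in every column
clean <- demo
clean[] <- lapply(demo, function(col) gsub(nbsp, " ", col, fixed = TRUE))

is_ascii(unlist(demo))   # FALSE: non-breaking spaces are present
is_ascii(unlist(clean))  # TRUE: only ASCII characters remain
```

The same helper could be applied to a cleaned table and its column names before passing them to the reporting step.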
We define the format of the output, highlighting the items that were not discussed in previous chapters.
text_indent_first and text_indent_left control the indentation of text. They are helpful when you need to control the white space of a long phrase; the “American Indian or Alaska Native” row in the table is an example.
colheader1 <- paste(names(tbl_base), collapse = "|")
colheader2 <- paste(tbl_base[1, ], collapse = "|")
rel_width <- c(2.5, rep(1, 4))

tbl_base[-1, ] %>%
  rtf_title(
    "Baseline Characteristics of Participants",
    "(All Participants Randomized)"
  ) %>%
  rtf_colheader(colheader1,
    col_rel_width = rel_width
  ) %>%
  rtf_colheader(colheader2,
    border_top = "",
    col_rel_width = rel_width
  ) %>%
  rtf_body(
    col_rel_width = rel_width,
    text_justification = c("l", rep("c", 4)),
    text_indent_first = -240,
    text_indent_left = 180
  ) %>%
  rtf_encode() %>%
  write_rtf("tlf/tlf_base.rtf")
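The two column header strings are plain text with "|" separating the cells, so they can be inspected before they are passed to rtf_colheader(). The sketch below reproduces the construction on a hypothetical three-column data frame (`demo`) in base R; `n_cells()` is an illustrative helper defined here, not part of r2rtf.

```r
# Hypothetical miniature of tbl_base: first row holds the "(N=...)" counts
demo <- data.frame(
  ` `     = c("", "SEX"),
  Placebo = c("(N=86)", ""),
  Overall = c("(N=254)", ""),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# First header row: the column names
colheader1 <- paste(names(demo), collapse = "|")

# Second header row: the first data row (the group sizes)
colheader2 <- paste(demo[1, ], collapse = "|")

# Illustrative helper: number of cells encoded in a header string
n_cells <- function(header) length(strsplit(header, "|", fixed = TRUE)[[1]])

colheader1
colheader2
n_cells(colheader1) == ncol(demo)
```

Checking that each header string encodes as many cells as the table has columns is a cheap way to catch a misaligned header before the RTF file is written.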
In conclusion, the procedure to generate a demographic and baseline characteristics table is summarized as follows:
- Step 1: Read the data set.
- Step 2: Use table1::table1() to get the baseline characteristics table.
- Step 3: Transfer the output from Step 2 into a data frame that only contains ASCII characters.
- Step 4: Define the format of the RTF table by using the R package r2rtf.