Introduction

In a 2-arm parallel randomized controlled trial (RCT) an outcome of interest is measured at least twice, once before the treatment is administered and once after. However, to measure the stability of an effect over time, additional time points can be added after the first follow-up.

In this presentation, I consider the case where an outcome of interest is measured at a baseline, a second follow-up, and a third follow-up (referred to hereafter as time 1, time 2, and time 3). In this design, it might be of interest to know the treatment effect at the time 3 that is unconditional on the treatment effect at time 2. At present, we derive such an effect and its standard error and apply the theory to a simulated outcome that is correlated over time.

This presentation uses the R programming language and assumes the end user is taking advantage of RStudio IDE to compile their R markdown files into HTML (R Core Team 2019; RStudio Team 2016). All of the files needed to reproduce these results can be downloaded from the Git repository git clone https://git.waderstats.com/two_arm_parallel_ame_time/.

Required Libraries

The libraries’ knitr, bookdown, kableExtra, and DT are used to generate the HTML output (Xie 2019, 2018; Zhu 2019; Xie, Cheng, and Tan 2019). The libraries reshape2 and dplyr are loaded for their data manipulation funtions (Wickham 2007; Wickham et al. 2019). The mvtnorm library is loaded to simulate outcomes that are correlated over time (Genz and Bretz 2009; Genz et al. 2019). The geepack library is used to estimate parameters of the linear model using the method of generalized estimating equations (Halekoh, Højsgaard, and Yan 2006; Yan and Fine 2004; Yan 2002).

package_loader <- function(x, ...) {
  if (x %in% rownames(installed.packages()) == FALSE) install.packages(x)
  library(x, ...)
}

packages <- c("knitr", "bookdown", "kableExtra", "reshape2", "dplyr", "DT", "mvtnorm", "geepack")

invisible(sapply(X = packages, FUN = package_loader, character.only = TRUE))

Statistical Methods

Notation

The notation used in this presentation is listed below.

  1. For observation \(i\) at time \(t\), let \(y_{it}\) be one of up to \(n\) outcome random variables, with each \(y_{i}\) identically distributed.
  2. Let \(c_{it}\) be an indicator for treatment condition, \(t^{(2)}_{it}\) be an indicator for time 2, \(t^{(3)}_{it}\) be an indicator for time 3.
  3. Let \(\boldsymbol{\gamma}^{(k)}_{it}\) be a set of up to \(k\) additional covariates, possibly including dummy coded categorical variables, such that \(k < n - 6\).
  4. Let \(\mathbf{z}_{it}\) be the set of all predictors.
  5. Let \(\mathbf{x}_{it} = \begin{bmatrix} 1 & c_{it} & t^{(2)}_{it} & t^{(3)}_{it} & c_{it}t^{(2)}_{it} & c_{it}t^{(3)}_{it} & \boldsymbol{\gamma}^{(k)}_{it} \end{bmatrix}\) be a row-vector of fixed covariate values.
  6. Let \(\nabla\) indicate the vector differential operator.
  7. Let \(\backslash\) be the set difference operator. That is to say, \(A \backslash B = \{x | x \in A \land x \notin B\}\)

Linear Expectation

Under the identity link function, the expected population-averaged response is modeled as a linear function of the predictors \(\mu_{it} = E[y_{it}] = \beta_0 + \beta_1c_{it} + \beta_2t^{(2)}_{it} + \beta_3t^{(3)}_{it} + \beta_4t^{(2)}_{it}c_{it} + \beta_5t^{(3)}_{it}c_{it} + \beta_6\gamma^{(1)}_{it} + ... + \beta_k\gamma^{(k)}_{it} = \mathbf{x}_{it}\boldsymbol{\beta}\) where \(\boldsymbol{\beta} = \begin{bmatrix} \beta_1 & \beta_2 & \beta_3 & \beta_4 & \beta_5 & \beta_6 & \cdots & \beta_k \end{bmatrix}^{T}\).

Treatment Effect

If we assume that the causal effects are identifiable, we can derive the marginal treatment effect for subject \(i\) at time \(t\) shown in equation (1) (Hernán and Robins 2006).

\[\begin{equation} \begin{split} y_{d_{it}} = E[y | c_{it} = 1, \mathbf{z}_{it} \backslash c_{it}] - E[y |c_{it} = 0, \mathbf{z}_{it} \backslash c_{it}] = \\ \beta_0 + \beta_1 + \beta_2t^{(2)}_{it} + \beta_3t^{(3)}_{it} + \beta_4t^{(2)}_{it} + \beta_5t^{(3)}_{it} + \beta_6\gamma^{(1)}_{it} + ... + \beta_k\gamma^{(k)}_{it} \\ - (\beta_0 + \beta_2t^{(2)}_{it} + \beta_3t^{(3)}_{it} + \beta_6\gamma^{(1)}_{it} + ... + \beta_k\gamma^{(k)}_{it}) = \\ \beta_1 + \beta_4t^{(2)}_{it} + \beta_5t^{(3)}_{it} \end{split} \tag{1} \end{equation}\]

Treating \(t^{(2)}_{it}\) as random in the expectation, applying the law of total expectation, and since \(t^{(2)}_{it}\) partitions the sample space, we can calculate the treatment effect that is unconditional on time 2 as shown in equation (2) (Bain and Engelhardt 1992).

\[\begin{equation} \begin{split} E_{t_{it}^{(2)}}[E[y_d|t^{(2)}_{it}, t^{(3)}_{it}]] = \\ E[y_d|t^{(3)}_{it}] = \\ E[y_d|t^{(2)}_{it} = 0, t^{(3)}_{it}]Pr[t^{(2)}_{it} = 0] + E[y_d|t^{(2)}_{it} = 1, t^{(3)}_{it}]Pr[t^{(2)}_{it} = 1] = \\ (\beta_1 + \beta_5t^{(3)}_{it})(1 - Pr[t^{(2)} = 1]) + (\beta_1 + \beta_4 + \beta_5t^{(3)}_{it})Pr[t^{(2)} = 1] = \\ \beta_1 - \beta_1Pr[t^{(2)}_{it} = 1] + \beta_5t^{(3)}_{it} - \beta_5t^{(3)}_{it}Pr[t^{(2)}_{it} = 1] + \beta_1Pr[t^{(2)}_{it} = 1] + \beta_4Pr[t^{(2)}_{it} = 1] + \beta_5t^{(3)}_{it}Pr[t^{(2)}_{it} = 1] = \\ \beta_1 + \beta_5t^{(3)}_{it} + \beta_1Pr[t^{(2)}_{it} = 1] \end{split} \tag{2} \end{equation}\]

Therefore the marginal treatment effect at time 3 that is unconditional on the treatment effect at time 2 is \(E[y_d|t^{(3)}_{it} = 1] = \beta_1 + \beta_5 + \beta_1Pr[t^{(2)}_{it} = 1]\). Let \(\boldsymbol{\hat{\beta}}\) be a consistent estimator of \(\boldsymbol{\beta}\), then the corresponding estimate of the population parameter is \(\hat{\beta}_1 + \hat{\beta}_5 + \hat{\beta}_1\hat{Pr}[t^{(2)}_{it} = 1]\).

Approximate Standard Error of the Treatment Effect Estimate

The predicted values of \(y_{it}\) equal to \(\hat{\mu}_{it}\) is a function of the predictors \(\mathbf{x}_{it}\) and \(\boldsymbol{\hat{\beta}}\), call it \(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}})\). Let \(g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}})) = \hat{\beta}_1 + \hat{\beta}_5 + \hat{\beta}_1\bar{t}\) be the function that transforms the predicted values to the treatment effects at time 3 that are unconditional on time 2.

Let \(\Sigma\) be the variance-covariance of \(\boldsymbol{\beta}\). The Delta Method is a generalization of the central limit theorem that allows us to find the approximate sampling variance of a scalar-valued function of our estimators (Doob 1935; Dowd, Greene, and Norton 2014). As seen above, \(g\) is such a function. Therefore by the Delta Method we have \(Var(g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}}))) \approx \nabla g(f(\mathbf{x}_{it}, \boldsymbol{\beta}))^T \cdot \Sigma \cdot \nabla g(f(\mathbf{x}_{it}, \boldsymbol{\beta}))\). If we replace the population parameters with their consistent estimators and take the square root, we have the approximate standard error of the marginal treatment effect that is unconditional on time 2. Equation (3) shows the derivation of the gradient, and equation (4) shows the approximate variance of the estimator, followed by its standard error in equation (5).

\[\begin{equation} \begin{split} \nabla g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}})) = \begin{bmatrix} \frac{\partial (\hat{\beta}_1 + \hat{\beta}_5 + \hat{\beta}_1\bar{t})}{\partial \hat{\beta_0}} \\ \frac{\partial (\hat{\beta}_1 + \hat{\beta}_5 + \hat{\beta}_1\bar{t})}{\partial \hat{\beta_1}} \\ \vdots \\ \frac{\partial (\hat{\beta}_1 + \hat{\beta}_5 + \hat{\beta}_1\bar{t})}{\partial \hat{\beta_k}} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 + \bar{t} \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \end{split} \tag{3} \end{equation}\]

\[\begin{equation} \begin{split} Var(g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}}))) \approx \begin{bmatrix} 0 & 1 + \bar{t} & 0 & 0 & 0 & 1 & 0 & \cdots & 0 \end{bmatrix} \cdot \hat{\Sigma} \cdot \begin{bmatrix} 0 \\1 + \bar{t} \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \end{split} \tag{4} \end{equation}\]

\[\begin{equation} \begin{split} SE(g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}}))) = \sqrt{Var(g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}})))} \end{split} \tag{5} \end{equation}\]

Simulation

Data Generating Process

The data generating process will assume that the mean response for observation \(i\) at time \(t\) is modeled as a function of \(c_{it}\), \(t^{(2)}_{it}\), and \(t^{(3)}_{it}\) with three additional covariates \(\gamma^{(1)}_{it}\), \(\gamma^{(2)}_{it}\), and \(\gamma^{(3)}_{it}\) (Equation (6)). Therefore the conditional expectation of observation \(i\) at times 1, 2, and 3 are as seen in equations (7), (8), and (9), respectively.

\[\begin{equation} \begin{split} \mu_{it} = E[y_{it}] = \\ \beta_0 + \beta_1c_{it} + \beta_2t^{(2)}_{it} + \beta_3t^{(3)}_{it} + \beta_4t^{(2)}_{it}c_{it} + \beta_5t^{(3)}_{it}c_{it} + \beta_6\gamma^{(1)}_{it} + \beta_7\gamma^{(2)}_{it} + \beta_8\gamma^{(3)}_{it} \end{split} \tag{6} \end{equation}\]

\[\begin{equation} \begin{split} E[y|t^{(2)}_{it} = 0, t^{(3)}_{it} = 0] = \\ \beta_0 + \beta_1c_{it} + \beta_6\gamma^{(1)}_{it} + \beta_7\gamma^{(2)}_{it} + \beta_8\gamma^{(3)}_{it} \end{split} \tag{7} \end{equation}\]

\[\begin{equation} \begin{split} E[y|t^{(2)}_{it} = 1, t^{(3)}_{it} = 0] = \\ \beta_0 + \beta_1c_{it} + \beta_2 + \beta_4c_{it} + \beta_6\gamma^{(1)}_{it} + \beta_7\gamma^{(2)}_{it} + \beta_8\gamma^{(3)}_{it} \end{split} \tag{8} \end{equation}\]

\[\begin{equation} \begin{split} E[y|t^{(2)}_{it} = 0, t^{(3)}_{it} = 1] = \\ \beta_0 + \beta_1c_{it} + \beta_3 + \beta_5c_{it} + \beta_6\gamma^{(1)}_{it} + \beta_7\gamma^{(2)}_{it} + \beta_8\gamma^{(3)}_{it} \end{split} \tag{9} \end{equation}\]

Equations (7), (8), and (9) also gives us the marginal treatment effect at each time.

  1. The marginal treatment effect at time 1 is \(\beta_1\).
  2. The marginal treatment effect at time 2 is \(\beta_1 + \beta_4\).
  3. The marginal treatment effect at time 3 is \(\beta_1 + \beta_5\).

Furthermore, we will assume that the response over time is generated from a multivariate normal distribution, shown in Equation (10).

\[\begin{equation} \begin{split} y_i \sim N_3 \Bigg( \begin{bmatrix} E[y|t^{(2)}_{it} = 0, t^{(3)}_{it} = 0, \gamma^{(1)}_{it}, \gamma^{(2)}_{it}, \gamma^{(3)}_{it}] \\ E[y|t^{(2)}_{it} = 1, t^{(3)}_{it} = 0, \gamma^{(1)}_{it}, \gamma^{(2)}_{it}, \gamma^{(3)}_{it}] \\ E[y|t^{(2)}_{it} = 9, t^{(3)}_{it} = 1, \gamma^{(1)}_{it}, \gamma^{(2)}_{it}, \gamma^{(3)}_{it}] \\ \end{bmatrix}, \begin{bmatrix} var(y_{i1}) & cov(y_{i1}, y_{i2}) & cov(y_{i1}, y_{i4}) \\ cov(y_{i2}, y_{i1}) & var(y_{i2}) & cov(y_{i2}, y_{i3}) \\ cov(y_{i3}, y_{i1}) & cov(y_{i3}, y_{i2}) & var(y_{i3}) \end{bmatrix} \Bigg) \end{split} \tag{10} \end{equation}\]

For this simulation, we will assume the following population parameters:

  1. The intercept term is 5.
    • \(\beta_0 = 5\)
  2. The marginal treatment effect at time 1 is 0.
    • \(\beta_1 = 1\)
  3. The marginal treatment effect at time 2 is 1.
    • \(\beta_1 + \beta_4 = 2\), which implies \(\beta_4 = 1\)
  4. The marginal treatment effect at time 3 is 2.
    • \(\beta_1 + \beta_5 = 3\), which implies \(\beta_5 = 2\)
  5. For fixed values of all other predictors, the effect at time 2 is 3.
    • \(\beta_2 = 3\)
  6. For fixed values of all other predictors, the effect at time 3 is 2.
    • \(\beta_3 = 2\)
  7. The partial effects of all other predictors (or first discrete differences) on the outcome are 6, 5, and 8.
    • \(\beta_6 = 6\), \(\beta_7 = 5\), \(\beta_8 = 4\)

Under these assumptions, and assuming every observation is measured at every time point, the population parameter for the treatment effect that is unconditional on time 2 is \(\beta_1 + \beta_5 + \beta_1 \bar{t}^{(2)} = 1 + 2 + 1/3 = 3.33\).

Simulated Data

The simulation data is created using the data generating process described above.

# Set the seed number so that the simulation is reproducible.
set.seed(123)

# Specify the number of independent observations.
n = 100

# Generate the condition assignment as well as the additional confounders,
C = rbinom(n, 1, prob = 1/2)
gamma1 = rnorm(n, mean = 0, sd = 1)
gamma2 = rnorm(n, mean = 0, sd = 1)
gamma3 = rnorm(n, mean = 0, sd = 1)

# The population parameters are used in specifying conditional expectation for the outcome.
beta0 = 5; beta1 = 1; beta2 = 3; beta3 = 2; beta4 = 1; beta5 = 2; beta6 = 6; beta7 = 5; beta8 = 4

# The outcome at each time point depends on the vectors of predictors and the parameter values.
mu_t1 <- beta0 + beta1*C + beta6*gamma1 + beta7*gamma2 + beta8*gamma3
mu_t2 <- beta0 + beta1*C + beta2 + beta4*C + beta6*gamma1 + beta7*gamma2 + beta8*gamma3
mu_t3 <- beta0 + beta1*C + beta3 + beta5*C + beta6*gamma1 + beta7*gamma2 + beta8*gamma3

# The expected value of the outcome at each time point.
mu <- cbind(mu_t1, mu_t2, mu_t3)

# Covariance/correlation matrix for the outcomes measured on the same observation over time.
sim_sigma = diag(3)

rownames(sim_sigma) <- c("Time1", "Time2", "Time3")
colnames(sim_sigma) <- c("Time1", "Time2", "Time3")

sim_sigma["Time1", "Time2"] <- 0.5
sim_sigma["Time2", "Time1"] <- 0.5

sim_sigma["Time1", "Time3"] <- 0.25
sim_sigma["Time3", "Time1"] <- 0.25

sim_sigma["Time2", "Time3"] <- 0.5
sim_sigma["Time3", "Time2"] <- 0.5

# Generate multivariate normal random variables using the mean vectors and corresponding correlation of the outcomes over time.
outcomeFun <- function(x) {
  res <- rmvnorm(1, x, sigma = sim_sigma)
  return(res)
}

y <- t(apply(mu, 1, outcomeFun))
y <- melt(y)

# The simulated data set.
d <- cbind(y, rep(C, 3), rep(gamma1, 3), rep(gamma2, 3), rep(gamma3, 3))

colnames(d) <- c("id", "Time", "Y", "C", "gamma1", "gamma2", "gamma3")

d$Time <- factor(d$Time, levels = c(1,2,3))

d <- d[order(d$id, d$Time), ]

rownames(d) <- NULL

d_DT <- datatable(d, escape = FALSE,
                  extensions = c('Buttons', 'KeyTable'),
                  class = 'cell-border stripe',
                  rownames = TRUE,
                  options = list(
                    dom = 'Bfrtip',
                    pageLength = 5,
                    deferRender = TRUE,
                    responsive = TRUE,
                    scrollX = TRUE,
                    scrollCollaspe = TRUE,
                    paging = TRUE,
                    autoWidth = FALSE,
                    keys = TRUE,
                    buttons = c('copy', 'csv', 'excel', 'pdf', 'print')
                  ))

d_DT %>% formatRound(c("Y", "gamma1", "gamma2", "gamma3"), 2)

Estimation

For estimating \(\boldsymbol{\beta} = \hat{\boldsymbol{\beta}}\) we will use the method of longitudinal data analysis described in Liang and Zeger (1986) with robust variance estimates of \(\hat{\boldsymbol{\beta}}\) calculated using the method described in White (1980). Following this method, we see that the estimates are close to their respective population parameters: \(\beta_0 = 5\), \(\beta_1 = 1\), \(\beta_2 = 3\), \(\beta_3 = 2\), \(\beta_4 = 1\), \(\beta_5 = 2\), \(\beta_6 = 6\), \(\beta_7 = 5\), and \(\beta_8 = 4\).

fit <- geeglm(Y ~ C + Time + C*Time + gamma1 + gamma2 + gamma3, data = d, id = id, family = gaussian, corstr = "ar1")

coef(fit)[c("(Intercept)", "C", "Time2", "Time3", "C:Time2", "C:Time3", "gamma1", "gamma2", "gamma3")]
## (Intercept)           C       Time2       Time3     C:Time2     C:Time3 
##   4.8815462   1.3985718   2.9294477   2.0177424   0.7593363   1.6677978 
##      gamma1      gamma2      gamma3 
##   6.0984678   4.9565371   4.1170882

From the estimated coefficients, we can calculate the estimated marginal treatment effect that unconditional on time 2.

uncond_t3_effect <- coef(fit)["C"] + coef(fit)["C:Time3"] + coef(fit)["C"]*(1/3)
names(uncond_t3_effect) <- NULL
uncond_t3_effect
## [1] 3.53256

Standard Error of the Estimate

For the simulation, the gradient of function that transforms the linear predictor to the marginal treatment effect that is unconditional on time 2 is as seen in equation (11). The approximated variance of the sampling distribution of the estimate is shown in equation (12), followed by its standard error in equation (13).

\[\begin{equation} \begin{split} \nabla g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}})) = \begin{bmatrix} \frac{\partial (\hat{\beta}_1 + \hat{\beta}_5 + \hat{\beta}_1\bar{t})}{\partial \hat{\beta_0}} \\ \frac{\partial (\hat{\beta}_1 + \hat{\beta}_5 + \hat{\beta}_1\bar{t})}{\partial \hat{\beta_1}} \\ \vdots \\ \frac{\partial (\hat{\beta}_1 + \hat{\beta}_5 + \hat{\beta}_1\bar{t})}{\partial \hat{\beta_8}} \end{bmatrix} = \begin{bmatrix} 0 \\ 1 + \bar{t} \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \end{split} \tag{11} \end{equation}\]

\[\begin{equation} \begin{split} Var(g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}}))) \approx \begin{bmatrix} 0 & 1 + \bar{t} & 0 & 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix} \cdot \hat{\Sigma} \cdot \begin{bmatrix} 0 \\1 + \bar{t} \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \end{split} \tag{12} \end{equation}\]

\[\begin{equation} \begin{split} SE(g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}}))) = \sqrt{Var(g(f(\mathbf{x}_{it}, \boldsymbol{\hat{\beta}})))} \end{split} \tag{13} \end{equation}\]

The approximated variance of the sampling distribution of the estimate in the simulated data is shown below.

sigmaEst <- vcov(fit)[c("(Intercept)", "C", "Time2", "Time3", "C:Time2", "C:Time3", "gamma1", "gamma2", "gamma3"), c("(Intercept)", "C", "Time2", "Time3", "C:Time2", "C:Time3", "gamma1", "gamma2", "gamma3")]

grad <- matrix(c(0, 1 + 1/3, 0, 0, 0, 1, 0, 0, 0))

uncond_t3_effect_se <- sqrt(t(grad) %*% sigmaEst %*% grad)
uncond_t3_effect_se
##           [,1]
## [1,] 0.1906231

Inference and Confidence Intervals

With the estimate and standard error in hand, statistical inference is straight forward. Let the estimated treatment effect at time 3 unconditional on time 2 be given by \(\hat{D}\) so that it’s standard error is given by \(SE(\hat{D})\). Under the null hypothesis, we wish to test \(H_0: \hat{D} = 0\). If the null is true, and by the central limit theorem, the sampling distribution of \(\frac{\hat{D}}{SE(\hat{D})}\) converges in distribution to a standard normal, \(Z \sim N(0, 1)\). Therefore our observed test statistic under the null is as seen in equation (14).

\[\begin{equation} \begin{split} Z_{obs} = \frac{\hat{D} - \hat{D}_{H_0}}{SE(\hat{D})} = \frac{\hat{D}}{SE(\hat{D})} \end{split} \tag{14} \end{equation}\]

The P-value to test the null hypothesis is two times the upper tail probability of observing \(Z_{obs}\) or greater. Assuming that we are willing falsely reject the null hypothesis 5% of the time, the corresponding 95% confidence interval is \(\hat{D} \pm 1.96SE(\hat{D})\).

# P-value
z_obs <- uncond_t3_effect/uncond_t3_effect_se
pval <- 2*pnorm(abs(z_obs), lower.tail = FALSE)
pval
##              [,1]
## [1,] 1.147179e-76
# 95% Confidence Interval
ci_ub <- uncond_t3_effect + 1.96*uncond_t3_effect_se
ci_lb <- uncond_t3_effect - 1.96*uncond_t3_effect_se
paste("[", round(ci_ub, 2), ", ", round(ci_lb, 2), "]", sep = "")
## [1] "[3.91, 3.16]"

Session Info

sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
##  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
##  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
## [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## attached base packages:
## [1] stats     graphics  grDevices datasets  utils     methods   base     
## 
## other attached packages:
## [1] geepack_1.3-1    mvtnorm_1.1-1    DT_0.16          dplyr_1.0.2     
## [5] reshape2_1.4.4   kableExtra_1.3.1 bookdown_0.21    knitr_1.30      
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.5        compiler_4.0.3    pillar_1.4.6      plyr_1.8.6       
##  [5] tools_4.0.3       digest_0.6.27     jsonlite_1.7.1    evaluate_0.14    
##  [9] lifecycle_0.2.0   tibble_3.0.4      viridisLite_0.3.0 pkgconfig_2.0.3  
## [13] rlang_0.4.8       rstudioapi_0.13   crosstalk_1.1.0.1 yaml_2.2.1       
## [17] xfun_0.19         httr_1.4.2        stringr_1.4.0     xml2_1.3.2       
## [21] htmlwidgets_1.5.2 generics_0.1.0    vctrs_0.3.5       webshot_0.5.2    
## [25] tidyselect_1.1.0  glue_1.4.2        R6_2.5.0          rmarkdown_2.5    
## [29] tidyr_1.1.2       purrr_0.3.4       magrittr_2.0.1    backports_1.2.0  
## [33] MASS_7.3-53       scales_1.1.1      htmltools_0.5.0   ellipsis_0.3.1   
## [37] rvest_0.3.6       colorspace_2.0-0  renv_0.12.2       stringi_1.5.3    
## [41] munsell_0.5.0     broom_0.7.2       crayon_1.3.4

References

Bain, Lee J, and Max Engelhardt. 1992. Introduction to Probability and Mathematical Statistics. Brooks/Cole.

Doob, Joseph L. 1935. “The Limiting Distributions of Certain Statistics.” The Annals of Mathematical Statistics 6 (3): 160–69.

Dowd, Bryan E, William H Greene, and Edward C Norton. 2014. “Computation of Standard Errors.” Health Services Research 49 (2): 731–50.

Genz, Alan, and Frank Bretz. 2009. Computation of Multivariate Normal and T Probabilities. Lecture Notes in Statistics. Heidelberg: Springer-Verlag.

Genz, Alan, Frank Bretz, Tetsuhisa Miwa, Xuefei Mi, Friedrich Leisch, Fabian Scheipl, and Torsten Hothorn. 2019. mvtnorm: Multivariate Normal and T Distributions. https://CRAN.R-project.org/package=mvtnorm.

Halekoh, Ulrich, Søren Højsgaard, and Jun Yan. 2006. “The R Package Geepack for Generalized Estimating Equations.” Journal of Statistical Software 15/2: 1–11.

Hernán, Miguel A, and James M Robins. 2006. “Estimating Causal Effects from Epidemiological Data.” Journal of Epidemiology & Community Health 60 (7): 578–86.

Liang, Kung-Yee, and Scott L Zeger. 1986. “Longitudinal Data Analysis Using Generalized Linear Models.” Biometrika 73 (1): 13–22.

R Core Team. 2019. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

RStudio Team. 2016. RStudio: Integrated Development Environment for R. Boston, MA: RStudio, Inc. http://www.rstudio.com/.

White, Halbert. 1980. “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity.” Econometrica: Journal of the Econometric Society, 817–38.

Wickham, Hadley. 2007. “Reshaping Data with the reshape Package.” Journal of Statistical Software 21 (12): 1–20. http://www.jstatsoft.org/v21/i12/.

Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller. 2019. Dplyr: A Grammar of Data Manipulation. https://CRAN.R-project.org/package=dplyr.

Xie, Yihui. 2018. Bookdown: Authoring Books and Technical Documents with R Markdown. https://github.com/rstudio/bookdown.

———. 2019. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://yihui.name/knitr/.

Xie, Yihui, Joe Cheng, and Xianying Tan. 2019. DT: A Wrapper of the Javascript Library ’Datatables’. https://CRAN.R-project.org/package=DT.

Yan, Jun. 2002. “Geepack: Yet Another Package for Generalized Estimating Equations.” R-News 2/3: 12–14.

Yan, Jun, and Jason P. Fine. 2004. “Estimating Equations for Association Structures.” Statistics in Medicine 23: 859–80.

Zhu, Hao. 2019. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.