
Simulate from continuous outcome model given covariates
Source: R/outcome_models.R
outcome_continuous.Rd
Simulate from a continuous outcome model with mean $$g(\text{par}^\top X)$$ where \(X\) is the design matrix specified by the formula, and \(g\) is the link function specified by the family argument.
Arguments
- data
(data.table) Covariate data, usually the output of the covariate model of a Trial object.
- mean
(formula, function) Either a formula specifying the design from 'data' or a function that maps data to the conditional mean value on the link scale (see examples). If NULL, all main effects of the covariates will be used, except columns that are defined via the remove argument.
- par
(numeric) Regression coefficients (default zero). Can be given as a named list corresponding to the column names of model.matrix.
- sd
(numeric) standard deviation of Gaussian measurement error
- het
Introduce variance heterogeneity by adding a residual term \(het \cdot \mu_x \cdot e\), where \(\mu_x\) is the mean given covariates and \(e\) is an independent standard normal variable. This term is in addition to the measurement error introduced by the sd argument.
- outcome.name
Name of outcome variable ("y")
- remove
Variables that will be removed from input data (if formula is not specified).
- family
Exponential family (default gaussian(identity)).
- ...
Additional arguments passed to the mean function (see examples).
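The roles of par, sd, and het can be illustrated without the package. The following is a minimal base-R sketch, assuming an identity link and assuming that the two noise terms are simply added to the conditional mean. The helper name `sim_continuous_sketch` and its internals are hypothetical and are not the package's implementation.

```r
# Hypothetical helper (not part of the package): mimics the documented
# roles of `par`, `sd` and `het` under an identity link.
sim_continuous_sketch <- function(data, mean, par = NULL, sd = 1, het = 0) {
  X <- model.matrix(mean, data = data)
  # Coefficients default to zero; named entries of `par` are matched
  # against the column names of the design matrix.
  beta <- setNames(numeric(ncol(X)), colnames(X))
  if (!is.null(par)) {
    if (is.null(names(par))) {
      beta[] <- par
    } else {
      beta[names(par)] <- unlist(par)
    }
  }
  mu <- drop(X %*% beta) # conditional mean given covariates
  n <- nrow(X)
  # Gaussian measurement error plus the residual term het * mu * e
  y <- mu + sd * rnorm(n) + het * mu * rnorm(n)
  cbind(data, y = y)
}

set.seed(1)
n <- 1e4
d <- data.frame(a = rbinom(n, 1, 0.5), x = rnorm(n))
# Only the coefficient of `a` is supplied; the others stay at zero
sim <- sim_continuous_sketch(d, ~ 1 + a + x, par = c(a = 0.5), het = 0.5)
coef(lm(y ~ a + x, data = sim))
```

Under these assumptions the fitted coefficients should land near 0, 0.5, and 0, while the residual variance is larger for observations with a larger conditional mean.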
Examples
trial <- Trial$new(
  covariates = \(n) data.frame(a = rbinom(n, 1, 0.5), x = rnorm(n)),
  outcome = outcome_continuous
)
est <- function(data) glm(y ~ a + x, data = data)
trial$simulate(1e4, mean = ~ 1 + a + x, par = c(1, 0.5, 2)) |> est()
#>
#> Call: glm(formula = y ~ a + x, data = data)
#>
#> Coefficients:
#> (Intercept) a x
#> 0.9991 0.5112 1.9935
#>
#> Degrees of Freedom: 9999 Total (i.e. Null); 9997 Residual
#> Null Deviance: 50740
#> Residual Deviance: 9974 AIC: 28360
# default behavior is to set all regression coefficients to 0
trial$simulate(1e4, mean = ~ 1 + a + x) |> est()
#>
#> Call: glm(formula = y ~ a + x, data = data)
#>
#> Coefficients:
#> (Intercept) a x
#> 0.020756 -0.035260 0.004824
#>
#> Degrees of Freedom: 9999 Total (i.e. Null); 9997 Residual
#> Null Deviance: 9879
#> Residual Deviance: 9875 AIC: 28260
# intercept defaults to 0 and regression coef for a takes the provided value
trial$simulate(1e4, mean = ~ 1 + a, par = c(a = 0.5)) |> est()
#>
#> Call: glm(formula = y ~ a + x, data = data)
#>
#> Coefficients:
#> (Intercept) a x
#> 0.018315 0.481916 0.002943
#>
#> Degrees of Freedom: 9999 Total (i.e. Null); 9997 Residual
#> Null Deviance: 10560
#> Residual Deviance: 9977 AIC: 28360
# trial$simulate(1e4, mean = ~ 1 + a, par = c("(Intercept)" = 0.5)) |> est()
# define mean model that directly works on whole covariate data, incl id and
# num columns
trial$simulate(1e4, mean = \(x) with(x, -1 + a * 2 + x * -3)) |>
  est()
#>
#> Call: glm(formula = y ~ a + x, data = data)
#>
#> Coefficients:
#> (Intercept) a x
#> -0.9779 1.9655 -2.9914
#>
#> Degrees of Freedom: 9999 Total (i.e. Null); 9997 Residual
#> Null Deviance: 110600
#> Residual Deviance: 9981 AIC: 28370
# par argument is not passed on to mean function
trial$simulate(1e4,
  mean = \(x, reg.par) with(x, reg.par[1] + reg.par[2] * a),
  reg.par = c(1, 5)
) |> est()
#>
#> Call: glm(formula = y ~ a + x, data = data)
#>
#> Coefficients:
#> (Intercept) a x
#> 1.0016 4.9941 0.0163
#>
#> Degrees of Freedom: 9999 Total (i.e. Null); 9997 Residual
#> Null Deviance: 72210
#> Residual Deviance: 9855 AIC: 28240
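The last two examples rely on mean being a function rather than a formula. One way to picture the difference between the two forms is the base-R sketch below. `linear_predictor_sketch` is a hypothetical helper written for illustration, and the dispatch it shows is an assumption, not the package's actual code.

```r
# Hypothetical helper (not part of the package): returns the conditional
# mean on the link scale for either form of the `mean` argument.
linear_predictor_sketch <- function(data, mean, par = NULL, ...) {
  if (is.function(mean)) {
    # `par` is not forwarded; only arguments in `...` reach the function
    return(mean(data, ...))
  }
  X <- model.matrix(mean, data = data)
  # Coefficients default to zero; this sketch handles named `par` only
  beta <- setNames(numeric(ncol(X)), colnames(X))
  if (!is.null(par)) beta[names(par)] <- par
  drop(X %*% beta)
}

d <- data.frame(a = c(0, 1, 1), x = c(-1, 0, 2))

# Function form, with an extra argument supplied through `...`
lp_fun <- linear_predictor_sketch(
  d,
  mean = \(x, reg.par) with(x, reg.par[1] + reg.par[2] * a),
  reg.par = c(1, 5)
)

# Formula form with one named coefficient; the intercept stays at zero
lp_form <- linear_predictor_sketch(d, mean = ~ 1 + a, par = c(a = 0.5))
```

In this sketch the function form ignores par entirely, which mirrors the comment in the final example above.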