Skip to contents

Simulate from count model with intensity $$\lambda = \text{exposure-time}\exp(\text{par}^\top X)$$ where \(X\) is the design matrix specified by the formula

Usage

outcome_count(
  data,
  mean = NULL,
  par = NULL,
  outcome.name = "y",
  exposure = 1,
  remove = c("id", "num"),
  zero.inflation = NULL,
  overdispersion = NULL,
  ...
)

Arguments

data

(data.table) Covariate data, usually the output of the covariate model of a Trial object.

mean

(formula, function) Either a formula specifying the design from 'data' or a function that maps data to the conditional mean value on the link scale (see examples). If NULL all main-effects of the covariates will be used, except columns that are defined via the remove argument.

par

(numeric) Regression coefficients (default zero). Can be given as a named list corresponding to the column names of model.matrix

outcome.name

Name of outcome variable ("y")

exposure

Exposure times. Either a scalar, vector or function.

remove

Variables that will be removed from input data (if formula is not specified).

zero.inflation

vector of probabilities or a function of the covariates 'x' including an extra column 'rate' with the rate parameter.

overdispersion

variance of gamma-frailty either given as a numeric vector or a function of the covariates 'x' with an extra column 'rate' holding the rate parameter 'rate'

...

Additional arguments passed to mean and exposure function

Value

data.table

Examples

covariates <- function(n) data.frame(a = rbinom(n, 1, 0.5), x = rnorm(n))
trial <- Trial$new(covariates = covariates, outcome = outcome_count)
trial$args_model(
  mean = ~ 1 + a + x,
  par = c(2.5, 0.65, 0),
  overdispersion = 1 / 2,
  exposure = 2 # identical exposure time for all subjects
)
est <- function(data) {
  glm(y ~ a + x + offset(log(exposure)), data, family = poisson())
}
trial$simulate(1e4) |> est()
#> 
#> Call:  glm(formula = y ~ a + x + offset(log(exposure)), family = poisson(), 
#>     data = data)
#> 
#> Coefficients:
#> (Intercept)            a            x  
#>     2.50752      0.64580      0.01017  
#> 
#> Degrees of Freedom: 9999 Total (i.e. Null);  9997 Residual
#> Null Deviance:	    212500 
#> Residual Deviance: 177100 	AIC: 227700

# intercept + coef for x default to 0 and regression coef for a takes
# the provided value
trial$simulate(1e4, par = c(a = 0.65)) |> est()
#> 
#> Call:  glm(formula = y ~ a + x + offset(log(exposure)), family = poisson(), 
#>     data = data)
#> 
#> Coefficients:
#> (Intercept)            a            x  
#>   -0.002571     0.662639    -0.004677  
#> 
#> Degrees of Freedom: 9999 Total (i.e. Null);  9997 Residual
#> Null Deviance:	    26780 
#> Residual Deviance: 23730 	AIC: 47740
# trial$simulate(1e4, mean = ~ 1 + a, par = c("(Intercept)" = 1))

# define mean model that directly works on whole covariate data, incl id and
# num columns
trial$simulate(1e4, mean = \(x) with(x, exp(1 + 0.5 * a))) |>
  est()
#> 
#> Call:  glm(formula = y ~ a + x + offset(log(exposure)), family = poisson(), 
#>     data = data)
#> 
#> Coefficients:
#> (Intercept)            a            x  
#>     1.00169      0.49170     -0.01293  
#> 
#> Degrees of Freedom: 9999 Total (i.e. Null);  9997 Residual
#> Null Deviance:	    48370 
#> Residual Deviance: 44140 	AIC: 78170

# treatment-dependent exposure times
trial$simulate(1e4, exposure = function(dd) 1 - 0.5 * dd$a) |>
  head()
#>       id     a           x   num     y exposure
#>    <num> <int>       <num> <num> <num>    <num>
#> 1:     1     0  0.14464836     0     6      1.0
#> 2:     2     1  1.58655169     0     1      0.5
#> 3:     3     0 -1.09227229     0     9      1.0
#> 4:     4     1 -0.08241159     0    40      0.5
#> 5:     5     0 -2.37101455     0     8      1.0
#> 6:     6     1  0.49851502     0    10      0.5