Simulate data from a GLM. Given a formula, variables and a family, the function evaluates the formula on the provided variables to obtain a linear predictor, applies the inverse link of the family to obtain the GLM modelled mean, mu, and then simulates the response with this mean from the generating function of the chosen family.
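
The three steps described above (linear predictor, inverse link, simulation) can be sketched in base R. This is a minimal illustration under the assumption of a gaussian family with sd = 1, not the package's implementation; the names eta, mu and sim are illustrative only.

```r
# Minimal base-R sketch of the steps described above.
# NOT the implementation of glm_data; `eta`, `mu` and `sim` are illustrative names.
set.seed(1)
fam <- gaussian()        # family object carrying the link and inverse link
x1  <- rnorm(10)         # a single covariate

eta <- 1 + 2 * x1        # linear predictor, as in the formula Y ~ 1+2*x1
mu  <- fam$linkinv(eta)  # inverse link gives the modelled mean (identity here)

# Simulate the response around mu with the family's generating function,
# assumed here to be rnorm with sd = 1 for the gaussian family.
Y <- rnorm(length(mu), mean = mu, sd = 1)

sim <- data.frame(Y = Y, x1 = x1)
```

For a family with a non-identity link, only the linkinv step changes the mean; the formula is still evaluated on the scale of the linear predictor.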

Usage

glm_data(formula, ..., family = gaussian(), family_args = list(sd = 1))

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’ in the glm documentation.

...

a data.frame with columns corresponding to the variables used in formula, a named list of those variables, or the variables provided individually as named arguments

family

the family of the response. This can be a character string naming a family function, a family function, or the result of a call to a family function.

family_args

a named list of argument values passed to the r<family_name> function that the family uses to simulate the data (for example, sd for the gaussian family)
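
The three accepted forms of family, and the way a named list such as family_args can be forwarded to a simulation function, can be illustrated as follows. This is a sketch only: resolve_family is a hypothetical helper that mirrors the convention used by stats::glm, and forwarding family_args to rnorm through do.call is an assumption about how the gaussian case behaves.

```r
# Hypothetical helper (not part of the documented API): turn any of the three
# accepted forms of `family` into a family object, as stats::glm does.
resolve_family <- function(family) {
  if (is.character(family)) family <- get(family, mode = "function")
  if (is.function(family)) family <- family()
  stopifnot(inherits(family, "family"))
  family
}

f1 <- resolve_family("poisson")   # character string naming a family function
f2 <- resolve_family(poisson)     # the family function itself
f3 <- resolve_family(poisson())   # result of a call to a family function

# Assumed forwarding of family_args for the gaussian family: the named list
# is appended to the arguments of rnorm.
family_args <- list(sd = sqrt(3))
mu <- c(0, 1, 2)
y  <- do.call(rnorm, c(list(n = length(mu), mean = mu), family_args))
```

All three calls give an equivalent poisson family object, so the form can be chosen for convenience.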

Value

a data.frame containing the simulated response and the provided variables

Examples

# Generate a gaussian response from a single covariate
glm_data(Y ~ 1+2*x1,
                x1 = rnorm(10))
#>            Y          x1
#> 1  -1.449147 -0.89480241
#> 2   2.865960  0.90426912
#> 3  -1.648712  0.07964921
#> 4  -2.429914 -1.25882722
#> 5   2.268991  1.02568511
#> 6  -1.125662 -0.73077860
#> 7   1.246019 -0.19014551
#> 8   1.550481  0.52886469
#> 9   2.370782  0.55021053
#> 10  2.566846  0.54968434

# Generate a gaussian response from a single covariate with non-linear
# effects. Specify that the response should have standard deviation sqrt(3)
glm_data(Y ~ 1+2*abs(sin(x1)),
                x1 = runif(10, min = -2, max = 2),
                family_args = list(sd = sqrt(3)))
#>            Y         x1
#> 1  5.8875239  1.0618623
#> 2  1.8274380  0.5078665
#> 3  4.9617796  0.9213538
#> 4  0.5141856 -0.3051824
#> 5  2.2240450 -0.9266769
#> 6  1.5760863  1.0761921
#> 7  0.9143955  0.3493867
#> 8  4.0499201  1.5942710
#> 9  4.0731122  1.4812982
#> 10 0.5794237 -0.4027751

# Generate a negative binomial response
glm_data(Y ~ 1+2*x1-x2,
                x1 = rnorm(10),
                x2 = rgamma(10, shape = 2),
                family = MASS::negative.binomial(2))
#>     Y         x1        x2
#> 1   0 -0.7836391 2.3050987
#> 2   1 -0.9531239 1.3214499
#> 3   0  1.7927561 3.7469426
#> 4   0  0.3489767 1.8461783
#> 5   2  0.2591038 2.0354566
#> 6   0 -0.8059519 0.6438993
#> 7   4  0.1056647 0.7646888
#> 8   2 -0.3335997 0.9691220
#> 9  20  1.6418480 0.6479179
#> 10  0 -0.6439059 0.5980742

# Provide variables as a list/data.frame
glm_data(resp ~ 1+2*x1-x2,
                data.frame(
                  x1 = rnorm(10),
                  x2 = rgamma(10, shape = 2)
                ),
                family = MASS::negative.binomial(2))
#>    resp          x1        x2
#> 1     0 -0.05367151 2.0555178
#> 2     0 -0.56352463 4.8979183
#> 3     0 -0.74390896 0.9440099
#> 4     0 -0.10904165 0.3404743
#> 5     0 -0.56082923 2.1408569
#> 6     0  0.18800155 0.6841673
#> 7     1  0.74885094 1.8483538
#> 8     0 -1.91653832 0.6336780
#> 9     2  0.23609585 0.2305099
#> 10    2  0.62895342 1.4685068
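
To show what the negative binomial examples above compute, the same steps can be written out in base R. This sketch assumes the log link (the default link of MASS::negative.binomial) and assumes that the theta value of 2 corresponds to the size argument of rnbinom; it does not call glm_data, and nb_sim is an illustrative name.

```r
# Base-R sketch of the negative binomial example; NOT the package's code.
set.seed(42)
x1 <- rnorm(10)
x2 <- rgamma(10, shape = 2)

eta <- 1 + 2 * x1 - x2   # linear predictor from Y ~ 1+2*x1-x2
mu  <- exp(eta)          # inverse of the log link

# Assumption: theta = 2 in negative.binomial(2) maps to `size` in rnbinom.
Y <- rnbinom(length(mu), size = 2, mu = mu)

nb_sim <- data.frame(Y, x1, x2)
```

Because the mean is exp(eta), large values of x1 combined with small values of x2 produce much larger counts, which is consistent with the occasional large response in the output above.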