Construct a formula.

Safely construct a simple Wilkinson notation formula from the outcome (dependent variable) name and vector of input (independent variable) names.

mk_formula(
  outcome,
  variables,
  ...,
  intercept = TRUE,
  outcome_target = NULL,
  outcome_comparator = "==",
  env = baseenv(),
  extra_values = NULL,
  as_character = FALSE
)

Arguments

outcome	character scalar, name of outcome or dependent variable.
variables	character vector, names of input or independent variables.
...	not used, force later arguments to bind by name.
intercept	logical, if TRUE allow an intercept term.
outcome_target	scalar, if not NULL write outcome==outcome_target in formula.
outcome_comparator	one of "==", "!=", ">=", "<=", ">", "<"; only used if outcome_target is not NULL.
env	environment to use in formula (unless extra_values is non-empty, in which case this is the parent environment).
extra_values	if not empty, extra values to be added to a new formula environment containing env.
as_character	if TRUE return formula as a character string.
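
The less common arguments can be sketched as follows. This is a minimal illustration, not part of the original help page: it assumes mk_formula() is supplied by the wrapr package (loaded with library(wrapr)) and that extra_values accepts a named list. Neither is stated above, and no output is shown because the exact printed form is not given here.

```r
library(wrapr)  # assumption: mk_formula() is provided by this package

# Return the formula as a character string instead of a formula object.
f_chr <- mk_formula("mpg", c("cyl", "disp"), as_character = TRUE)

# Request a formula without an intercept term.
f_noint <- mk_formula("mpg", c("cyl", "disp"), intercept = FALSE)

# Assumption: extra_values is a named list; its entries are placed in a new
# formula environment whose parent is env.
f_extra <- mk_formula(
  "mpg", c("cyl", "disp"),
  extra_values = list(w = rep(1, nrow(mtcars)))
)

print(f_chr)
print(f_noint)
print(f_extra)
```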

Value

a formula object (or a character string if as_character is TRUE)

Details

Note: outcome and variables are each intended to be simple variable names or column names (or .). They are not intended to specify interactions, I()-terms, transforms, general expressions, or other complex formula terms.

The effect is essentially the same as reformulate(), but mk_formula() tries to avoid the paste() currently used inside reformulate() by calling update.formula() (which appears to work over terms). Another reasonable way to do this is simply paste(outcome, paste(variables, collapse = " + "), sep = " ~ ").
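
For comparison, the two base R alternatives named above can be sketched with the same inputs as the first example. This uses only base and stats functions; the variable names f_paste and f_reform are illustrative.

```r
outcome <- "mpg"
variables <- c("cyl", "disp")

# The paste() construction from the note above, converted to a formula.
f_paste <- as.formula(
  paste(outcome, paste(variables, collapse = " + "), sep = " ~ ")
)

# stats::reformulate() builds the same simple formula from the term labels.
f_reform <- reformulate(variables, response = outcome)

# Both deparse to the same text for simple variable names.
identical(format(f_paste), format(f_reform))
```

Both constructions capture the calling environment by default, which is the behavior the env argument of mk_formula() is meant to control.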

Care must be taken with later arguments to functions like lm(), whose help states: "All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula." Also note that env defaults to baseenv() to try to minimize reference leaks: the environment captured by the formula ends up stored in the resulting model for lm() and glm(). For behavior closer to as.formula(), set the env argument to parent.frame().
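
The interaction between the formula environment and arguments such as weights can be sketched in base R. Here as.formula() with an explicit env stands in for mk_formula() so the snippet needs no extra packages; fit_with_weights() and local_w are hypothetical names used only for this illustration.

```r
fit_with_weights <- function(use_local_env) {
  # Weights that exist only inside this function.
  local_w <- rep(1, nrow(mtcars))
  # Pick the formula environment: this function's frame, or baseenv().
  env <- if (use_local_env) environment() else baseenv()
  f <- as.formula("mpg ~ cyl + disp", env = env)
  # lm() looks up `weights` in data first, then in environment(f).
  lm(f, data = mtcars, weights = local_w)
}

# Formula environment is the function frame: local_w is found.
fit_local <- fit_with_weights(TRUE)

# Formula environment is baseenv(): local_w is not visible, so lm() errors.
# The error message is captured rather than allowed to stop the script.
fit_base <- tryCatch(
  fit_with_weights(FALSE),
  error = function(e) conditionMessage(e)
)

class(fit_local)
fit_base
```

With a baseenv() formula, values such as weights need to be columns of data or otherwise reachable from the formula environment, which is the use case the extra_values argument describes.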

Examples


f <- mk_formula("mpg", c("cyl", "disp"))
print(f)
#> mpg ~ cyl + disp
#> <environment: base>
(model <- lm(f, mtcars))
#> 
#> Call:
#> lm(formula = f, data = mtcars)
#> 
#> Coefficients:
#> (Intercept)          cyl         disp  
#>    34.66099     -1.58728     -0.02058  
#> 
format(model$terms)
#> [1] "mpg ~ cyl + disp"

f <- mk_formula("cyl", c("wt", "gear"), outcome_target = 8, outcome_comparator = ">=")
print(f)
#> (cyl >= 8) ~ wt + gear
#> <environment: base>
(model <- glm(f, mtcars, family = binomial))
#> 
#> Call:  glm(formula = f, family = binomial, data = mtcars)
#> 
#> Coefficients:
#> (Intercept)           wt         gear  
#>    -33.9388       9.3992       0.5893  
#> 
#> Degrees of Freedom: 31 Total (i.e. Null);  29 Residual
#> Null Deviance:	    43.86 
#> Residual Deviance: 14.21 	AIC: 20.21
format(model$terms)
#> [1] "(cyl >= 8) ~ wt + gear"
