Safely construct a simple Wilkinson notation formula from the outcome (dependent variable) name and vector of input (independent variable) names.

mk_formula( outcome, variables, ..., intercept = TRUE, outcome_target = NULL, outcome_comparator = "==", env = baseenv(), extra_values = NULL )

| Argument | Description |
|---|---|
| outcome | character scalar, name of the outcome or dependent variable. |
| variables | character vector, names of the input or independent variables. |
| ... | not used; forces later arguments to bind by name. |
| intercept | logical, if TRUE allow an intercept term. |
| outcome_target | scalar, if not NULL write outcome==outcome_target (or the comparison chosen by outcome_comparator) in the formula. |
| outcome_comparator | one of "==", "!=", ">=", "<=", ">", "<"; only used if outcome_target is not NULL. |
| env | environment to use in the formula (unless extra_values is non-empty, in which case this is the parent environment). |
| extra_values | if not empty, extra values to be added to a new formula environment whose parent is env. |

a formula object

Note: outcome and variables are each intended to be simple variable names or column names (or .). They are not intended to specify interactions, I()-terms, transforms, general expressions, or other complex formula terms. The effect is essentially the same as `reformulate`, but this function tries to avoid the `paste` currently used in `reformulate` by calling `update.formula` (which appears to work over terms). Another reasonable way to do this is just `paste(outcome, paste(variables, collapse = " + "), sep = " ~ ")`.
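
For comparison, the two base R alternatives mentioned in the note can be sketched as follows. This uses only base R and the built-in `mtcars` column names. The non-syntactic name `miles per gallon` is a hypothetical illustration of one plausible reason to avoid building formulas by pasting text; it is not taken from the original documentation.

```r
# Base R only; outcome and variable names are columns of the built-in mtcars data.
outcome <- "mpg"
variables <- c("cyl", "disp")

# reformulate() builds a formula from a character vector of term labels.
f_reformulate <- reformulate(variables, response = outcome)

# The paste() approach yields a character string, which as.formula() then parses.
f_string <- paste(outcome, paste(variables, collapse = " + "), sep = " ~ ")
f_paste <- as.formula(f_string)

print(f_reformulate)
print(f_paste)

# Hypothetical non-syntactic column name: the pasted text is not valid R,
# so parsing it raises an error, which is caught here.
bad <- tryCatch(
  as.formula(paste("miles per gallon", "cyl", sep = " ~ ")),
  error = function(e) "parse error"
)
print(bad)
```

Both constructions give the same formula for simple names; the pasted version depends on the names surviving a round trip through the R parser.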

Care must be taken with later arguments to functions like `lm()`, whose help states: "All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula."
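
A minimal base R sketch of why this matters when the formula environment is `baseenv()`. The function `fit_weighted` and the variable `local_wts` are hypothetical names introduced for illustration. A weights vector that exists only as a local variable is not visible from the formula's environment, so the fit fails; supplying the weights as a column of the data is one way around this.

```r
# Hypothetical helper: the weights live only in this function's frame.
fit_weighted <- function(f) {
  local_wts <- rep(1, nrow(mtcars))
  lm(f, data = mtcars, weights = local_wts)
}

# A formula whose environment is baseenv(), mimicking the mk_formula() default.
f_base <- mpg ~ cyl
environment(f_base) <- baseenv()

# weights is looked up in data, then in environment(f_base); local_wts is in neither.
result <- tryCatch(
  fit_weighted(f_base),
  error = function(e) conditionMessage(e)
)
print(result)

# One workaround: carry the weights as a column of the data frame.
d <- mtcars
d$local_wts <- 1
ok <- lm(f_base, data = d, weights = local_wts)
print(class(ok))
```
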
Also note that `env` defaults to `baseenv()` to try to minimize reference leaks produced by the environment captured by the formula ending up stored in the resulting model for `lm()` and `glm()`. For behavior closer to `as.formula()`, please set the `env` argument to `parent.frame()`.
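
The kind of reference leak described here can be illustrated in base R. This is a sketch only: the function names and the `big` object are hypothetical. A formula created with `as.formula()` inside a function keeps that function's frame alive, including any large local objects, whereas a formula whose environment is `baseenv()` does not.

```r
# Hypothetical function: as.formula() defaults to the caller's frame as the
# formula environment, so the returned formula keeps `big` reachable.
make_formula_leaky <- function() {
  big <- numeric(1e6)
  as.formula("y ~ x")
}

# Same work, but the formula environment is set to baseenv() explicitly.
make_formula_clean <- function() {
  big <- numeric(1e6)
  as.formula("y ~ x", env = baseenv())
}

f_leaky <- make_formula_leaky()
f_clean <- make_formula_clean()

# The leaky formula's environment still holds the large local object.
print("big" %in% ls(environment(f_leaky)))

# The clean formula's environment is baseenv().
print(identical(environment(f_clean), baseenv()))
```

A model fitted with `f_leaky` would store the formula, and with it the captured frame, which is the situation the `baseenv()` default is trying to avoid.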

    #> mpg ~ cyl + disp
    #> <environment: base>
    #>
    #> Call:
    #> lm(formula = f, data = mtcars)
    #>
    #> Coefficients:
    #> (Intercept)          cyl         disp
    #>    34.66099     -1.58728     -0.02058
    #>
    #> [1] "mpg ~ cyl + disp"
    #> (cyl >= 8) ~ wt + gear
    #> <environment: base>
    #>
    #> Call:  glm(formula = f, family = binomial, data = mtcars)
    #>
    #> Coefficients:
    #> (Intercept)           wt         gear
    #>    -33.9388       9.3992       0.5893
    #>
    #> Degrees of Freedom: 31 Total (i.e. Null);  29 Residual
    #> Null Deviance:      43.86
    #> Residual Deviance: 14.21    AIC: 20.21
    #> [1] "(cyl >= 8) ~ wt + gear"
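
The input lines for this output are not shown. The following is a sketch of calls that would be consistent with it, inferred from the printed formulas and the recorded `Call:` lines. It assumes `mk_formula()` is available from the wrapr package (the package name is an assumption, since the text above does not name it) and that the final line of each group came from `format(model$terms)`.

```r
# Assumption: mk_formula() is provided by the wrapr package, which must be installed.
library(wrapr)

# Linear model: mpg as a function of cyl and disp.
f <- mk_formula("mpg", c("cyl", "disp"))
print(f)
model <- lm(f, data = mtcars)
print(model)
print(format(model$terms))

# Logistic model: the outcome is the comparison cyl >= 8.
f <- mk_formula("cyl", c("wt", "gear"),
                outcome_target = 8,
                outcome_comparator = ">=")
print(f)
model <- glm(f, family = binomial, data = mtcars)
print(model)
print(format(model$terms))
```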