Please see https://win-vector.com/2014/05/30/trimming-the-fat-from-glm-models-in-r/ for discussion.

    clean_fit_glm(
      outcome,
      variables,
      data,
      ...,
      family,
      intercept = TRUE,
      outcome_target = NULL,
      outcome_comparator = "==",
      weights = NULL,
      env = baseenv()
    )

Argument | Description |
---|---|
outcome | character, name of the outcome column. |
variables | character, names of the variable columns. |
data | data.frame, training data. |
... | not used; forces later arguments to be passed by name. |
family | passed to stats::glm(). |
intercept | logical, if TRUE allow an intercept term. |
outcome_target | scalar, if not NULL write outcome==outcome_target in the formula. |
outcome_comparator | one of "==", "!=", ">=", "<=", ">", "<"; only used if outcome_target is not NULL. |
weights | passed to stats::glm(). |
env | environment to work in. |
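
To make the interaction of outcome_target, outcome_comparator, and intercept concrete, here is a minimal base-R sketch of the kind of formula these arguments describe. The helper `sketch_formula()` and the data frame `d` are hypothetical names introduced for illustration. This is not the package's implementation, and the way it suppresses the intercept (a leading `0 +`) is an assumption.

```r
# Hypothetical helper: builds a formula of the form
#   (outcome <comparator> outcome_target) ~ var1 + var2 + ...
# This is an illustration only, not the code used by clean_fit_glm().
sketch_formula <- function(outcome, variables,
                           intercept = TRUE,
                           outcome_target = NULL,
                           outcome_comparator = "==",
                           env = baseenv()) {
  lhs <- outcome
  if (!is.null(outcome_target)) {
    # deparse() quotes character targets and leaves numbers as-is
    lhs <- paste0("(", outcome, " ", outcome_comparator, " ",
                  deparse(outcome_target), ")")
  }
  rhs <- paste(variables, collapse = " + ")
  if (!intercept) {
    # Assumption: drop the intercept with a leading "0 +"
    rhs <- paste("0 +", rhs)
  }
  stats::as.formula(paste(lhs, "~", rhs), env = env)
}

# Small made-up data set with no perfect separation
d <- data.frame(
  x2 = c(0, 1, 0, 1, 0, 1, 0, 1),
  y  = c(1, 2, 3, 4, 4, 3, 1, 2)
)

# Model the event y >= 3 as a function of x2
f <- sketch_formula("y", "x2",
                    outcome_target = 3,
                    outcome_comparator = ">=")
print(f)

fit <- stats::glm(f, data = d, family = stats::binomial)
```

Building the comparison into the left-hand side of the formula means the data frame does not need an extra precomputed logical outcome column.
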
Returns list(model = model, summary = summary).

    mk_data_example <- function(k) {
      data.frame(
        x1 = rep(c("a", "a", "b", "b"), k),
        x2 = rep(c(0, 0, 0, 1), k),
        y = rep(1:4, k),
        yC = rep(c(FALSE, TRUE, TRUE, TRUE), k),
        stringsAsFactors = FALSE)
    }

    # Fit on 4 rows and measure the serialized model size
    res_glm <- clean_fit_glm("yC", c("x1", "x2"),
                             mk_data_example(1),
                             family = binomial)
    length(serialize(res_glm$model, NULL))
    #> [1] 33777

    # Fit on 40000 rows: the serialized size is unchanged
    res_glm <- clean_fit_glm("yC", c("x1", "x2"),
                             mk_data_example(10000),
                             family = binomial)
    length(serialize(res_glm$model, NULL))
    #> [1] 33777

    # The cleaned model can still be used for prediction
    predict(res_glm$model,
            newdata = mk_data_example(1),
            type = "response")
    #>   1   2   3   4 
    #> 0.5 0.5 1.0 1.0
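
For contrast with the constant sizes above, the following sketch measures how the serialized size of an ordinary stats::glm() fit changes with the number of training rows. It uses only base R. The helpers `mk_plain_example()` and `model_bytes()` are hypothetical names introduced here, and the data are made up so that the fit has no perfect separation. Exact byte counts depend on the R version, so none are shown.

```r
# Hypothetical data generator: 4 * k rows, no perfect separation
mk_plain_example <- function(k) {
  data.frame(
    x  = rep(c(0, 1, 0, 1), k),
    yC = rep(c(FALSE, TRUE, TRUE, FALSE), k)
  )
}

# Fit a plain glm and report the size of the serialized model in bytes
model_bytes <- function(k) {
  fit <- stats::glm(yC ~ x,
                    data = mk_plain_example(k),
                    family = stats::binomial)
  length(serialize(fit, NULL))
}

size_small <- model_bytes(1)     # 4 training rows
size_large <- model_bytes(1000)  # 4000 training rows

# The plain fit carries per-row structures (residuals, fitted values,
# the model frame, and so on), so its size grows with the training data.
print(c(small = size_small, large = size_large))
```

A plain fit keeps several vectors with one entry per training row, which is the overhead discussed in the article linked at the top of this page.
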