`R/mult_class.R`

`mkCrossFrameMExperiment.Rd`

Please see `vignette("MultiClassVtreat", package = "vtreat")` (https://winvector.github.io/vtreat/articles/MultiClassVtreat.html).

mkCrossFrameMExperiment(d, vars, y_name, ..., weights = c(), minFraction = 0.02, smFactor = 0, rareCount = 0, rareSig = 1, collarProb = 0, codeRestriction = NULL, customCoders = NULL, scale = FALSE, doCollar = FALSE, splitFunction = NULL, ncross = 3, forceSplit = FALSE, catScaling = FALSE, y_dependent_treatments = c("catB"), verbose = FALSE, parallelCluster = NULL, use_parallel = TRUE)

Argument | Description |
---|---|
d | data to learn from. |
vars | character vector of independent variable column names. |
y_name | character, name of outcome column. |
... | not used; declared to force named binding of later arguments. |
weights | optional training weights for each row. |
minFraction | optional minimum frequency a categorical level must have to be converted to an indicator column. |
smFactor | optional smoothing factor for impact coding models. |
rareCount | optional integer, allow levels with this count or below to be pooled into a shared rare-level. Defaults to 0 (off). |
rareSig | optional numeric, suppress levels from pooling at this significance value or greater. The usage above defaults to 1; NULL means off. |
collarProb | fraction of the data (pseudo-probability) at which to collar data if doCollar is set. |
codeRestriction | what types of variables to produce (character array of level codes, NULL means no restriction). |
customCoders | map from code names to custom categorical variable encoding functions (please see https://github.com/WinVector/vtreat/blob/master/extras/CustomLevelCoders.md). |
scale | optional, if TRUE replace numeric variables with regression ("move to outcome-scale"). |
doCollar | optional, if TRUE collar numeric variables by cutting off after a tail-probability specified by collarProb during treatment design. |
splitFunction | (optional) see vtreat::buildEvalSets. |
ncross | optional scalar >= 2, number of cross-validation rounds to design. |
forceSplit | logical, if TRUE force cross-validated significance calculations on all variables. |
catScaling | optional, if TRUE use glm() linkspace, if FALSE use lm() for scaling. |
y_dependent_treatments | character, what treatment types to build per outcome level. |
verbose | if TRUE print progress. |
parallelCluster | (optional) a cluster object created by package parallel or package snow. |
use_parallel | logical, if TRUE use parallel methods. |
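
To make two of the thresholds above concrete, the following base-R sketch mimics what `minFraction` and `collarProb` describe. It is an illustration under assumptions, not vtreat's internal code: the helper names `levels_meeting_fraction` and `collar_values` are hypothetical, and using `quantile()` for the collar bounds is only an assumed reading of "tail-probability".

```r
# Hypothetical helper (not part of vtreat): which levels of a categorical
# vector occur in at least `min_fraction` of the rows.
levels_meeting_fraction <- function(x, min_fraction = 0.02) {
  freq <- table(x) / length(x)
  names(freq)[as.vector(freq) >= min_fraction]
}

# Hypothetical helper (not part of vtreat): clip a numeric vector to its
# [collar_prob, 1 - collar_prob] quantile range.
collar_values <- function(x, collar_prob = 0.05) {
  bounds <- quantile(x,
                     probs = c(collar_prob, 1 - collar_prob),
                     na.rm = TRUE, names = FALSE)
  pmin(pmax(x, bounds[1]), bounds[2])
}

# Level "c" appears in 1 of 100 rows, below the 0.02 threshold.
x_cat <- c(rep("a", 60), rep("b", 39), "c")
kept_levels <- levels_meeting_fraction(x_cat, min_fraction = 0.02)
print(kept_levels)

# The extreme value 10000 is pulled back to the upper collar bound.
x_num <- c(1:99, 10000)
collared <- collar_values(x_num, collar_prob = 0.05)
print(range(collared))
```

In this sketch a level below the threshold is simply dropped from the returned set; how vtreat itself handles such levels (for example pooling via `rareCount`) is governed by the arguments in the table above.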

list(cross_frame, treatments_0, treatments_m)
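
A minimal usage sketch follows, assuming the vtreat package is installed. The data frame, its column names (`x_num`, `x_cat`, `y`), and the seed are made up for illustration; only the first three arguments and `ncross` are taken from the usage above. The example prints the names of the returned list rather than assuming them.

```r
# Runs only when vtreat is available; otherwise the block is skipped.
if (requireNamespace("vtreat", quietly = TRUE)) {
  set.seed(2024)  # cross-validation splits are random

  # Synthetic three-class data (illustrative only).
  n <- 200
  d <- data.frame(
    x_num = rnorm(n),
    x_cat = sample(c("a", "b", "c"), n, replace = TRUE),
    stringsAsFactors = FALSE
  )
  d$y <- ifelse(d$x_num > 0.5, "high",
                ifelse(d$x_cat == "a", "mid", "low"))

  res <- vtreat::mkCrossFrameMExperiment(
    d,
    vars = c("x_num", "x_cat"),
    y_name = "y",
    ncross = 3
  )

  # Inspect the structure of the result instead of assuming element names.
  print(names(res))
}
```

Arguments after `y_name` are passed by name here because the `...` placeholder in the usage forces named binding of everything that follows it.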