..
Worked examples for talk: Producing and evaluating machine learning models.

Lecture slides: CV.pdf

The files below are telegraphic examples used to generate the graphs and numbers in the presentation. One can in principle work through them using R (https://cran.r-project.org), RStudio (https://www.rstudio.com), and the referenced packages. They are not complete tutorials; they exist to produce the numbers shown in the included presentation slides.

For a free video lecture on gradient boosting (one of the methods used), please see http://www.win-vector.com/blog/2015/11/free-gradient-boosting-lecture/

For a description of the vtreat package (used for data preparation), please see http://www.win-vector.com/blog/2016/06/a-demonstration-of-vtreat-data-preparation/

Files:

- CV.pdf : lecture slides.
- project.Rproj : RStudio project file (see https://www.rstudio.com).
- installH2O.R : instructions to install the h2o deep learning kit.
- kdd2009.Rmd : R knitr/r-markdown neural net fitting/scoring.
- kdd2009.html : HTML rendering of the above file.
- KDD2009vtreat.Rmd : R knitr/r-markdown demonstration fitting/scoring.
- KDD2009vtreat.html : HTML rendering of the above file.
- kdd2009tree.Rmd : R knitr/r-markdown decision tree fitting/scoring.
- kdd2009tree.html : HTML rendering of the above file.
- kdd2009xgboost.Rmd : R knitr/r-markdown demonstration fitting/scoring.
- kdd2009xgboost.html : HTML rendering of the above file.
- orange_small_train.data.gz : example data.
- orange_small_train_churn.labels.txt : example data.