`R/PRTPlot.R`

`PRTPlot.Rd`

Plot classifier performance metrics as a function of threshold.

    PRTPlot(
      frame,
      predVar,
      truthVar,
      truthTarget,
      title,
      ...,
      plotvars = c("precision", "recall"),
      thresholdrange = c(-Inf, Inf),
      linecolor = "black"
    )

Argument | Description |
---|---|
frame | data frame to get values from |
predVar | name of the column of predicted scores |
truthVar | name of the column of actual outcomes in frame |
truthTarget | value we consider to be positive |
title | title to place on plot |
... | no unnamed argument, added to force named binding of later arguments. |
plotvars | variables to plot, must be at least one of the measures listed below. Defaults to c("precision", "recall") |
thresholdrange | range of thresholds to plot. |
linecolor | line color for the plot |

For a classifier, precision is the fraction of predicted positives that are true positives; recall is the fraction of actual positives that the classifier finds; and enrichment is the ratio of classifier precision to the average rate of positives (the prevalence). Plotting precision and recall, or enrichment and recall, as functions of the classifier score threshold helps identify a threshold that achieves an acceptable tradeoff between precision and recall, or between enrichment and recall.
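The definitions above can be sketched in a few lines of base R. This is an illustration only, not the code `PRTPlot` uses internally: the helper `metrics_at`, the toy data, and the convention that a score at or above the threshold counts as a predicted positive are all assumptions made for the example.

```r
# Toy data, invented for illustration: six scores and their true outcomes.
scores <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.1)
truth  <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)

# Hypothetical helper (not part of WVPlots): metrics at a single threshold.
# Assumption: score >= threshold is treated as a predicted positive.
metrics_at <- function(scores, truth, threshold) {
  predicted  <- scores >= threshold
  true_pos   <- sum(predicted & truth)
  precision  <- true_pos / sum(predicted)  # predicted positives that are correct
  recall     <- true_pos / sum(truth)      # actual positives that were found
  prevalence <- mean(truth)                # average rate of positives
  c(precision  = precision,
    recall     = recall,
    enrichment = precision / prevalence)
}

m <- metrics_at(scores, truth, threshold = 0.5)
print(m)
```

With these toy values, three rows score at or above 0.5 and two of them are actual positives, so precision and recall are both 2/3; prevalence is 1/2, which gives an enrichment of 4/3.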

In addition to precision and recall, `PRTPlot` can plot a number of other metrics. The available metrics are:

precision: fraction of predicted positives that are true positives

recall: fraction of actual positives that were predicted to be positive

enrichment: ratio of classifier precision to prevalence of positive class

sensitivity: the same as recall (also known as the true positive rate)

specificity: fraction of actual negatives that were predicted to be negative (equivalently, 1 - false_positive_rate)

false_positive_rate: fraction of actual negatives that were predicted to be positive

For example, plotting sensitivity/false_positive_rate as functions of threshold will "unroll" an ROC Plot.
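To make the "unrolling" concrete, the following base R sketch tabulates sensitivity and false positive rate at each candidate threshold. Each row is one point on the ROC curve, with the threshold made explicit. The toy data and the `>=` thresholding convention are assumptions for illustration and are not taken from the `PRTPlot` implementation.

```r
# Toy data, invented for illustration.
scores <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.1)
truth  <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)

# Use each distinct score as a candidate threshold, in increasing order.
thresholds <- sort(unique(scores))

# One row per threshold. Assumption: score >= threshold predicts positive.
roc_points <- as.data.frame(t(sapply(thresholds, function(th) {
  predicted <- scores >= th
  c(threshold           = th,
    sensitivity         = sum(predicted & truth) / sum(truth),
    false_positive_rate = sum(predicted & !truth) / sum(!truth))
})))

print(roc_points)
```

An ROC plot draws sensitivity against false_positive_rate and drops the threshold column; plotting each of the two columns against threshold instead shows which score cutoff produces each point.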

Plots are in a single column, in the order specified by `plotvars`.

    df <- iris
    df$isVersicolor <- with(df, Species == 'versicolor')

    # fit a logistic regression model and score the training data
    model <- glm(isVersicolor ~ Petal.Length + Petal.Width + Sepal.Length + Sepal.Width,
                 data = df, family = binomial)
    df$pred <- predict(model, newdata = df, type = "response")

    # default plot: precision and recall as functions of threshold
    WVPlots::PRTPlot(df, "pred", "isVersicolor", TRUE,
                     title = "Example Precision-Recall threshold plot")

    # other metrics, plotted in the order given by plotvars
    WVPlots::PRTPlot(df, "pred", "isVersicolor", TRUE,
                     plotvars = c("sensitivity", "specificity", "false_positive_rate"),
                     title = "Sensitivity/specificity/FPR as functions of threshold")