Plot a scatter plot with optional smoothing curves or contour lines, and marginal histogram/density plots.
Based on https://win-vector.com/2015/06/11/wanted-a-perfect-scatterplot-with-marginals/.
See also ggExtra::ggMarginal.
ScatterHist(
  frame,
  xvar,
  yvar,
  title,
  ...,
  smoothmethod = "lm",
  estimate_sig = FALSE,
  minimal_labels = TRUE,
  binwidth_x = NULL,
  binwidth_y = NULL,
  adjust_x = 1,
  adjust_y = 1,
  point_alpha = 0.5,
  contour = FALSE,
  point_color = "black",
  hist_color = "gray",
  smoothing_color = "blue",
  density_color = "blue",
  contour_color = "blue"
)
Argument | Description |
---|---|
frame | data frame to get values from |
xvar | name of the independent (input or model) column in frame |
yvar | name of the dependent (output or result to be modeled) column in frame |
title | title to place on plot |
... | not used; present only to force later arguments to be bound by name. |
smoothmethod | (optional) one of 'auto', 'loess', 'gam', 'lm', 'identity', or 'none'. Defaults to 'lm'. |
estimate_sig | logical; if TRUE and smoothmethod is 'identity' or 'lm', report the goodness of fit and the significance of the relation. |
minimal_labels | logical; if TRUE, drop some annotations |
binwidth_x | numeric; binwidth for the x histogram |
binwidth_y | numeric; binwidth for the y histogram |
adjust_x | numeric; adjustment for the x density plot |
adjust_y | numeric; adjustment for the y density plot |
point_alpha | numeric; opacity of the plot points |
contour | logical; if TRUE, add a 2D contour plot |
point_color | color for the scatter plot points |
hist_color | fill color for marginal histograms |
smoothing_color | color for smoothing line |
density_color | color for marginal density plots |
contour_color | color for contour plots |
Returns a plot grid.
If smoothmethod is:
'auto', 'loess' or 'gam': the appropriate smoothing curve is added to the scatterplot.
'lm' (the default): the best fit line is added to the scatterplot.
'identity': the line x = y is added to the scatterplot. This is useful for comparing model predictions to true outcome.
'none': no smoothing line is added to the scatterplot.
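As a minimal sketch of the 'identity' use case, the following compares the predictions of a simple linear model to the true outcome. The data, the model, and the column name pred are hypothetical and chosen only for illustration; the ScatterHist arguments are the ones listed in the usage above.

```r
# Hypothetical example data and model, for illustration only.
set.seed(34903490)
frm <- data.frame(x = rnorm(50))
frm$y <- 0.5 * frm$x^2 + 2 * frm$x + rnorm(nrow(frm))

# Fit a simple linear model and store its predictions as a column.
model <- lm(y ~ x, data = frm)
frm$pred <- predict(model, newdata = frm)

# Plot predictions (x axis) against the true outcome (y axis),
# with the line x = y as the reference. The call is guarded so the
# sketch still runs where WVPlots is not installed.
if (requireNamespace("WVPlots", quietly = TRUE)) {
  WVPlots::ScatterHist(frm, "pred", "y",
                       title = "Prediction vs. outcome",
                       smoothmethod = "identity")
}
```

Points that fall close to the line x = y correspond to predictions that are close to the observed outcome.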
If estimate_sig is TRUE and smoothmethod is:
'lm': the R-squared of the linear fit is reported.
'identity': the R-squared of the exact relation between xvar and yvar is reported.
Note that the identity R-squared is NOT the square of the correlation between xvar and yvar (which includes an implicit shift and scale). It is the coefficient of determination between xvar and yvar, and can be negative. See https://en.wikipedia.org/wiki/Coefficient_of_determination for more details.
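For reference, the standard definition of the coefficient of determination from the linked page can be written as follows, treating the values of xvar as predictions of yvar. The symbols x_i, y_i, and n are notation assumed here, and this help page does not show the package's exact computation.

```latex
R^2_{\text{identity}}
  = 1 - \frac{\sum_{i=1}^{n} (y_i - x_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
```

The value is negative whenever the residual sum of squares in the numerator exceeds the total sum of squares in the denominator, that is, when xvar predicts yvar worse than the constant mean of yvar does.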
If xvar is the output of a model to predict yvar, then the identity R-squared, not the lm R-squared, is the correct measure.
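The difference between the two measures can be sketched in base R. The data and the deliberately shifted predictions below are hypothetical, and the identity R-squared is computed from the standard coefficient-of-determination definition, which is assumed (not confirmed here) to match what the package reports.

```r
# Hypothetical data: y depends on x, and `pred` is a deliberately
# biased "model output" (illustration only).
set.seed(2024)
x <- rnorm(100)
y <- 2 * x + rnorm(100)
pred <- 2 * x + 5   # right shape, but every prediction is shifted by 5

# Identity R-squared: 1 - SS_res / SS_tot, taking the predictions as-is.
identity_r2 <- 1 - sum((y - pred)^2) / sum((y - mean(y))^2)

# lm R-squared: refits a shift and scale, so it equals cor(pred, y)^2.
lm_r2 <- summary(lm(y ~ pred))$r.squared

identity_r2  # negative: the shifted predictions do worse than mean(y)
lm_r2        # high: the linear refit absorbs the shift
```

Because the linear fit re-estimates the shift and scale, the lm R-squared hides the bias in pred, while the identity R-squared penalizes it.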
If smoothmethod is neither 'lm' nor 'identity', then estimate_sig is ignored.
set.seed(34903490)
x = rnorm(50)
y = 0.5*x^2 + 2*x + rnorm(length(x))
frm = data.frame(x=x,y=y)
WVPlots::ScatterHist(frm, "x", "y",
  title= "Example Fit",
  smoothmethod = "gam",
  contour = TRUE)

# Same plot with custom colors
WVPlots::ScatterHist(frm, "x", "y",
  title= "Example Fit",
  smoothmethod = "gam",
  contour = TRUE,
  point_color = "#006d2c", # dark green
  hist_color = "#6baed6", # medium blue
  smoothing_color = "#54278f", # dark purple
  density_color = "#08519c", # darker blue
  contour_color = "#9e9ac8") # lighter purple