Examples for Proofing statistics in papers, using the R formatting package sigr. Please see here and here for some notes.

library('sigr') # devtools::install_github('WinVector/sigr')

Example showing we can get the items reported by summary(model) into one well-behaved string.

library('ggplot2')
d <- data.frame(x=0.2*(1:20))
d$y <- cos(d$x)
model <- lm(y~x,data=d)
d$prediction <- predict(model,newdata=d)
ggplot(data=d,aes(x=prediction,y=y)) + geom_point() + geom_abline()

print(summary(model))
##
## Call:
## lm(formula = y ~ x, data = d)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.34441 -0.24493  0.00103  0.18320  0.65001
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.95686    0.12869   7.436 6.84e-07 ***
## x           -0.56513    0.05371 -10.521 4.06e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.277 on 18 degrees of freedom
## Multiple R-squared:  0.8601, Adjusted R-squared:  0.8524
## F-statistic: 110.7 on 1 and 18 DF,  p-value: 4.062e-09

Examples showing how to get the summary from the model.

formatFTest(model,pSmallCutoff=1.0e-12)
$test
[1] "F Test"

$numdf
[1] 1

$dendf
[1] 18

$FValue
[1] 110.7003

$format
[1] "html"

$pLargeCutoff
[1] 0.05

$pSmallCutoff
[1] 1e-12

$R2
[1] 0.8601402

$pValue
[1] 4.062177e-09

$formatStr
[1] "F Test summary: (R2=0.86, F(1,18)=1.1e+02, p=4.1e-09)."

Or from the data.

formatFTest(d,'prediction','y', pSmallCutoff=1.0e-12)
$test
[1] "F Test"

$numdf
[1] 1

$dendf
[1] 18

$FValue
[1] 110.7003

$format
[1] "html"

$pLargeCutoff
[1] 0.05

$pSmallCutoff
[1] 1e-12

$R2
[1] 0.8601402

$pValue
[1] 4.062177e-09

$formatStr
[1] "F Test summary: (R2=0.86, F(1,18)=1.1e+02, p=4.1e-09)."

Separate extracted example from https://web2.uconn.edu/writingcenter/pdf/Reporting_Statistics.pdf.

formatFTestImpl(numdf=2,dendf=55,FValue=5.56)
$test
[1] "F Test"

$numdf
[1] 2

$dendf
[1] 55

$FValue
[1] 5.56

$format
[1] "html"

$pLargeCutoff
[1] 0.05

$pSmallCutoff
[1] 1e-05

$R2
[1] 0.1681791

$pValue
[1] 0.006321509

$formatStr
[1] "F Test summary: (R2=0.17, F(2,55)=5.6, p=0.0063)."
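
The R2 and pValue fields in this last result follow from numdf, dendf, and FValue alone. A minimal base-R sketch of that calculation (not sigr's actual implementation; the variable names are only illustrative), assuming the F statistic is the overall F test of a linear model with an intercept:

```r
# Inputs taken from the reported test F(2,55)=5.56.
numdf <- 2
dendf <- 55
FValue <- 5.56

# Upper-tail probability of the F distribution gives the p-value.
pValue <- pf(FValue, numdf, dendf, lower.tail = FALSE)

# For the overall F test of a linear model with an intercept:
#   F = (R2/numdf) / ((1-R2)/dendf)
# Solved here for R2.
R2 <- (FValue * numdf) / (FValue * numdf + dendf)

print(c(R2 = R2, pValue = pValue))
```

The two printed values should agree with the R2 and pValue entries reported for F(2,55)=5.56.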

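It looks like statcheck checks the p-value, but not the R-squared: the input below reports R2=.38, where F(2,55)=5.56 implies an R-squared of about 0.17, and statcheck still reports no error.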

library('statcheck')
s <- statcheck('(R2=.38, F(2,55)=5.56, p < .01)')
## Extracting statistics...
print(xtable::xtable(t(s)),type='html')
                     x
Source               1
Statistic            F
df1                  2
df2                  55
Test.Comparison      =
Value                5.56
Reported.Comparison  <
Reported.P.Value     0.01
Computed             0.006321509
Raw                  F(2,55)=5.56, p < .01
Error                FALSE
DecisionError        FALSE
OneTail              FALSE
OneTailedInTxt       FALSE
APAfactor            1
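
To see why the R2=.38 in that string is suspect, one can ask what F statistic an R-squared of 0.38 would imply under F = (R2/numdf)/((1-R2)/dendf). A minimal base-R sketch, assuming the F statistic is the overall F test of a linear model with an intercept (impliedF is a hypothetical helper, not part of sigr or statcheck):

```r
# Hypothetical helper: F statistic implied by a reported R-squared,
# assuming the overall F test of a linear model with an intercept.
impliedF <- function(R2, numdf, dendf) {
  (R2 / numdf) / ((1 - R2) / dendf)
}

reportedF <- 5.56
fFromR2 <- impliedF(0.38, numdf = 2, dendf = 55)

# Compare the reported F with the one implied by the reported R2.
print(c(reported = reportedF, implied = fFromR2))
```

A large gap between the two values suggests the reported R2 and F do not describe the same fit, which is something the statcheck table above does not flag.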

Check ours.

s <- statcheck('(R2=0.17, F(2,55)=5.56, p=0.00632)')
## Extracting statistics...
print(xtable::xtable(t(s)),type='html')
                     x
Source               1
Statistic            F
df1                  2
df2                  55
Test.Comparison      =
Value                5.56
Reported.Comparison  =
Reported.P.Value     0.00632
Computed             0.006321509
Raw                  F(2,55)=5.56, p=0.00632
Error                FALSE
DecisionError        FALSE
OneTail              FALSE
OneTailedInTxt       FALSE
APAfactor            1