Examples for "Proofing statistics in papers", using the R formatting package sigr. Please see here and here for some notes.

library('sigr') # devtools::install_github('WinVector/sigr')

Example showing we can get the items reported by summary(model) into one well-behaved string.

library('ggplot2')
d <- data.frame(x=0.2*(1:20))
d$y <- cos(d$x)
model <- lm(y~x,data=d)
d$prediction <- predict(model,newdata=d)

ggplot(data=d,aes(x=prediction,y=y)) +
  geom_point() + geom_abline() 

print(summary(model))
## 
## Call:
## lm(formula = y ~ x, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.34441 -0.24493  0.00103  0.18320  0.65001 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.95686    0.12869   7.436 6.84e-07 ***
## x           -0.56513    0.05371 -10.521 4.06e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.277 on 18 degrees of freedom
## Multiple R-squared:  0.8601, Adjusted R-squared:  0.8524 
## F-statistic: 110.7 on 1 and 18 DF,  p-value: 4.062e-09

Examples showing how to get the summary from the model.

formatFTest(model,pSmallCutoff=1.0e-12)

## $test
## [1] "F Test"
##
## $numdf
## [1] 1
##
## $dendf
## [1] 18
##
## $FValue
## [1] 110.7003
##
## $format
## [1] "html"
##
## $pLargeCutoff
## [1] 0.05
##
## $pSmallCutoff
## [1] 1e-12
##
## $R2
## [1] 0.8601402
##
## $pValue
## [1] 4.062177e-09
##
## $formatStr
## [1] "F Test summary: (R2=0.86, F(1,18)=1.1e+02, p=4.1e-09)."
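
For reference, the numbers in this summary can be reproduced with base R alone. The following is a minimal sketch, not sigr's implementation: it pulls the F statistic and R-squared out of summary(model) and computes the p-value as the upper tail of the F distribution. The sprintf() format string is an assumption made for illustration; sigr's own rounding rules may differ.

```r
# Rebuild the example data and model from above.
d <- data.frame(x = 0.2 * (1:20))
d$y <- cos(d$x)
model <- lm(y ~ x, data = d)

s <- summary(model)
fstat <- s$fstatistic   # named numeric vector: value, numdf, dendf
R2 <- s$r.squared

# The upper-tail probability of the F distribution is the reported p-value.
pValue <- pf(fstat[["value"]], fstat[["numdf"]], fstat[["dendf"]],
             lower.tail = FALSE)

# Hypothetical formatting (an assumption, not sigr's code).
summaryStr <- sprintf("F Test summary: (R2=%.2g, F(%d,%d)=%.2g, p=%.2g).",
                      R2,
                      as.integer(fstat[["numdf"]]),
                      as.integer(fstat[["dendf"]]),
                      fstat[["value"]],
                      pValue)
print(summaryStr)
```

The values of R2, fstat and pValue should agree with the $R2, $FValue and $pValue entries shown above.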

Or from the data, using the prediction and outcome columns.

formatFTest(d,'prediction','y',
                    pSmallCutoff=1.0e-12)

## $test
## [1] "F Test"
##
## $numdf
## [1] 1
##
## $dendf
## [1] 18
##
## $FValue
## [1] 110.7003
##
## $format
## [1] "html"
##
## $pLargeCutoff
## [1] 0.05
##
## $pSmallCutoff
## [1] 1e-12
##
## $R2
## [1] 0.8601402
##
## $pValue
## [1] 4.062177e-09
##
## $formatStr
## [1] "F Test summary: (R2=0.86, F(1,18)=1.1e+02, p=4.1e-09)."
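
The same quantities can be worked out by hand from the prediction and outcome columns. This sketch assumes a single fitted parameter beyond the intercept (numdf = 1, dendf = n - 2), which matches the degrees of freedom reported above. It is an illustration of the arithmetic, not a copy of sigr's code.

```r
# Rebuild the example data and predictions from above.
d <- data.frame(x = 0.2 * (1:20))
d$y <- cos(d$x)
d$prediction <- predict(lm(y ~ x, data = d), newdata = d)

n <- nrow(d)
numdf <- 1              # assumed: one model parameter besides the intercept
dendf <- n - numdf - 1  # residual degrees of freedom

rss <- sum((d$y - d$prediction)^2)  # residual sum of squares
tss <- sum((d$y - mean(d$y))^2)     # total sum of squares
R2 <- 1 - rss / tss

# F statistic expressed through R-squared, then its upper-tail p-value.
FValue <- (R2 / numdf) / ((1 - R2) / dendf)
pValue <- pf(FValue, numdf, dendf, lower.tail = FALSE)

print(c(R2 = R2, FValue = FValue, pValue = pValue))
```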

A separate example, extracted from https://web2.uconn.edu/writingcenter/pdf/Reporting_Statistics.pdf.

formatFTestImpl(numdf=2,dendf=55,FValue=5.56)

## $test
## [1] "F Test"
##
## $numdf
## [1] 2
##
## $dendf
## [1] 55
##
## $FValue
## [1] 5.56
##
## $format
## [1] "html"
##
## $pLargeCutoff
## [1] 0.05
##
## $pSmallCutoff
## [1] 1e-05
##
## $R2
## [1] 0.1681791
##
## $pValue
## [1] 0.006321509
##
## $formatStr
## [1] "F Test summary: (R2=0.17, F(2,55)=5.6, p=0.0063)."
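
Both derived values above follow from the F statistic and its degrees of freedom alone. Assuming the usual regression relation F = (R2/numdf) / ((1-R2)/dendf), solving for R2 gives R2 = 1 - 1/(1 + F*numdf/dendf). A base R sketch of that arithmetic:

```r
numdf <- 2
dendf <- 55
FValue <- 5.56

# p-value: upper tail of the F distribution.
pValue <- pf(FValue, numdf, dendf, lower.tail = FALSE)

# R-squared implied by the F statistic and its degrees of freedom.
R2 <- 1 - 1 / (1 + FValue * numdf / dendf)

print(c(R2 = R2, pValue = pValue))
```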

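It looks like statcheck checks the p-value, but not the R-squared. Below, the reported R2=.38 disagrees with the 0.17 that sigr derives from F(2,55)=5.56, yet statcheck reports no error.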

library('statcheck')
s <- statcheck('(R2=.38, F(2,55)=5.56, p < .01)')
## Extracting statistics...
print(xtable::xtable(t(s)),type='html')
                       x
Source                 1
Statistic              F
df1                    2
df2                    55
Test.Comparison        =
Value                  5.56
Reported.Comparison    <
Reported.P.Value       0.01
Computed               0.006321509
Raw                    F(2,55)=5.56, p < .01
Error                  FALSE
DecisionError          FALSE
OneTail                FALSE
OneTailedInTxt         FALSE
APAfactor              1
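
A reported R-squared can be checked against the F statistic by hand. The sketch below assumes the reported R2 belongs to the same regression as the F-test, so that R2 = 1 - 1/(1 + F*numdf/dendf) holds. The helpers impliedR2() and checkReportedR2() are hypothetical names introduced here; they are not part of sigr or statcheck.

```r
# Hypothetical helper: R-squared implied by an F statistic and its
# degrees of freedom.
impliedR2 <- function(FValue, numdf, dendf) {
  1 - 1 / (1 + FValue * numdf / dendf)
}

# Hypothetical helper: TRUE when the reported R-squared agrees with the
# implied value after rounding to `digits` decimal places.
checkReportedR2 <- function(reportedR2, FValue, numdf, dendf, digits = 2) {
  implied <- impliedR2(FValue, numdf, dendf)
  abs(reportedR2 - implied) <= 0.5 * 10^(-digits) + 1e-12
}

okReported <- checkReportedR2(0.38, FValue = 5.56, numdf = 2, dendf = 55)
okDerived  <- checkReportedR2(0.17, FValue = 5.56, numdf = 2, dendf = 55)
print(c(reported_0.38 = okReported, derived_0.17 = okDerived))
```

Under that assumption, R2=.38 is not consistent with F(2,55)=5.56, while R2=0.17 is.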

Check our own summary the same way.

s <- statcheck('(R2=0.17, F(2,55)=5.56, p=0.00632)')
## Extracting statistics...
print(xtable::xtable(t(s)),type='html')
                       x
Source                 1
Statistic              F
df1                    2
df2                    55
Test.Comparison        =
Value                  5.56
Reported.Comparison    =
Reported.P.Value       0.00632
Computed               0.006321509
Raw                    F(2,55)=5.56, p=0.00632
Error                  FALSE
DecisionError          FALSE
OneTail                FALSE
OneTailedInTxt         FALSE
APAfactor              1