Examples for proofing statistics in papers, using the R formatting package sigr. Please see here and here for some notes.
library('sigr') # devtools::install_github('WinVector/sigr')
Example showing we can get the items reported by summary(model)
into one well-behaved string.
library('ggplot2')
d <- data.frame(x=0.2*(1:20))
d$y <- cos(d$x)
model <- lm(y~x,data=d)
d$prediction <- predict(model,newdata=d)
ggplot(data=d,aes(x=prediction,y=y)) +
geom_point() + geom_abline()
print(summary(model))
##
## Call:
## lm(formula = y ~ x, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.34441 -0.24493 0.00103 0.18320 0.65001
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.95686 0.12869 7.436 6.84e-07 ***
## x -0.56513 0.05371 -10.521 4.06e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.277 on 18 degrees of freedom
## Multiple R-squared: 0.8601, Adjusted R-squared: 0.8524
## F-statistic: 110.7 on 1 and 18 DF, p-value: 4.062e-09
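The quantities in the last lines of that printout can also be read off the fitted model with base R alone, which shows what has to be assembled into a single string. The sketch below rebuilds the model so it runs on its own; the variable names `s`, `fstat`, `R2`, and `pValue` are illustrative and not part of sigr.

```r
# Rebuild the example data and model so this block is self-contained.
d <- data.frame(x = 0.2 * (1:20))
d$y <- cos(d$x)
model <- lm(y ~ x, data = d)

s <- summary(model)
R2 <- s$r.squared

# s$fstatistic is a named numeric vector with entries value, numdf, dendf.
fstat <- s$fstatistic

# The summary object does not store the F-test p-value;
# recompute it from the upper tail of the F distribution.
pValue <- pf(fstat[["value"]], fstat[["numdf"]], fstat[["dendf"]],
             lower.tail = FALSE)

print(c(R2 = R2, F = fstat[["value"]], p = pValue))
```

These are the same numbers that appear in the `Multiple R-squared` and `F-statistic` lines of the printout.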
Example showing how to get the formatted summary from the model.
formatFTest(model,pSmallCutoff=1.0e-12)
$test [1] "F Test"
$numdf [1] 1
$dendf [1] 18
$FValue [1] 110.7003
$format [1] "html"
$pLargeCutoff [1] 0.05
$pSmallCutoff [1] 1e-12
$R2 [1] 0.8601402
$pValue [1] 4.062177e-09
$formatStr [1] "F Test summary: (R2=0.86, F(1,18)=1.1e+02, p=4.1e-09)."
Or from the data.
formatFTest(d,'prediction','y',
pSmallCutoff=1.0e-12)
$test [1] "F Test"
$numdf [1] 1
$dendf [1] 18
$FValue [1] 110.7003
$format [1] "html"
$pLargeCutoff [1] 0.05
$pSmallCutoff [1] 1e-12
$R2 [1] 0.8601402
$pValue [1] 4.062177e-09
$formatStr [1] "F Test summary: (R2=0.86, F(1,18)=1.1e+02, p=4.1e-09)."
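The data-frame form gives the same numbers as the model form, which is consistent with treating the prediction column as a one-parameter model. The following is a minimal base R sketch under that assumption (numdf = 1, dendf = n - 2); it is not taken from the sigr source, and all variable names are illustrative.

```r
# Rebuild the example data and predictions so this block is self-contained.
d <- data.frame(x = 0.2 * (1:20))
d$y <- cos(d$x)
d$prediction <- predict(lm(y ~ x, data = d), newdata = d)

# R-squared as one minus residual variation over total variation.
rss <- sum((d$y - d$prediction)^2)
tss <- sum((d$y - mean(d$y))^2)
R2 <- 1 - rss / tss

# Assumed degrees of freedom: one model parameter plus an intercept.
numdf <- 1
dendf <- nrow(d) - numdf - 1

# F statistic from R-squared, then its upper-tail p-value.
FValue <- (R2 / numdf) / ((1 - R2) / dendf)
pValue <- pf(FValue, numdf, dendf, lower.tail = FALSE)

print(c(R2 = R2, F = FValue, p = pValue))
```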
A separate example, extracted from https://web2.uconn.edu/writingcenter/pdf/Reporting_Statistics.pdf.
formatFTestImpl(numdf=2,dendf=55,FValue=5.56)
$test [1] "F Test"
$numdf [1] 2
$dendf [1] 55
$FValue [1] 5.56
$format [1] "html"
$pLargeCutoff [1] 0.05
$pSmallCutoff [1] 1e-05
$R2 [1] 0.1681791
$pValue [1] 0.006321509
$formatStr [1] "F Test summary: (R2=0.17, F(2,55)=5.6, p=0.0063)."
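The R2 and p-value in that output follow from the F statistic and its degrees of freedom alone. As a hedged sketch in base R (the variable names are illustrative, and this is not claimed to be how sigr computes them internally), the standard relation F = (R2/numdf) / ((1 - R2)/dendf) can be inverted:

```r
numdf <- 2
dendf <- 55
FValue <- 5.56

# Solve F = (R2/numdf) / ((1 - R2)/dendf) for R2.
R2 <- (FValue * numdf) / (FValue * numdf + dendf)

# Upper-tail probability of the F distribution.
pValue <- pf(FValue, numdf, dendf, lower.tail = FALSE)

print(c(R2 = R2, p = pValue))
```

Both values agree with the `$R2` and `$pValue` entries shown above.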
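It looks like statcheck checks the p-value, but not the R-squared: the R2=.38 in the string below is inconsistent with F(2,55)=5.56 (which implies an R2 of about 0.17), yet no error is flagged.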
library('statcheck')
s <- statcheck('(R2=.38, F(2,55)=5.56, p < .01)')
## Extracting statistics...
print(xtable::xtable(t(s)),type='html')
Field | Value |
---|---|
Source | 1 |
Statistic | F |
df1 | 2 |
df2 | 55 |
Test.Comparison | = |
Value | 5.56 |
Reported.Comparison | < |
Reported.P.Value | 0.01 |
Computed | 0.006321509 |
Raw | F(2,55)=5.56, p < .01 |
Error | FALSE |
DecisionError | FALSE |
OneTail | FALSE |
OneTailedInTxt | FALSE |
APAfactor | 1 |
Now check the values reported by sigr.
s <- statcheck('(R2=0.17, F(2,55)=5.56, p=0.00632)')
## Extracting statistics...
print(xtable::xtable(t(s)),type='html')
Field | Value |
---|---|
Source | 1 |
Statistic | F |
df1 | 2 |
df2 | 55 |
Test.Comparison | = |
Value | 5.56 |
Reported.Comparison | = |
Reported.P.Value | 0.00632 |
Computed | 0.006321509 |
Raw | F(2,55)=5.56, p=0.00632 |
Error | FALSE |
DecisionError | FALSE |
OneTail | FALSE |
OneTailedInTxt | FALSE |
APAfactor | 1 |
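Since neither statcheck run says anything about R-squared, a consistency check can be sketched by hand. `r2ConsistentWithF` below is a hypothetical helper, not part of sigr or statcheck; it assumes the reported R2 was rounded to `digits` decimal places and compares it with the R2 implied by the F statistic and its degrees of freedom.

```r
# Hypothetical helper: TRUE when the reported R2 matches the R2 implied
# by F(numdf, dendf) after rounding to the reported precision.
r2ConsistentWithF <- function(reportedR2, numdf, dendf, FValue, digits = 2) {
  impliedR2 <- (FValue * numdf) / (FValue * numdf + dendf)
  abs(round(impliedR2, digits) - reportedR2) < 1e-8
}

# The R2 from the first statcheck input versus the one from sigr.
checkOriginal <- r2ConsistentWithF(0.38, numdf = 2, dendf = 55, FValue = 5.56)
checkOurs <- r2ConsistentWithF(0.17, numdf = 2, dendf = 55, FValue = 5.56)
print(c(original = checkOriginal, ours = checkOurs))
```

Under these assumptions the first string's R2 is flagged as inconsistent with its F statistic, while the sigr-formatted value passes.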