Assuming max(nA, nB) %% min(nA, nB) == 0: compute the distribution of differences of weighted sums between max(1, nB/nA)*sum(a) and max(1, nA/nB)*sum(b), where a is a 0/1 vector of length nA with each item 1 with independent probability (kA+kB)/(nA+nB), and b is a 0/1 vector of length nB with each item 1 with independent probability (kA+kB)/(nA+nB). Then return the significance of a direct two-sided test that the absolute value of this difference is at least as large as the test_rate_difference (if supplied) or the empirically observed rate difference abs(nB*kA - nA*kB)/(nA*nB).

The idea is: under this scaling, differences in success rates between the two processes are easily observed as differences in counts returned by the scaled processes. The method can therefore be used to get the exact probability of a given difference under the null hypothesis that both the A and B processes have the same success rate (kA+kB)/(nA+nB).

When nA and nB don't divide evenly into each other, two calculations are run, with the larger process alternately padded and truncated so that it looks like a larger or smaller experiment meeting the above condition. This gives a bracketing range of significances (reported as pL and pH in the examples below).
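The scaled-count construction can be illustrated with a small Monte Carlo sketch. This is an illustration only, not the exact calculation Bernoulli_diff_stat performs; the counts below are assumed example values chosen so that nA is a multiple of nB.

# Monte Carlo sketch of the scaled-count idea (illustration only).
kA <- 2000; nA <- 5000   # assumed example counts; 5000 %% 200 == 0
kB <- 100;  nB <- 200
common_rate <- (kA + kB) / (nA + nB)                  # pooled null success rate
scaled_observed <- abs(max(1, nB/nA) * kA - max(1, nA/nB) * kB)
set.seed(2024)
sims <- 100000
sumA <- rbinom(sims, size = nA, prob = common_rate)   # sum(a) is Binomial(nA, rate)
sumB <- rbinom(sims, size = nB, prob = common_rate)   # sum(b) is Binomial(nB, rate)
diffs <- abs(max(1, nB/nA) * sumA - max(1, nA/nB) * sumB)
# two-sided significance: fraction of null draws at least as extreme as observed
mean(diffs >= scaled_observed)

With these counts the estimate should land near the exact p = 0.004677 reported in the examples below, up to simulation noise.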

Usage

Bernoulli_diff_stat(kA, nA, kB, nB, test_rate_difference, common_rate)

Arguments

kA

number of A successes observed.

nA

number of A experiments.

kB

number of B successes observed.

nB

number of B experiments.

test_rate_difference

numeric, difference in rates (A minus B) to test. Note: it is best to specify this prior to looking at the data.

common_rate

numeric, assumed common null rate for both processes.

Value

Bernoulli difference test statistic.

Details

Note the intent is that we are measuring the results of an A/B test where max(nA, nB) %% min(nA, nB) == 0 (no padding needed), or max(nA, nB) >> min(nA, nB) (so the padding has only a small effect).
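A rough sketch of the size adjustment when the sizes don't divide evenly (an assumption about the general approach, not sigr's exact code): the larger experiment is truncated down to the nearest multiple of the smaller size for one calculation and padded up to the next multiple for the other, which yields the pL/pH pair of significances.

nA <- 5000; nB <- 199                              # assumed example sizes; 5000 %% 199 != 0
n_small <- min(nA, nB); n_large <- max(nA, nB)
n_trunc <- n_small * (n_large %/% n_small)         # larger experiment truncated down
n_pad   <- n_small * ((n_large %/% n_small) + 1)   # larger experiment padded up
c(n_trunc, n_pad)
#> [1] 4975 5174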

The idea of converting a rate problem into a counting problem follows from reading Wald's Sequential Analysis.

For very small p-values the calculation is sensitive to rounding in the observed rate difference, as an arbitrarily small change in the test rate can move an entire set of observed differences in or out of the significance calculation.
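For example (calls shown without output, as the exact values depend on the package version), nudging the test rate by a tiny amount can move the set of differences equal to the observed difference in or out of the tail being summed:

Bernoulli_diff_stat(2000, 5000, 100, 200, 0.1)
Bernoulli_diff_stat(2000, 5000, 100, 200, 0.1 + 1e-9)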

Examples

Bernoulli_diff_stat(2000, 5000, 100, 200)
#> [1] "Bernoulli difference test: (A=2000/5000=0.4, B=100/200=0.5, =0.4038, post 0.1 two sided; p=0.004677)."
Bernoulli_diff_stat(2000, 5000, 100, 200, 0.1)
#> [1] "Bernoulli difference test: (A=2000/5000=0.4, B=100/200=0.5, =0.4038, prior 0.1 two sided; p=0.004677)."
Bernoulli_diff_stat(2000, 5000, 100, 199)
#> [1] "Bernoulli difference test: (A=2000/5000=0.4, B=100/199=0.5025, =0.4039, post 0.1025 two sided; pL=0.003753, pH=0.00382)."
Bernoulli_diff_stat(2000, 5000, 100, 199, 0.1)
#> [1] "Bernoulli difference test: (A=2000/5000=0.4, B=100/199=0.5025, =0.4039, prior 0.1 two sided; pL=0.004701, pH=0.00474)."
Bernoulli_diff_stat(100, 200, 2000, 5000)
#> [1] "Bernoulli difference test: (A=100/200=0.5, B=2000/5000=0.4, =0.4038, post 0.1 two sided; p=0.004677)."
# sigr adjusts experiment sizes when lengths
# don't divide into each other.
Bernoulli_diff_stat(100, 199, 2000, 5000)
#> [1] "Bernoulli difference test: (A=100/199=0.5025, B=2000/5000=0.4, =0.4039, post 0.1025 two sided; pL=0.003753, pH=0.00382)."
Bernoulli_diff_stat(100, 199, 2000, 5000)$pValue
#> [1] 0.003786713