Comparison of two proportions: the parametric method (Z

2024-07-10 20:26 | Source: compiled from the web

Source: http://www.r-bloggers.com/comparison-of-two-proportions-parametric-z-test-and-non-parametric-chi-squared-methods/

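Consider the following example. The owner of a betting company wants to check whether a customer is cheating. To do so, he compares the number of wins of a particular player with the number of wins of one of his employees. Over one month, the player placed 74 bets and won 30; over the same period, the employee placed 103 bets and won 65. Is the customer a cheater?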

A problem like this can be solved in two different ways: with a parametric method or with a non-parametric method.

* Parametric solution: Z-test

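You can use the Z-test if two assumptions hold: the probability of success is close to 0.5, and the number of bets is large (under these two conditions the binomial distribution is close to the Gaussian distribution). Suppose these conditions hold. Base R has no function that returns the Z value directly, so we recall the mathematical formula and write the corresponding function: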

$$Z=\frac{\frac{x_1}{n_1}-\frac{x_2}{n_2}}{\sqrt{\widehat{p}(1-\widehat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}},\qquad \widehat{p}=\frac{x_1+x_2}{n_1+n_2}$$

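    z.prop <- function(x1, x2, n1, n2) {
      # difference between the two observed proportions
      numerator <- (x1 / n1) - (x2 / n2)
      # pooled proportion under the null hypothesis of equal proportions
      p.common <- (x1 + x2) / (n1 + n2)
      # standard error of the difference under the null hypothesis
      denominator <- sqrt(p.common * (1 - p.common) * (1 / n1 + 1 / n2))
      z.prop.ris <- numerator / denominator
      return(z.prop.ris)
    }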

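The z.prop function computes the Z value; its arguments are the numbers of successes (x1 and x2) and the total numbers of bets (n1 and n2). We plug the data from the problem into this function: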

    z.prop(30, 65, 74, 103)
    [1] -2.969695

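The absolute value of the z we obtained (2.97) is greater than the tabulated z value (1.96 for a two-sided test at the 5% level), so we reject the hypothesis that the two success rates are equal, and the director concludes that the player he was watching is a cheater. Note that with the data as given, the player's observed success rate (30/74 ≈ 0.41) is lower than the employee's (65/103 ≈ 0.63), which is why z is negative.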
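The comparison against the tabulated z value can also be done directly in R. The sketch below repeats the z.prop formula from the text so that it runs on its own, and assumes a two-sided test at a significance level of 0.05; the names `alpha`, `z.crit`, `p.value`, and `reject.h0` are illustrative choices, not part of the original post.

```r
# z statistic for two proportions (same formula as z.prop in the text)
z.prop <- function(x1, x2, n1, n2) {
  p.common <- (x1 + x2) / (n1 + n2)
  ((x1 / n1) - (x2 / n2)) /
    sqrt(p.common * (1 - p.common) * (1 / n1 + 1 / n2))
}

z <- z.prop(30, 65, 74, 103)

alpha   <- 0.05                    # assumed significance level
z.crit  <- qnorm(1 - alpha / 2)    # two-sided critical value, about 1.96
p.value <- 2 * pnorm(-abs(z))      # two-sided p-value under the normal approximation

# Reject H0 (equal proportions) when |z| exceeds the critical value
reject.h0 <- abs(z) > z.crit

print(c(z = z, z.crit = z.crit, p.value = p.value))
print(reject.h0)
```

Comparing `abs(z)` rather than `z` itself matters here, because the sign of z only reflects the order in which the two groups are passed to the function.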

=====================================================

* Non-parametric solution: Chi-squared test

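If no assumptions can be made about the data, the binomial distribution cannot be approximated by a Gaussian distribution. We then solve the problem with the chi-square test, applied to a 2×2 contingency table. In R this is done with the function prop.test: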

    prop.test(x = c(30, 65), n = c(74, 103), correct = FALSE)

        2-sample test for equality of proportions without continuity correction

    data:  c(30, 65) out of c(74, 103)
    X-squared = 8.8191, df = 1, p-value = 0.002981
    alternative hypothesis: two.sided
    95 percent confidence interval:
     -0.37125315 -0.08007196
    sample estimates:
       prop 1    prop 2
    0.4054054 0.6310680

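The prop.test function computes the chi-square value; its arguments are the numbers of successes (in the vector x) and the total numbers of trials (in the vector n). The vectors x and n can also be defined beforehand and then passed to the function: prop.test(x, n, correct=FALSE).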
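As a minimal sketch of that second calling style, the vectors can be defined first and the individual pieces of the result pulled out of the returned object. The variable names `res`, `chi2`, `p.common`, and `z` are illustrative. The last lines check that, without continuity correction, the chi-square statistic equals the square of the Z value from the pooled formula used in the parametric section.

```r
x <- c(30, 65)     # successes: player, employee
n <- c(74, 103)    # trials:    player, employee

res <- prop.test(x, n, correct = FALSE)

chi2 <- unname(res$statistic)   # X-squared
print(chi2)
print(res$p.value)              # two-sided p-value
print(res$estimate)             # the two sample proportions
print(res$conf.int)             # 95% CI for the difference in proportions

# Z from the pooled-proportion formula, computed on the same vectors
p.common <- sum(x) / sum(n)
z <- (x[1] / n[1] - x[2] / n[2]) /
  sqrt(p.common * (1 - p.common) * sum(1 / n))

# Without Yates' correction the two statistics agree: z^2 == X-squared
print(all.equal(z^2, chi2))
```

This equivalence is why the two approaches give the same p-value on these data when the correction is switched off.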

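With small samples (small values of n), you should specify correct=TRUE, which is also the default of prop.test, so that the chi-square value is computed with Yates' continuity correction: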

    prop.test(x = c(30, 65), n = c(74, 103), correct=TRUE)

        2-sample test for equality of proportions with continuity correction

    data:  c(30, 65) out of c(74, 103)
    X-squared = 7.9349, df = 1, p-value = 0.004849
    alternative hypothesis: two.sided
    95 percent confidence interval:
     -0.38286428 -0.06846083
    sample estimates:
       prop 1    prop 2
    0.4054054 0.6310680
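To see what the continuity correction changes, the two statistics can be recomputed by hand from the 2×2 table of observed and expected counts. This is a sketch under the usual textbook definition (subtract 0.5 from each absolute deviation before squaring); the names `observed`, `expected`, `chi2.plain`, and `chi2.yates` are illustrative.

```r
# Observed 2x2 table: wins and losses for the player and the employee
observed <- matrix(c(30, 74 - 30, 65, 103 - 65), nrow = 2,
                   dimnames = list(outcome = c("win", "loss"),
                                   who = c("player", "employee")))

# Expected counts under H0: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square without correction
chi2.plain <- sum((observed - expected)^2 / expected)

# Yates' correction: shrink each |O - E| by 0.5 before squaring.
# (prop.test caps the shrinkage at |O - E|; here every |O - E| is
# about 9.7, so the cap does not apply.)
chi2.yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

print(c(plain = chi2.plain, yates = chi2.yates))
```

The corrected statistic is always the smaller of the two, which makes the test slightly more conservative.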

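In both cases the p-value is below 0.05, so we reject the hypothesis that the two proportions are equal; on this basis the customer is judged to be a cheater. To confirm, we compare the computed chi-square value with the tabulated chi-square value, which R computes as: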

    qchisq(0.950, 1)
    [1] 3.841459

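The function qchisq returns the chi-square quantile for a given cumulative probability (here 1 - alpha = 0.95) and number of degrees of freedom. Because the chi-square values computed earlier (8.8191 and 7.9349) are greater than this tabulated value, we reject the null hypothesis H0.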
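The same decision rule can be written out in R. The sketch below assumes alpha = 0.05, compares both statistics against the critical value, and also runs chisq.test on the explicit 2×2 contingency table to show that it gives the same uncorrected statistic as prop.test; names such as `crit`, `tab`, and `reject` are illustrative.

```r
alpha <- 0.05                        # assumed significance level
crit  <- qchisq(1 - alpha, df = 1)   # tabulated chi-square value, about 3.84

x <- c(30, 65)
n <- c(74, 103)

stat.plain <- unname(prop.test(x, n, correct = FALSE)$statistic)
stat.yates <- unname(prop.test(x, n, correct = TRUE)$statistic)

# The same test on an explicit 2x2 contingency table
tab <- rbind(win = x, loss = n - x)
colnames(tab) <- c("player", "employee")
stat.table <- unname(chisq.test(tab, correct = FALSE)$statistic)

# p-value recovered from the statistic: upper tail of chi-square with 1 df
p.plain <- pchisq(stat.plain, df = 1, lower.tail = FALSE)

# Reject H0 when the statistic exceeds the critical value
reject <- c(plain = stat.plain, yates = stat.yates) > crit
print(crit)
print(reject)
print(p.plain)
```

Comparing the statistic with the critical value and comparing the p-value with alpha are two views of the same rule, so they always lead to the same decision.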
