多元线性回归

您所在的位置:网站首页 eviews怎么判断异方差 多元线性回归

多元线性回归

2024-07-09 19:10| 来源: 网络整理| 查看: 265

多元线性回归—异方差(二) 异方差检验问题®

文章目录 多元线性回归—异方差(二)异方差检验问题(R)1 异方差及来源2 异方差识别与解决3 异方差识别及处理(R)

1 异方差及来源

在经典OLS估计中,需要假定扰动项为同方差,即 ε ∼ N ( 0 , σ 2 ) \varepsilon \sim N(0,\sigma^2) ε∼N(0,σ2),及扰动项方差为大于0的常数,不随其他自变量的取值而发生改变。但实际样本往往存在异方差问题,随着自变量取值不同,其方差也会发生改变。如果存在异方差,估计量无偏性不会发生变化,但不再是方差最小,相对OLS假定意味着方差增大,则统计量t值将降低,造成系数不显著。从异方差来源看,包括

截面型数据结构,不同截面的性质特征重要解释变量忽略,导致扰动项保护遗漏变量信息,波动性更大模型设定错误,非线性模型设定为线性模型经济结构变动,存在结构性断点问题其他经济原因导致 2 异方差识别与解决

识别异方差的方法有很多,从经验上看,如果是一元线性回归,通过自变量和因变量的散点图可大致识别—如果被解释变量随着自变量增大其波动幅度发生改变。对于多元线性回归,可以将残差平方项和自变量分别画散点图,观察散点图走势,但有时候不好判断。

在理论上,可以通过相关系数检验、white检验、Park检验以及BP检验等。这里介绍相关系数检验、white与BP检验

相关系数检验,将残差绝对值与自变量作相关系数矩阵,哪个自变量与残差绝对值相关系数越高,极可能扰动项方差是该自变量的函数,具体什么形式函数,不知道white检验,将残差平方项对所有自变量,交互项、平方项作回归,哪个自变量显著,极可能意味着扰动项方差是关于该自变量的函数,具体形式不知道BP检验,类似于white检验,但不包括交互项、平方项,哪个自变量显著,极可能意味着扰动项方差是关于该自变量的函数,具体形式不知道 3 异方差识别及处理® # R语言与异方差 # 内存清理 rm(list=ls()) # 导入数据 library(readxl) HET |t|) # (Intercept) -82.22311 111.58700 -0.737 0.47537 # x1 0.08479 0.03299 2.571 0.02452 * # x2 2.66835 0.77964 3.423 0.00505 ** # --- # Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # # Residual standard error: 243.6 on 12 degrees of freedom # Multiple R-squared: 0.7292, Adjusted R-squared: 0.6841 # F-statistic: 16.16 on 2 and 12 DF, p-value: 0.000394

由于变量x1,x2系数不大显著,考虑存在异方差问题。先进性识别

方法1,相关系数检验

# 异方差检验 mydata$r = abs(OLS1$residuals) # 自变量与残差绝对值相关系数 cor(mydata[,-1])

相关系数矩阵

# x1 x2 r # x1 1.00000000 0.4399801 0.02981016 # x2 0.43998014 1.0000000 0.82297373 # r 0.02981016 0.8229737 1.00000000

其中x2与残差绝对值相关系数最高,0.823

方法2,white检验方法

# white检验:将残差平方项对所有自变量平方及交互项作回归 # 平方项生成 mydata$x12 = mydata$x1^2;mydata$x22 = mydata$x2^2 # 辅助回归 OLS_white = lm(r~1+x1*x2+x12+x22,data = mydata) summary(OLS_white) OLS_white = lm(r~1+x1*x2,data = mydata) summary(OLS_white)

结果如下

# Call: # lm(formula = r ~ 1 + x1 * x2 + x12 + x22, data = mydata) # # Residuals: # Min 1Q Median 3Q Max # -80.30 -34.97 15.39 25.53 103.82 # # Coefficients: # Estimate Std. Error t value Pr(>|t|) # (Intercept) -2.379e+01 9.860e+01 -0.241 0.8147 # x1 9.843e-02 4.953e-02 1.987 0.0782 . # x2 2.900e-01 2.004e+00 0.145 0.8881 # x12 -6.122e-06 4.911e-06 -1.247 0.2440 # x22 5.564e-03 5.864e-03 0.949 0.3675 # x1:x2 -4.335e-04 1.485e-04 -2.919 0.0171 * # --- # Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # # Residual standard error: 62.78 on 9 degrees of freedom # Multiple R-squared: 0.9132, Adjusted R-squared: 0.865 # F-statistic: 18.94 on 5 and 9 DF, p-value: 0.0001548 #-------------------------x12,x22及不显著,去掉这两个变量---------------- # Call: # lm(formula = r ~ 1 + x1 * x2, data = mydata) # # Residuals: # Min 1Q Median 3Q Max # -109.238 -22.457 -1.769 24.784 111.056 # # Coefficients: # Estimate Std. Error t value Pr(>|t|) # (Intercept) -9.983e+01 3.402e+01 -2.934 0.0136 * # x1 3.926e-02 2.622e-02 1.498 0.1623 # x2 2.202e+00 2.362e-01 9.323 1.48e-06 *** # x1:x2 -3.719e-04 1.294e-04 -2.875 0.0151 * # --- # Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # # Residual standard error: 62.78 on 11 degrees of freedom # Multiple R-squared: 0.8939, Adjusted R-squared: 0.865 # F-statistic: 30.9 on 3 and 11 DF, p-value: 1.178e-05

white检验表明x2与方差存在较高相关性。x2是导致异方差的主要来源·

方法3,BP检验

OLS_white = lm(r~1+x1+x2,data = mydata) summary(OLS_white)

结果如下

# Call: # lm(formula = r ~ 1 + x1 + x2, data = mydata) # # Residuals: # Min 1Q Median 3Q Max # -144.911 -27.066 2.737 40.910 121.507 # # Coefficients: # Estimate Std. Error t value Pr(>|t|) # (Intercept) -47.57546 36.44256 -1.305 0.2162 # x1 -0.03204 0.01077 -2.974 0.0116 * # x2 1.84539 0.25462 7.248 1.02e-05 *** # --- # Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # # Residual standard error: 79.55 on 12 degrees of freedom # Multiple R-squared: 0.8142, Adjusted R-squared: 0.7832 # F-statistic: 26.29 on 2 and 12 DF, p-value: 4.114e-05

与white检验类似

接下来对异方差进行处理,常用的方法是采用加权最小二乘法。对于残差波动较大的观测值赋予较小的权重,对于残差较小观测赋予较大权重,与处理存在异常值的稳健回归的思想类似。由于x2是异方差主要来源,因此权重设定为x2的幂函数,即 1 / x 2 α 1/x_2^\alpha 1/x2α​,参数 α \alpha α通过模型目的或指标进行挑选。

# 权重设定,例如α=2.5 w = 1/(mydata$x2)^2.5 OLS2 = lm(y~1+x1+x2,weights = w,data = mydata) summary(OLS2) # Call: # lm(formula = y ~ 1 + x1 + x2, data = mydata, weights = w) # # Weighted Residuals: # Min 1Q Median 3Q Max # -0.42144 -0.31103 -0.01519 0.17582 0.81897 # # Coefficients: # Estimate Std. Error t value Pr(>|t|) # (Intercept) -59.03487 53.82151 -1.097 0.2942 # x1 0.08145 0.03010 2.706 0.0191 * # x2 2.48872 0.93144 2.672 0.0203 * # --- # Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 # # Residual standard error: 0.3814 on 12 degrees of freedom # Multiple R-squared: 0.7485, Adjusted R-squared: 0.7065 # F-statistic: 17.85 on 2 and 12 DF, p-value: 0.0002533

OLS与WLS结果对比

# 结果对比 #install.packages("stargazer") library("stargazer") # 默认latex stargazer(OLS1,OLS2) stargazer(OLS1,OLS2,type = "text") Table created by stargazer v.5.2.2 by Marek Hlavac, Harvard University. E-mail: hlavac at fas.harvard.edu % Date and time: 周五, 10月 22, 2021 - 12:57:40 \begin{table}[!htbp] \centering \caption{} \label{} \begin{tabular}{@{\extracolsep{5pt}}lcc} \\[-1.8ex]\hline \hline \\[-1.8ex] & \multicolumn{2}{c}{\textit{Dependent variable:}} \\ \cline{2-3} \\[-1.8ex] & \multicolumn{2}{c}{y} \\ \\[-1.8ex] & (1) & (2)\\ \hline \\[-1.8ex] x1 & 0.085$^{**}$ & 0.081$^{**}$ \\ & (0.033) & (0.030) \\ & & \\ x2 & 2.668$^{***}$ & 2.489$^{**}$ \\ & (0.780) & (0.931) \\ & & \\ Constant & $-$82.223 & $-$59.035 \\ & (111.587) & (53.822) \\ & & \\ \hline \\[-1.8ex] Observations & 15 & 15 \\ R$^{2}$ & 0.729 & 0.748 \\ Adjusted R$^{2}$ & 0.684 & 0.707 \\ Residual Std. Error (df = 12) & 243.593 & 0.381 \\ F Statistic (df = 2; 12) & 16.160$^{***}$ & 17.853$^{***}$ \\ \hline \hline \\[-1.8ex] \textit{Note:} & \multicolumn{2}{r}{$^{*}$p$|t|) (Intercept) -57.98229 55.26792 -1.049 0.3148 x1 0.07909 0.03070 2.576 0.0243 * x2 2.49388 1.00874 2.472 0.0294 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.1211 on 12 degrees of freedom Multiple R-squared: 0.7084, Adjusted R-squared: 0.6598 F-statistic: 14.58 on 2 and 12 DF, p-value: 0.0006143 ==========第 6 个回归=============== Call: lm(formula = y ~ 1 + x1 + x2, data = mydata, weights = w) Weighted Residuals: Min 1Q Median 3Q Max -0.051320 -0.023101 -0.001766 0.018498 0.096651 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -56.73416 56.89965 -0.997 0.338 x1 0.07591 0.03107 2.443 0.031 * x2 2.50036 1.07254 2.331 0.038 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.03985 on 12 degrees of freedom Multiple R-squared: 0.6621, Adjusted R-squared: 0.6057 F-statistic: 11.75 on 2 and 12 DF, p-value: 0.00149 ==========第 7 个回归=============== Call: lm(formula = y ~ 1 + x1 + x2, data = mydata, weights = w) Weighted Residuals: Min 1Q Median 3Q Max -0.017860 -0.005890 -0.000502 0.006249 0.033352 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -54.35365 57.82320 -0.940 0.3658 x1 0.07219 0.03108 2.323 0.0386 * x2 2.48902 1.11469 2.233 0.0454 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 0.01344 on 12 degrees of freedom Multiple R-squared: 0.6168, Adjusted R-squared: 0.5529 F-statistic: 9.656 on 2 and 12 DF, p-value: 0.003169

结果比较

stargazer(result[[1]],result[[2]],result[[3]], result[[4]],result[[5]],result[[6]], result[[7]],type = "text") # ================================================================================================== # Dependent variable: # -------------------------------------------------------------------- # y # (1) (2) (3) (4) (5) (6) (7) # -------------------------------------------------------------------------------------------------- # x1 0.084** 0.084** 0.083** 0.081** 0.079** 0.076** 0.072** # (0.030) (0.030) (0.030) (0.030) (0.031) (0.031) (0.031) # # x2 2.569*** 2.529*** 2.500** 2.489** 2.494** 2.500** 2.489** # (0.763) (0.798) (0.856) (0.931) (1.009) (1.073) (1.115) # # Constant -67.813 -63.687 -60.808 -59.035 -57.982 -56.734 -54.354 # (67.339) (57.937) (54.086) (53.822) (55.268) (56.900) (57.823) # # -------------------------------------------------------------------------------------------------- # Observations 15 15 15 15 15 15 15 # R2 0.786 0.789 0.776 0.748 0.708 0.662 0.617 # Adjusted R2 0.750 0.754 0.739 0.707 0.660 0.606 0.553 # Residual Std. Error (df = 12) 16.225 4.412 1.262 0.381 0.121 0.040 0.013 # F Statistic (df = 2; 12) 22.025*** 22.403*** 20.802*** 17.853*** 14.579*** 11.754*** 9.656*** # ================================================================================================== # Note: *p


【本文地址】


今日新闻


推荐新闻


CopyRight 2018-2019 办公设备维修网 版权所有 豫ICP备15022753号-3