[Python Econometrics] Panel Data Regression with linearmodels (Part 2)


2024-07-10 07:09 | Source: compiled from the web

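Contents

1. Import the libraries
2. Load the panel data
3. Entity fixed effects: (1) PanelOLS, (2) smf.ols
4. Time fixed effects: (1) PanelOLS, (2) smf.ols
5. Entity and time fixed effects: (1) PanelOLS, (2) smf.ols

In this article I use the Grunfeld dataset (available in statsmodels.datasets) to demonstrate how to estimate fixed effects models.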

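The dataset contains 20 years of data (1935 to 1954) for each of 11 firms: IBM, General Electric, US Steel, Atlantic Refining, Diamond Match, Westinghouse, General Motors, Goodyear, Chrysler, Union Oil, and American Steel.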

The model is:

$$invest_{it}=\beta_0+\beta_1 value_{it}+\beta_2 capital_{it}+a_i+\phi_t+u_{it}$$

Here $a_i$ is the firm-specific factor, which linearmodels calls entity_effects, and $\phi_t$ is the time factor, which it calls time_effects.

The same model can be written with dummy variables, where $D_j$ is the dummy variable for firm $j$ (equal to 1 for observations of firm $j$) and $I_s$ is the dummy variable for year $s$ (equal to 1 for observations in year $s$). With $N$ firms and $h$ years, one dummy of each kind is omitted to avoid perfect collinearity with the intercept:

$$invest_{it}=\beta_0+\beta_1 value_{it}+\beta_2 capital_{it}+\sum_{j=1}^{N-1}\theta_j D_j+\sum_{s=1}^{h-1}v_s I_s+u_{it}$$
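To make the dummy-variable form concrete, here is a minimal sketch that builds the firm and period dummies for a toy panel with NumPy. The panel size (3 firms, 4 periods) and all variable names are assumptions chosen for illustration and are not part of the Grunfeld example. It shows why only $N-1$ firm dummies and $h-1$ period dummies are used alongside the intercept.

```python
import numpy as np

N, h = 3, 4                                  # hypothetical toy panel: 3 firms, 4 periods
firm = np.repeat(np.arange(N), h)            # firm id per row: 0,0,0,0,1,1,1,1,...
period = np.tile(np.arange(h), N)            # period id per row: 0,1,2,3,0,1,2,3,...

# One indicator column per firm and one per period
D_full = (firm[:, None] == np.arange(N)).astype(float)      # shape (12, 3)
I_full = (period[:, None] == np.arange(h)).astype(float)    # shape (12, 4)
const = np.ones((N * h, 1))

# A constant plus ALL dummies is rank deficient:
# the firm dummies sum to the constant, and so do the period dummies.
X_full = np.hstack([const, D_full, I_full])

# Dropping one firm dummy and one period dummy restores full column rank.
X = np.hstack([const, D_full[:, 1:], I_full[:, 1:]])

print("all dummies:    ", X_full.shape, "rank", np.linalg.matrix_rank(X_full))
print("N-1, h-1 dummies:", X.shape, "rank", np.linalg.matrix_rank(X))
```

Both matrices have rank $1+(N-1)+(h-1)$, so the two extra columns in the full version carry no additional information.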

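1. Import the libraries

```python
from statsmodels.datasets import grunfeld
from linearmodels.panel import PanelOLS
import pandas as pd
import statsmodels.formula.api as smf
```

2. Load the panel data

```python
data = grunfeld.load_pandas().data
# Set a (firm, year) MultiIndex; drop=False also keeps firm and year as columns
data = data.set_index(["firm", "year"], drop=False)
```

3. Entity fixed effects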

The model is:

$$invest_{it}=\beta_0+\beta_1 value_{it}+\beta_2 capital_{it}+a_i+u_{it}$$

Here $a_i$ is the firm-specific factor, which linearmodels calls entity_effects.

Written with dummy variables, where $D_j$ is the dummy variable for firm $j$:

$$invest_{it}=\beta_0+\beta_1 value_{it}+\beta_2 capital_{it}+\sum_{j=1}^{N-1}\theta_j D_j+u_{it}$$
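The fixed effects form and the dummy-variable form give the same slope estimates. The sketch below illustrates this on synthetic data using only NumPy. The data-generating process, the panel size, and the helper `demean_by_firm` are assumptions made up for this illustration and have nothing to do with the Grunfeld data. It compares least squares with firm dummies against least squares on firm-demeaned data.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 8                                    # hypothetical toy panel
firm = np.repeat(np.arange(N), T)              # rows sorted by firm, T rows each

a = rng.normal(size=N)[firm]                   # firm effects a_i
X = rng.normal(size=(N * T, 2)) + a[:, None]   # regressors correlated with a_i
y = 1.0 + X @ np.array([0.5, -0.3]) + a + 0.1 * rng.normal(size=N * T)

# (a) Dummy-variable regression: constant, N-1 firm dummies, regressors
D = (firm[:, None] == np.arange(1, N)).astype(float)
Z = np.hstack([np.ones((N * T, 1)), D, X])
beta_dummies = np.linalg.lstsq(Z, y, rcond=None)[0][-2:]

# (b) Within regression: subtract each firm's mean, no constant
def demean_by_firm(v):
    blocks = v.reshape(N, T, -1)               # one block of T rows per firm
    return (blocks - blocks.mean(axis=1, keepdims=True)).reshape(v.shape)

beta_within = np.linalg.lstsq(demean_by_firm(X), demean_by_firm(y), rcond=None)[0]

print("dummy-variable slopes:", beta_dummies)
print("within slopes:        ", beta_within)
```

The two slope vectors agree up to floating-point error, which is why the PanelOLS and smf.ols approaches in this section report the same coefficients on value and capital.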

(1) PanelOLS

```python
# Entity fixed effects: array-based interface
exog = data[['value', 'capital']]
res_fe = PanelOLS(data['invest'], exog, entity_effects=True)
results_fe = res_fe.fit()
print(results_fe)

# Entity fixed effects: formula interface
res_fe = PanelOLS.from_formula('invest ~ value + capital + EntityEffects', data=data)
results_fe = res_fe.fit()
print(results_fe)
```

The array-based and formula-based versions return the same results:

```
                          PanelOLS Estimation Summary
================================================================================
Dep. Variable:                 invest   R-squared:                        0.7667
Estimator:                   PanelOLS   R-squared (Between):              0.8223
No. Observations:                 220   R-squared (Within):               0.7667
Date:                Wed, Jul 20 2022   R-squared (Overall):              0.8132
Time:                        15:55:39   Log-likelihood                   -1167.4
Cov. Estimator:            Unadjusted
                                        F-statistic:                      340.08
Entities:                          11   P-value                           0.0000
Avg Obs:                       20.000   Distribution:                   F(2,207)
Min Obs:                       20.000
Max Obs:                       20.000   F-statistic (robust):             340.08
                                        P-value                           0.0000
Time periods:                      20   Distribution:                   F(2,207)
Avg Obs:                       11.000
Min Obs:                       11.000
Max Obs:                       11.000

                             Parameter Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
capital        0.3100     0.0165     18.744     0.0000      0.2774      0.3426
value          0.1101     0.0113     9.7461     0.0000      0.0879      0.1324
==============================================================================

F-test for Poolability: 49.207
P-value: 0.0000
Distribution: F(10,207)

Included effects: Entity
```

(2) smf.ols

```python
# Estimate by OLS, adding firm dummy variables
res_ols = smf.ols('invest ~ value + capital + firm', data=data)
# Equivalent: res_ols = smf.ols('invest ~ value + capital + C(firm)', data=data)
results_ols = res_ols.fit()
print(results_ols.summary())
```

The results are shown below. The coefficients on value and capital match the PanelOLS estimates.

```
                            OLS Regression Results
==============================================================================
Dep. Variable:                 invest   R-squared:                       0.946
Model:                            OLS   Adj. R-squared:                  0.943
Method:                 Least Squares   F-statistic:                     302.6
Date:                Wed, 20 Jul 2022   Prob (F-statistic):          4.77e-124
Time:                        17:33:36   Log-Likelihood:                -1167.4
No. Observations:                 220   AIC:                             2361.
Df Residuals:                     207   BIC:                             2405.
Df Model:                          12
Covariance Type:            nonrobust
===========================================================================================
                                coef    std err          t      P>|t|     [0.025     0.975]
-------------------------------------------------------------------------------------------
Intercept                   -20.5782     11.298     -1.821      0.070    -42.852      1.695
firm[T.Atlantic Refining]   -94.0243     17.164     -5.478      0.000   -127.862    -60.186
firm[T.Chrysler]             -7.2309     17.338     -0.417      0.677    -41.413     26.951
firm[T.Diamond Match]        14.0102     15.944      0.879      0.381    -17.422     45.443
firm[T.General Electric]   -214.9912     25.461     -8.444      0.000   -265.188   -164.795
firm[T.General Motors]      -49.7209     48.280     -1.030      0.304   -144.905     45.463
firm[T.Goodyear]            -66.6363     16.379     -4.068      0.000    -98.927    -34.346
firm[T.IBM]                  -2.5820     16.379     -0.158      0.875    -34.873     29.709
firm[T.US Steel]            122.4829     25.960      4.718      0.000     71.304    173.662
firm[T.Union Oil]           -45.9660     16.357     -2.810      0.005    -78.215    -13.717
firm[T.Westinghouse]        -36.9683     17.309     -2.136      0.034    -71.093     -2.843
value                         0.1101      0.011      9.746      0.000      0.088      0.132
capital                       0.3100      0.017     18.744      0.000      0.277      0.343
==============================================================================
Omnibus:                       35.893   Durbin-Watson:                   1.079
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              243.455
Skew:                           0.297   Prob(JB):                     1.36e-53
Kurtosis:                       8.119   Cond. No.                     2.98e+04
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 2.98e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
```
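In the OLS summary above, each firm dummy coefficient is measured relative to the omitted baseline firm. Ten dummies are listed for eleven firms, and the firm without a dummy is American Steel. A small sketch of how the implied firm-specific intercepts could be recovered from those printed coefficients follows. The numbers are copied from the summary, and the dictionary names are chosen here for illustration only.

```python
# Coefficients copied from the OLS summary (entity fixed effects via firm dummies)
intercept = -20.5782          # intercept of the baseline firm (American Steel)
firm_dummies = {
    "Atlantic Refining": -94.0243,
    "Chrysler": -7.2309,
    "Diamond Match": 14.0102,
    "General Electric": -214.9912,
    "General Motors": -49.7209,
    "Goodyear": -66.6363,
    "IBM": -2.5820,
    "US Steel": 122.4829,
    "Union Oil": -45.9660,
    "Westinghouse": -36.9683,
}

# Implied intercept of each firm = baseline intercept + that firm's dummy coefficient
firm_intercepts = {"American Steel": intercept}
for name, coef in firm_dummies.items():
    firm_intercepts[name] = intercept + coef

# Print from lowest to highest intercept
for name, value in sorted(firm_intercepts.items(), key=lambda kv: kv[1]):
    print(f"{name:20s} {value:10.4f}")
```

These intercepts play the role of $\beta_0 + a_i$ in the model. PanelOLS removes them before estimation rather than reporting them in its parameter table.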

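The same estimates can also be obtained by time-demeaning: subtract each firm's average over time from every variable, then run OLS on the demeaned data.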

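```python
# Reload into a separate frame so that `data`, which keeps firm and year as columns, is left untouched
data_w = grunfeld.load_pandas().data
# Set the index; drop=True (the default) keeps firm and year only in the index
data_w = data_w.set_index(["firm", "year"])
# Subtract each firm's time average from the dependent and explanatory variables
data_w['invest_w'] = data_w['invest'] - data_w.groupby('firm')['invest'].transform('mean')
data_w['value_w'] = data_w['value'] - data_w.groupby('firm')['value'].transform('mean')
data_w['capital_w'] = data_w['capital'] - data_w.groupby('firm')['capital'].transform('mean')
# Estimate the demeaned equation by OLS without a constant
results_man = smf.ols('invest_w ~ 0 + value_w + capital_w', data_w).fit()
print(results_man.summary())
```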

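The results are shown below. The coefficients match the fixed effects estimates, but the t-statistics are slightly larger: OLS on the demeaned data uses 218 residual degrees of freedom instead of 207, so it does not account for the 11 firm means that were removed.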

```
                                 OLS Regression Results
=======================================================================================
Dep. Variable:               invest_w   R-squared (uncentered):                   0.767
Model:                            OLS   Adj. R-squared (uncentered):              0.765
Method:                 Least Squares   F-statistic:                              358.2
Date:                Wed, 20 Jul 2022   Prob (F-statistic):                    1.28e-69
Time:                        17:58:17   Log-Likelihood:                         -1167.4
No. Observations:                 220   AIC:                                      2339.
Df Residuals:                     218   BIC:                                      2346.
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
value_w        0.1101      0.011     10.002      0.000       0.088       0.132
capital_w      0.3100      0.016     19.236      0.000       0.278       0.342
==============================================================================
Omnibus:                       35.893   Durbin-Watson:                   1.079
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              243.455
Skew:                           0.297   Prob(JB):                     1.36e-53
Kurtosis:                       8.119   Cond. No.                         1.74
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.
```

4. Time fixed effects

The model is:

$$invest_{it}=\beta_0+\beta_1 value_{it}+\beta_2 capital_{it}+\phi_t+u_{it}$$

Here $\phi_t$ is the time factor, which linearmodels calls time_effects.

Written with dummy variables, where $I_s$ is the dummy variable for year $s$:

$$invest_{it}=\beta_0+\beta_1 value_{it}+\beta_2 capital_{it}+\sum_{s=1}^{h-1}v_s I_s+u_{it}$$
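Time effects can be removed in the same spirit as firm effects, except that the averaging runs across firms within each period instead of across periods within each firm. The following sketch uses synthetic data. The data-generating process, the panel size, and the helper `demean_by_period` are assumptions for illustration only. It checks that period dummies and period demeaning give the same slope.

```python
import numpy as np

rng = np.random.default_rng(1)
N, h = 6, 5                                    # hypothetical toy panel
period = np.tile(np.arange(h), N)              # rows sorted by firm, then period

phi = rng.normal(size=h)[period]               # period effects phi_t
x = rng.normal(size=N * h) + phi               # regressor correlated with phi_t
y = 2.0 + 0.8 * x + phi + 0.1 * rng.normal(size=N * h)

# (a) Dummy-variable regression: constant, h-1 period dummies, x
I = (period[:, None] == np.arange(1, h)).astype(float)
Z = np.column_stack([np.ones(N * h), I, x])
slope_dummies = np.linalg.lstsq(Z, y, rcond=None)[0][-1]

# (b) Subtract each period's cross-sectional mean, then regress without a constant
def demean_by_period(v):
    grid = v.reshape(N, h)                     # rows: firms, columns: periods
    return (grid - grid.mean(axis=0, keepdims=True)).ravel()

xd, yd = demean_by_period(x), demean_by_period(y)
slope_demeaned = (xd @ yd) / (xd @ xd)

print("period-dummy slope:   ", slope_dummies)
print("period-demeaned slope:", slope_demeaned)
```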

(1) PanelOLS

```python
# Time fixed effects: array-based interface
exog = data[['value', 'capital']]
res_fe = PanelOLS(data['invest'], exog, time_effects=True)
results_fe = res_fe.fit()
print(results_fe)

# Time fixed effects: formula interface
res_fe = PanelOLS.from_formula('invest ~ value + capital + TimeEffects', data=data)
results_fe = res_fe.fit()
print(results_fe)
```

The array-based and formula-based versions return the same results:

```
                          PanelOLS Estimation Summary
================================================================================
Dep. Variable:                 invest   R-squared:                        0.8109
Estimator:                   PanelOLS   R-squared (Between):              0.8720
No. Observations:                 220   R-squared (Within):               0.7273
Date:                Wed, Jul 20 2022   R-squared (Overall):              0.8481
Time:                        17:40:21   Log-likelihood                   -1298.8
Cov. Estimator:            Unadjusted
                                        F-statistic:                      424.46
Entities:                          11   P-value                           0.0000
Avg Obs:                       20.000   Distribution:                   F(2,198)
Min Obs:                       20.000
Max Obs:                       20.000   F-statistic (robust):             424.46
                                        P-value                           0.0000
Time periods:                      20   Distribution:                   F(2,198)
Avg Obs:                       11.000
Min Obs:                       11.000
Max Obs:                       11.000

                             Parameter Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
capital        0.2166     0.0299     7.2436     0.0000      0.1577      0.2756
value          0.1158     0.0060     19.434     0.0000      0.1040      0.1275
==============================================================================

F-test for Poolability: 0.2419
P-value: 0.9996
Distribution: F(19,198)

Included effects: Time
```

(2) smf.ols

```python
# Estimate by OLS, adding year dummy variables
res_ols = smf.ols('invest ~ value + capital + C(year)', data=data)
results_ols = res_ols.fit()
print(results_ols.summary())
```

The results are:

```
                            OLS Regression Results
==============================================================================
Dep. Variable:                 invest   R-squared:                       0.822
Model:                            OLS   Adj. R-squared:                  0.803
Method:                 Least Squares   F-statistic:                     43.55
Date:                Wed, 20 Jul 2022   Prob (F-statistic):           1.27e-62
Time:                        17:41:37   Log-Likelihood:                -1298.8
No. Observations:                 220   AIC:                             2642.
Df Residuals:                     198   BIC:                             2716.
Df Model:                          21
Covariance Type:            nonrobust
===================================================================================
                        coef    std err          t      P>|t|     [0.025     0.975]
-----------------------------------------------------------------------------------
Intercept           -21.6815     28.354     -0.765      0.445    -77.597     34.234
C(year)[T.1936.0]   -15.1865     39.884     -0.381      0.704    -93.839     63.466
C(year)[T.1937.0]   -30.8415     39.958     -0.772      0.441   -109.640     47.957
C(year)[T.1938.0]   -25.9640     39.882     -0.651      0.516   -104.611     52.683
C(year)[T.1939.0]   -51.2476     39.902     -1.284      0.201   -129.936     27.441
C(year)[T.1940.0]   -27.5208     39.911     -0.690      0.491   -106.226     51.184
C(year)[T.1941.0]    -2.0012     39.928     -0.050      0.960    -80.739     76.737
C(year)[T.1942.0]    -0.3563     39.990     -0.009      0.993    -79.216     78.504
C(year)[T.1943.0]   -18.7958     39.997     -0.470      0.639    -97.671     60.079
C(year)[T.1944.0]   -19.4973     39.991     -0.488      0.626    -98.360     59.366
C(year)[T.1945.0]   -29.7423     40.002     -0.744      0.458   -108.627     49.142
C(year)[T.1946.0]    -6.1207     40.033     -0.153      0.879    -85.066     72.825
C(year)[T.1947.0]    -4.3649     40.312     -0.108      0.914    -83.860     75.130
C(year)[T.1948.0]    -2.8025     40.508     -0.069      0.945    -82.686     77.081
C(year)[T.1949.0]   -25.2951     40.683     -0.622      0.535   -105.522     54.932
C(year)[T.1950.0]   -24.9390     40.767     -0.612      0.541   -105.332     55.454
C(year)[T.1951.0]    -9.4694     40.792     -0.232      0.817    -89.912     70.973
C(year)[T.1952.0]    -3.8273     41.134     -0.093      0.926    -84.944     77.289
C(year)[T.1953.0]     4.0537     41.589      0.097      0.922    -77.961     86.068
C(year)[T.1954.0]    -9.3916     42.268     -0.222      0.824    -92.744     73.961
value                 0.1158      0.006     19.434      0.000      0.104      0.128
capital               0.2166      0.030      7.244      0.000      0.158      0.276
==============================================================================
Omnibus:                       33.290   Durbin-Watson:                   0.341
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              134.793
Skew:                           0.482   Prob(JB):                     5.37e-30
Kurtosis:                       6.711   Cond. No.                     3.42e+04
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 3.42e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
```

5. Entity and time fixed effects

The model is:

$$invest_{it}=\beta_0+\beta_1 value_{it}+\beta_2 capital_{it}+a_i+\phi_t+u_{it}$$

Here $a_i$ is the firm-specific factor (entity_effects) and $\phi_t$ is the time factor (time_effects).

Written with dummy variables, where $D_j$ is the dummy variable for firm $j$ and $I_s$ is the dummy variable for year $s$:

$$invest_{it}=\beta_0+\beta_1 value_{it}+\beta_2 capital_{it}+\sum_{j=1}^{N-1}\theta_j D_j+\sum_{s=1}^{h-1}v_s I_s+u_{it}$$
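With both kinds of effects in a balanced panel, the demeaning step becomes a two-way transformation: subtract the firm mean and the period mean, then add back the overall mean. The sketch below checks this against the dummy-variable regression on synthetic data. The data-generating process and the helper `two_way_demean` are assumptions for illustration, and the shortcut is exact only when every firm is observed in every period.

```python
import numpy as np

rng = np.random.default_rng(2)
N, h = 6, 5                                    # hypothetical balanced toy panel
firm = np.repeat(np.arange(N), h)              # rows sorted by firm, then period
period = np.tile(np.arange(h), N)

a = rng.normal(size=N)[firm]                   # firm effects a_i
phi = rng.normal(size=h)[period]               # period effects phi_t
x = rng.normal(size=N * h) + a + phi
y = 1.0 + 0.4 * x + a + phi + 0.1 * rng.normal(size=N * h)

# (a) Two-way within transformation (valid for balanced panels)
def two_way_demean(v):
    g = v.reshape(N, h)                        # rows: firms, columns: periods
    out = g - g.mean(axis=1, keepdims=True) - g.mean(axis=0, keepdims=True) + g.mean()
    return out.ravel()

xd, yd = two_way_demean(x), two_way_demean(y)
slope_within = (xd @ yd) / (xd @ xd)

# (b) Dummy-variable regression: constant, N-1 firm dummies, h-1 period dummies, x
D = (firm[:, None] == np.arange(1, N)).astype(float)
I = (period[:, None] == np.arange(1, h)).astype(float)
Z = np.column_stack([np.ones(N * h), D, I, x])
slope_dummies = np.linalg.lstsq(Z, y, rcond=None)[0][-1]

print("two-way within slope:", slope_within)
print("two-way dummy slope: ", slope_dummies)
```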

(1) PanelOLS

```python
# Entity + time fixed effects: array-based interface
exog = data[['value', 'capital']]
res_fe = PanelOLS(data['invest'], exog, entity_effects=True, time_effects=True)
results_fe = res_fe.fit()
print(results_fe)

# Entity + time fixed effects: formula interface
res_fe = PanelOLS.from_formula('invest ~ value + capital + EntityEffects + TimeEffects', data=data)
results_fe = res_fe.fit()
print(results_fe)
```

The array-based and formula-based versions return the same results:

```
                          PanelOLS Estimation Summary
================================================================================
Dep. Variable:                 invest   R-squared:                        0.7253
Estimator:                   PanelOLS   R-squared (Between):              0.7637
No. Observations:                 220   R-squared (Within):               0.7566
Date:                Wed, Jul 20 2022   R-squared (Overall):              0.7625
Time:                        17:46:42   Log-likelihood                   -1153.0
Cov. Estimator:            Unadjusted
                                        F-statistic:                      248.15
Entities:                          11   P-value                           0.0000
Avg Obs:                       20.000   Distribution:                   F(2,188)
Min Obs:                       20.000
Max Obs:                       20.000   F-statistic (robust):             248.15
                                        P-value                           0.0000
Time periods:                      20   Distribution:                   F(2,188)
Avg Obs:                       11.000
Min Obs:                       11.000
Max Obs:                       11.000

                             Parameter Estimates
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
capital        0.3514     0.0210     16.696     0.0000      0.3099      0.3930
value          0.1167     0.0129     9.0219     0.0000      0.0912      0.1422
==============================================================================

F-test for Poolability: 18.476
P-value: 0.0000
Distribution: F(29,188)

Included effects: Entity, Time
```
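The degrees of freedom printed in the three PanelOLS summaries in this article, F(2,207) and F(10,207), F(2,198) and F(19,198), F(2,188) and F(29,188), follow from simple counting. The sketch below reproduces them. The function name is made up for this illustration, and the interpretation in the comments is an assumption that happens to match the printed values, not a statement about the library's internals.

```python
n_obs, n_firms, n_years, n_slopes = 220, 11, 20, 2   # Grunfeld panel, two regressors

def fe_degrees_of_freedom(entity, time):
    """Return (residual df, numerator df of the poolability F-test)."""
    if entity and time:
        # One parameter is shared between the two sets of effects
        n_effects = n_firms + n_years - 1
    elif entity:
        n_effects = n_firms
    elif time:
        n_effects = n_years
    else:
        n_effects = 1                                # pooled model: a single intercept
    resid_df = n_obs - n_slopes - n_effects
    # Assumed reading: the effects are tested against a single common intercept
    pool_df = n_effects - 1
    return resid_df, pool_df

print("entity:       ", fe_degrees_of_freedom(True, False))
print("time:         ", fe_degrees_of_freedom(False, True))
print("entity + time:", fe_degrees_of_freedom(True, True))
```

The residual degrees of freedom also match the Df Residuals lines of the corresponding smf.ols summaries (207, 198, and 188).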

The same model can also be specified by mixing explicit dummy variables with built-in effects:

```python
# Entity + time fixed effects, array-based: firm dummies + time_effects
exog = data[['value', 'capital', 'firm']]
res_fe = PanelOLS(data['invest'], exog, time_effects=True)  # 10 dummies are created for the 11 firms
results_fe = res_fe.fit()
print(results_fe)

# Entity + time fixed effects, array-based: entity_effects + year dummies
data['year'] = pd.Categorical(data.year)  # convert the numeric year to a categorical
exog = data[['value', 'capital', 'year']]
res_fe = PanelOLS(data['invest'], exog, entity_effects=True)  # 19 dummies are created for the 20 years
results_fe = res_fe.fit()
print(results_fe)

# Entity + time fixed effects, formula: firm dummies + TimeEffects
# Drawback: 11 dummies are created for the 11 firms
res_fe = PanelOLS.from_formula('invest ~ value + capital + firm + TimeEffects', data=data)
results_fe = res_fe.fit()
print(results_fe)

# Entity + time fixed effects, formula: EntityEffects + year dummies
# Drawback: 20 dummies are created for the 20 years
res_fe = PanelOLS.from_formula('invest ~ value + capital + EntityEffects + C(year)', data=data)
results_fe = res_fe.fit()
print(results_fe)
```

(2) smf.ols

```python
# Estimate by OLS, adding both firm and year dummy variables
res_ols = smf.ols('invest ~ value + capital + firm + C(year)', data=data)
results_ols = res_ols.fit()
print(results_ols.summary())
```

The results are:

```
                            OLS Regression Results
==============================================================================
Dep. Variable:                 invest   R-squared:                       0.953
Model:                            OLS   Adj. R-squared:                  0.945
Method:                 Least Squares   F-statistic:                     122.1
Date:                Wed, 20 Jul 2022   Prob (F-statistic):          5.20e-108
Time:                        17:47:55   Log-Likelihood:                -1153.0
No. Observations:                 220   AIC:                             2370.
Df Residuals:                     188   BIC:                             2479.
Df Model:                          31
Covariance Type:            nonrobust
===========================================================================================
                                coef    std err          t      P>|t|     [0.025     0.975]
-------------------------------------------------------------------------------------------
Intercept                    18.0876     18.656      0.970      0.334    -18.715     54.890
firm[T.Atlantic Refining]  -112.5008     17.752     -6.337      0.000   -147.520    -77.482
firm[T.Chrysler]            -13.5993     17.540     -0.775      0.439    -48.199     21.001
firm[T.Diamond Match]        16.4928     15.692      1.051      0.295    -14.462     47.448
firm[T.General Electric]   -241.0850     28.000     -8.610      0.000   -296.319   -185.851
firm[T.General Motors]     -101.7696     55.177     -1.844      0.067   -210.615      7.075
firm[T.Goodyear]            -77.9628     16.435     -4.744      0.000   -110.383    -45.543
firm[T.IBM]                  -6.4573     16.271     -0.397      0.692    -38.554     25.640
firm[T.US Steel]            100.5492     28.438      3.536      0.001     44.450    156.648
firm[T.Union Oil]           -56.7936     16.403     -3.462      0.001    -89.151    -24.436
firm[T.Westinghouse]        -41.7165     17.483     -2.386      0.018    -76.204     -7.229
C(year)[T.1936.0]           -16.9592     21.518     -0.788      0.432    -59.407     25.488
C(year)[T.1937.0]           -36.3756     22.364     -1.627      0.106    -80.492      7.741
C(year)[T.1938.0]           -35.6237     21.162     -1.683      0.094    -77.370      6.122
C(year)[T.1939.0]           -63.0994     21.505     -2.934      0.004   -105.522    -20.677
C(year)[T.1940.0]           -39.8248     21.626     -1.842      0.067    -82.486      2.836
C(year)[T.1941.0]           -16.4878     21.529     -0.766      0.445    -58.957     25.982
C(year)[T.1942.0]           -17.9993     21.275     -0.846      0.399    -59.967     23.968
C(year)[T.1943.0]           -37.7724     21.415     -1.764      0.079    -80.016      4.471
C(year)[T.1944.0]           -38.3201     21.459     -1.786      0.076    -80.652      4.012
C(year)[T.1945.0]           -49.5395     21.687     -2.284      0.023    -92.322     -6.757
C(year)[T.1946.0]           -27.7544     21.866     -1.269      0.206    -70.888     15.379
C(year)[T.1947.0]           -34.8775     21.589     -1.616      0.108    -77.464      7.709
C(year)[T.1948.0]           -38.3307     21.734     -1.764      0.079    -81.204      4.542
C(year)[T.1949.0]           -65.2008     21.901     -2.977      0.003   -108.404    -21.998
C(year)[T.1950.0]           -67.3877     22.028     -3.059      0.003   -110.841    -23.935
C(year)[T.1951.0]           -54.8346     22.437     -2.444      0.015    -99.095    -10.574
C(year)[T.1952.0]           -56.4890     22.819     -2.475      0.014   -101.504    -11.474
C(year)[T.1953.0]           -58.5126     23.819     -2.457      0.015   -105.500    -11.525
C(year)[T.1954.0]           -81.7939     24.204     -3.379      0.001   -129.540    -34.047
value                         0.1167      0.013      9.022      0.000      0.091      0.142
capital                       0.3514      0.021     16.696      0.000      0.310      0.393
==============================================================================
Omnibus:                       32.466   Durbin-Watson:                   0.988
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              180.276
Skew:                           0.311   Prob(JB):                     7.14e-40
Kurtosis:                       7.391   Cond. No.                     3.92e+04
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 3.92e+04. This might indicate that there are
strong multicollinearity or other numerical problems.
```

Follow the WeChat official account: Python for Finance


