Stata Study Notes: Grouped Statistics and Grouped Regression
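1. Grouped statistics
(1) Summarizing a single categorical variable
The tabulate command
tabulate varname
//one-way table of frequencies (documented under "tabulate oneway")
tabulate varname1 varname2
//two-way table of frequencies (documented under "tabulate twoway")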
The graph bar command
graph bar yvars [if] [in] [weight] [, options]
*graph bar draws vertical bar charts.
*In a vertical bar chart, the y axis is numerical, and the x axis is categorical.
graph hbar yvars [if] [in] [weight] [, options]
*graph hbar draws horizontal bar charts.
*In a horizontal bar chart, the numerical axis is still called the y axis,
*and the categorical axis is still called the x axis,
*but y is presented horizontally, and x vertically.
sysuse nlsw88.dta, clear
tabulate industry
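As a minimal sketch of how tabulate and graph bar might be combined on the nlsw88 example data, the lines below cross-tabulate industry against college graduation and then plot mean wages by group. The choice of wage as the y variable, the row option, and the over() groupings are illustrative assumptions and do not come from the original notes.

```stata
* Load the example dataset that ships with Stata
sysuse nlsw88, clear

* One-way frequency table of industry
tabulate industry

* Two-way table of industry by college graduate status, with row percentages
tabulate industry collgrad, row

* Vertical bars: mean wage for each value of collgrad
graph bar (mean) wage, over(collgrad)

* Horizontal bars: mean wage for each industry
* (hbar leaves more room for long category labels)
graph hbar (mean) wage, over(industry)
```

With many categories, as for industry, graph hbar is usually easier to read than graph bar because the labels do not overlap.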
The tabplot command
Description
tabplot plots a table of numerical values (for example, frequencies, fractions, or percents) in graphical form as a bar chart. It is mainly intended for representing contingency tables for one, two, or three categorical variables. It also has uses for producing multiple histograms and graphs for general one-, two-, or three-way tables.
tabplot industry collgrad
The showval option
showval specifies that numeric values be shown beneath (or if horizontal is specified, to the left of) bars.
tabplot, showval(offset(#))
//specifies an offset between the base (or left-hand edge) of the bar
//and the position of the numeric value. Default is 0.1
//with two variables or 0.02 with one variable.
The percent() option
percent(varlist) indicates that all frequencies should be shown as percents (with sum 100) of the total frequency for each distinct category defined by the combinations of varlist. Usually, varlist will be one or more of the variables specified.
tabplot industry collgrad, horizontal ///
    percent(collgrad) showval(offset(0.05))
The tabstat command
sysuse auto, clear
tabstat price weight mpg rep78
//Show the mean (by default) of price, weight, mpg, and rep78
The quantiles command
Description
quantiles creates a new variable (newvar) that categorizes varname by its quantiles. It differs from xtile because the categories are defined by the ideal size of the quantile rather than by the cutpoints, therefore yielding less unequally sized categories when the cutpoint value is frequent, when using weights, or when the number of observations in the dataset is not a multiple of the number of quantiles. It may not work properly on small datasets or if calculated for small groups. If the number of observations in the dataset or group is small, tabulate newvar afterwards to check the results.
quantiles varname, gencatvar(newvar)
//gencatvar(newvar) specifies the new variable to be generated
quantiles varname, gencatvar(newvar) nquant(#)
//nquant(#) specifies the number of quantiles; # must be an integer greater than or equal to 2.
//If not specified, the default is 10 quantiles.
bysort industry: quantiles wage, gen(p_wage) nq(3)
//After grouping by industry, split wage within each industry into 3 groups at the 33rd and 66th percentiles.
//The result is a categorical variable p_wage that takes the values 1, 2, 3 within each industry.
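For comparison, here is a rough sketch using only official Stata commands: tabstat with by() for grouped summaries, and xtile for cutpoint-based quantile groups. The variable name q_wage and the chosen statistics are hypothetical and serve only as illustration. Note that xtile cannot be combined with the by: prefix, so this version splits wage over the whole sample rather than within each industry.

```stata
* Grouped summary statistics: mean, standard deviation, and count by car origin
sysuse auto, clear
tabstat price weight mpg, by(foreign) statistics(mean sd n)

* Cutpoint-based terciles of wage over the whole sample
* (q_wage is a hypothetical variable name)
sysuse nlsw88, clear
xtile q_wage = wage, nquantiles(3)

* Check how evenly the groups are sized and what range each one covers
tabulate q_wage
tabstat wage, by(q_wage) statistics(n mean min max)
```

Tabulating the new variable afterwards, as the description above recommends for quantiles, is a quick way to see whether ties at a cutpoint have produced groups of unequal size.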