The Four Sorting Clauses in Hive (order by, sort by, distribute by, cluster by): Usage and Differences Explained


2024-07-12 23:37 | Source: compiled from the web

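Sorting comes up constantly in day-to-day Hive data warehouse development. Hive supports four sorting clauses; this article uses a concrete example to walk through how each one is used and how they differ: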

order by
sort by
distribute by
cluster by

Preparation:

Create a test table, employInfo:

create table employInfo(deptID int,employID int,employName string,employSalary double) row format delimited fields terminated by ',';

Load the test data into the table:

load data local inpath '/home/hadoop/datas/employInfo.txt' into table employInfo;
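One caveat worth checking: the sample file listed below begins with a column-header line. Loaded as-is, that line becomes a row whose deptID, employID, and employSalary values are NULL, because the header text cannot be parsed as numbers. A possible variant of the table definition is sketched here, assuming Hive 0.13 or later, where the skip.header.line.count table property is available. This variant is not part of the original walkthrough.

```sql
-- Assumption: Hive 0.13+; this table property tells Hive to ignore
-- the first line of each data file when reading the table.
create table employInfo(
  deptID int,
  employID int,
  employName string,
  employSalary double
)
row format delimited fields terminated by ','
tblproperties ("skip.header.line.count"="1");
```

Alternatively, the header line can simply be removed from employInfo.txt before loading.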

The test data is as follows:

[hadoop@weekend110 datas]$ cat employInfo.txt
deptID,employID,employName,employSalary
1,1001,Jack01,5000
1,1002,Jack02,5001
1,1003,Jack03,5002
1,1004,Jack04,5003
1,1005,Jack05,5004
1,1006,Jack06,5005
1,1007,Jack07,5006
1,1008,Jack08,5007
1,1009,Jack09,5008
1,1010,Jack10,5009
1,1011,Jack11,5010
1,1012,Jack12,5011
2,1013,Maria01,7500
2,1014,Maria02,7501
2,1015,Maria03,7502
2,1016,Maria04,7503
2,1017,Maria05,7504
2,1018,Maria06,7505
2,1019,Maria07,7506
2,1020,Maria08,7507
2,1021,Maria09,7508
3,1022,Lucy01,8540
3,1023,Lucy02,8541
3,1024,Lucy03,8542
3,1025,Lucy04,8543
3,1026,Lucy05,8544
3,1027,Lucy06,8545
3,1028,Lucy07,8546
3,1029,Lucy08,8547
3,1030,Lucy09,8548
3,1031,Lucy10,8549
3,1032,Lucy11,8550
3,1033,Lucy12,8551
4,1034,Jimmy01,10000
4,1035,Jimmy02,10001
4,1036,Jimmy03,10002
4,1037,Jimmy04,10003
4,1038,Jimmy05,10004
4,1039,Jimmy06,10005
4,1040,
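With the employInfo table loaded, the four clauses named in the title can be tried directly. The queries below are a minimal sketch, not the article's own examples; the reducer count of 3 and the choice of sort and distribution columns are assumptions made for illustration.

```sql
-- Assumption: 3 reducers, so that per-reducer behavior is visible.
set mapreduce.job.reduces=3;

-- order by: a total order over the entire result set.
-- All rows pass through a single reducer, which can be slow on large data.
select * from employInfo order by employSalary desc;

-- sort by: rows are sorted within each reducer only,
-- so the combined output is not globally ordered.
select * from employInfo sort by employSalary desc;

-- distribute by: rows with the same deptID are sent to the same reducer.
-- It does not sort by itself; pair it with sort by to order rows per reducer.
select * from employInfo distribute by deptID sort by employSalary desc;

-- cluster by: shorthand for "distribute by X sort by X" on the same column.
-- Only ascending order is supported.
select * from employInfo cluster by deptID;
```

With only one reducer, sort by and order by produce the same ordering, which is why the sketch raises the reducer count before comparing them.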

