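I want to partition employee records by department. When I run the load command to insert data into my partitioned table using Hive static partitioning, the department value of every record is changed to the partition value I specified in the load data command, and all records are loaded into that one partition.
My data in HDFS looks like this: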
1,adi,Admin,50000,A
2,Gokul,Admin,50000,B
3,Janet,Sales,60000,A
4,Hari,Admin,50000,C
5,Sanker,Admin,50000,C
6,Margaret,Tech,12000,A
7,Nirmal,Tech,12000,B
8,jinju,Engineer,45000,B
9,Nancy,Admin,50000,A
10,Andrew,Manager,40000,A
11,Arun,Manager,40000,B
12,Harish,Sales,60000,B
13,Robert,Manager,40000,A
14,Laura,Engineer,45000,A
15,Anju,Ceo,100000,B
16,Aarathi,Manager,40000,B
17,Parvathy,Engineer,45000,B
18,Gopika,Admin,50000,B
19,Steven,Engineer,45000,A
20,Michael,Ceo,100000,A
My partitioned table is defined as follows:
create table employee(
id string,
name string,
role string,
salary string)
partitioned by (dept string)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile;
Load command:
load data inpath '/user/adithyan/employee.txt' overwrite into table employee partition (dept='A');
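After I ran this command on the input data above, the records were loaded from HDFS, but every record was loaded with its department changed to 'A'.
Output: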
employee.id employee.name employee.degree employee.salary employee.dept
1 adi Admin 50000 A
2 Gokul Admin 50000 A
3 Janet Sales 60000 A
4 Hari Admin 50000 A
5 Sanker Admin 50000 A
6 Margaret Tech 12000 A
7 Nirmal Tech 12000 A
8 jinju Engineer 45000 A
9 Nancy Admin 50000 A
10 Andrew Manager 40000 A
11 Arun Manager 40000 A
12 Harish Sales 60000 A
13 Robert Manager 40000 A
14 Laura Engineer 45000 A
15 Anju Ceo 100000 A
16 Aarathi Manager 40000 A
17 Parvathy Engineer 45000 A
18 Gopika Admin 50000 A
19 Steven Engineer 45000 A
20 Michael Ceo 100000 A
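Every department has been changed to A, which is wrong. Can someone help me with how to load this data into my table so that it is partitioned correctly?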
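For context, a possible direction rather than a confirmed fix: load data with a static partition only moves the file into the dept=A directory without inspecting the rows, and the dept value shown by a query comes from the partition, not from the fifth field in the file. One common approach is to load the file into an unpartitioned staging table first and then use a dynamic-partition insert. The sketch below assumes a staging table named employee_staging (a hypothetical name) and that dynamic partitioning is allowed on the cluster.

```sql
-- Hypothetical staging table: same columns as employee,
-- but dept is an ordinary column read from the file.
create table employee_staging(
id string,
name string,
role string,
salary string,
dept string)
row format delimited
fields terminated by ','
lines terminated by '\n'
stored as textfile;

-- Move the raw file into the staging table (no partition clause).
load data inpath '/user/adithyan/employee.txt' overwrite into table employee_staging;

-- Enable dynamic partitioning for this session.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Hive takes the partition value from the last column of the select,
-- so each row lands in the partition matching its own dept.
insert overwrite table employee partition (dept)
select id, name, role, salary, dept from employee_staging;
```

With the sample data above, this should produce three partitions (dept=A, dept=B and dept=C) instead of a single dept=A partition.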