2017-03-04 72 views

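Hive static partition loads all records

I want to partition employee records by department. When I run the load command to insert data into my partitioned table, the department value of every record is changed to the partition value I specified in the load data command.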

My data in HDFS looks like this:

1,adi,Admin,50000,A 
2,Gokul,Admin,50000,B 
3,Janet,Sales,60000,A 
4,Hari,Admin,50000,C 
5,Sanker,Admin,50000,C 
6,Margaret,Tech,12000,A 
7,Nirmal,Tech,12000,B 
8,jinju,Engineer,45000,B 
9,Nancy,Admin,50000,A 
10,Andrew,Manager,40000,A 
11,Arun,Manager,40000,B 
12,Harish,Sales,60000,B 
13,Robert,Manager,40000,A 
14,Laura,Engineer,45000,A 
15,Anju,Ceo,100000,B 
16,Aarathi,Manager,40000,B 
17,Parvathy,Engineer,45000,B 
18,Gopika,Admin,50000,B 
19,Steven,Engineer,45000,A 
20,Michael,Ceo,100000,A 

My partitioned table is defined as follows:

create table employee(
id string, 
name string, 
role string, 
salary string) 
partitioned by (dept string) 
row format delimited 
fields terminated by ',' 
lines terminated by '\n' 
stored as textfile; 

Load command:
load data inpath '/user/adithyan/employee.txt' overwrite into table employee partition (dept='A'); 

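After I ran this command on the input data above, the records were loaded from HDFS, but every record was loaded with its department changed to 'A'.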

Output:

employee.id employee.name employee.role employee.salary employee.dept
1 adi Admin 50000 A 
2 Gokul Admin 50000 A 
3 Janet Sales 60000 A 
4 Hari Admin 50000 A 
5 Sanker Admin 50000 A 
6 Margaret Tech 12000 A 
7 Nirmal Tech 12000 A 
8 jinju Engineer 45000 A 
9 Nancy Admin 50000 A 
10 Andrew Manager 40000 A 
11 Arun Manager 40000 A 
12 Harish Sales 60000 A 
13 Robert Manager 40000 A 
14 Laura Engineer 45000 A 
15 Anju Ceo 100000 A 
16 Aarathi Manager 40000 A 
17 Parvathy Engineer 45000 A 
18 Gopika Admin 50000 A 
19 Steven Engineer 45000 A 
20 Michael Ceo 100000 A 

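Every department has been changed to A, which is not right. Can someone help me insert the data into my table so that it is partitioned by department?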

Answer


It seems that LOAD DATA is not the right tool for this job.

Load operations are currently pure copy/move operations that move datafiles into locations corresponding to Hive tables.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Loadingfilesintotables
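
To see this behaviour on the table from the question, a few inspection statements may help. This is a minimal sketch that assumes the `employee` table and the `load data ... partition (dept='A')` statement from the question have already been run; the `row_count` alias is only illustrative, and nothing here modifies data.

```sql
-- List the partitions that exist for the table. After the load in the
-- question, only dept=A is expected to be present.
show partitions employee;

-- Show the partition's metadata, including the HDFS location that
-- employee.txt was moved into.
describe formatted employee partition (dept='A');

-- Count rows per partition value. dept is taken from the partition
-- directory, not from the fifth field in the file, so all rows share it.
select dept, count(*) as row_count
from employee
group by dept;
```

Because the file is moved as-is, Hive never reads the fifth field to decide where a row belongs, which is why every row reports the department named in the load statement.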

However, you can achieve your goal by using an external table.

create external table employee_ext 
(
    id  string 
    ,name string 
    ,role string 
    ,salary string 
    ,dept string 
) 
row format delimited 
fields terminated by ',' 
lines terminated by '\n' 
stored as textfile 
location '/user/adithyan/' 
; 

insert into employee partition (dept) select * from employee_ext; 

select * from employee 
; 

+-------------+---------------+---------------+-----------------+---------------+
| employee.id | employee.name | employee.role | employee.salary | employee.dept |
+-------------+---------------+---------------+-----------------+---------------+
| 1           | adi           | Admin         | 50000           | A             |
| 3           | Janet         | Sales         | 60000           | A             |
| 6           | Margaret      | Tech          | 12000           | A             |
| 9           | Nancy         | Admin         | 50000           | A             |
| 10          | Andrew        | Manager       | 40000           | A             |
| 13          | Robert        | Manager       | 40000           | A             |
| 14          | Laura         | Engineer      | 45000           | A             |
| 19          | Steven        | Engineer      | 45000           | A             |
| 20          | Michael       | Ceo           | 100000          | A             |
| 2           | Gokul         | Admin         | 50000           | B             |
| 7           | Nirmal        | Tech          | 12000           | B             |
| 8           | jinju         | Engineer      | 45000           | B             |
| 11          | Arun          | Manager       | 40000           | B             |
| 12          | Harish        | Sales         | 60000           | B             |
| 15          | Anju          | Ceo           | 100000          | B             |
| 16          | Aarathi       | Manager       | 40000           | B             |
| 17          | Parvathy      | Engineer      | 45000           | B             |
| 18          | Gopika        | Admin         | 50000           | B             |
| 4           | Hari          | Admin         | 50000           | C             |
| 5           | Sanker        | Admin         | 50000           | C             |
+-------------+---------------+---------------+-----------------+---------------+
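
The `insert ... partition (dept)` statement above is a dynamic-partition insert. Depending on the Hive version and configuration, it may be rejected unless dynamic partitioning is enabled in nonstrict mode. The following is a minimal sketch of that variant, assuming the `employee` and `employee_ext` tables defined above. The explicit column list is an addition to make the column order visible, and the two settings may already be in effect on your cluster.

```sql
-- Allow dynamic partitioning for this session.
set hive.exec.dynamic.partition=true;
-- Strict mode requires at least one static partition column;
-- nonstrict lets every partition column be resolved from the data.
set hive.exec.dynamic.partition.mode=nonstrict;

-- Run this in place of the plain insert, not in addition to it,
-- otherwise the rows are inserted twice.
-- The dynamic partition column (dept) must come last in the select list.
insert into table employee partition (dept)
select id, name, role, salary, dept
from employee_ext;
```

Listing the columns explicitly avoids relying on `select *` returning `dept` in the last position, which is what Hive uses to pick the target partition.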