2016-10-10 43 views
child = load 'file_name' using PigStorage('\t') as (child_code : chararray, child_id : int, child_precode_id : int); 
parents = load 'file_name' using PigStorage('\t') as (child_id : int, child_internal_id : chararray, mother_id : int, father_id : int); 
joined = JOIN child by child_id, parents by child_id; 
mainparent = FOREACH joined GENERATE child_id as child_id_source, child_precode_id, child_code; 
store parent into '(location of file)' using PigStorage('\t'); 
childfirst = JOIN mainparent by (child_id_source), parents by (mother_id OR father_id); 
firstgen = FOREACH childfirst GENERATE child_id, child_precode_id, child_code; 
store firstgen into 'file_location' using PigStorage('\t'); 

When I use an OR condition in the Pig JOIN, I get the following error during parsing:

ERROR org.apache.pig.PigServer - exception during parsing. Pig script failed to parse: NoViableAltException(91@[]) Failed to parse: Pig script failed to parse: NoViableAltException(91@[])

Answers


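The syntax below is incorrect. Pig has no conditional join: the BY clause of a JOIN takes key fields, not a boolean condition such as OR.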

childfirst = JOIN mainparent by (child_id_source), parents by (mother_id OR father_id); 

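If you want to join a relation on two keys against another relation on one key, create two joins and UNION the results. Note that you may need to apply DISTINCT to the resulting relation.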

childfirst = JOIN mainparent by (child_id_source), parents by (mother_id); 
childfirst1 = JOIN mainparent by (child_id_source), parents by (father_id); 
childfirst2 = UNION childfirst,childfirst1; 
childfirst3 = DISTINCT childfirst2; 
firstgen = FOREACH childfirst3 GENERATE child_id, child_precode_id, child_code; 
store firstgen into 'file_location' using PigStorage('\t'); 
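
For comparison, here is a minimal sketch of another way to express the same OR condition: CROSS followed by FILTER. It assumes `mainparent` and `parents` are already defined as in the question; the alias `pairs` is made up for illustration. Each matching pair is emitted once even when both `mother_id` and `father_id` match, so this variant should not need DISTINCT for that case. CROSS builds every pairing of the two relations, though, so on large inputs the two-join version above is usually the better choice.

```pig
-- Sketch only: assumes mainparent and parents exist as defined in the question.
-- 'pairs' is a hypothetical alias; 'file_location' is the question's placeholder.

-- Pair every mainparent row with every parents row (expensive on large data).
pairs = CROSS mainparent, parents;

-- Keep only the pairs where the source id matches either parent id.
childfirst = FILTER pairs BY (child_id_source == mother_id) OR (child_id_source == father_id);

-- Same projection and output as in the answer above.
firstgen = FOREACH childfirst GENERATE child_id, child_precode_id, child_code;
store firstgen into 'file_location' using PigStorage('\t');
```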

This worked, and it also solved the problem of getting duplicates. Thanks – Venkat