2014-09-23

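Shell script through Oozie

I am currently building a workflow that automatically imports data with sqoop. What I am trying to do is verify that the number of rows (records) imported by this process is accurate. The validate option that sqoop provides does not work here, because the sqoop job does not import a single table.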

I created an action named "validate" that calls and executes a shell script, also named validate. The script performs the following steps:

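  • Counts the rows in the source DB using sqoop eval and a free-form query, and stores the result
  • Reads the partition files in the HDFS subdirectory and runs a line count on each one; this is done in a loop
  • Deletes any partition file that has zero lines
  • Compares the two counts; if they differ it forces a non-zero exit code, and if they match it reports true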

However, when I run it I get the error below, which does not give me the information I need:

2014-09-22 19:03:59,156 INFO ShellActionExecutor:539 - USER[v523043] GROUP[-] TOKEN[-] APP[voipImportToHDFS] JOB[0000359-140905180027053-oozie-oozi-W] ACTION[[email protected]] action completed, external ID [null] 
2014-09-22 19:03:59,159 WARN ShellActionExecutor:542 - USER[v523043] GROUP[-] TOKEN[-] APP[voipImportToHDFS] JOB[0000359-140905180027053-oozie-oozi-W] ACTION[[email protected]] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1] 
2014-09-22 19:03:59,177 INFO ActionEndXCommand:539 - USER[v523043] GROUP[-] TOKEN[-] APP[voipImportToHDFS] JOB[0000359-140905180027053-oozie-oozi-W] ACTION[[email protected]] end executor for wf action 0000359-140905180027053-oozie-oozi-W with wf job 0000359-140905180027053-oozie-oozi-W 
2014-09-22 19:03:59,198 INFO ActionEndXCommand:539 - USER[v523043] GROUP[-] TOKEN[-] APP[voipImportToHDFS] JOB[0000359-140905180027053-oozie-oozi-W] ACTION[[email protected]] ERROR is considered as FAILED for SLA 
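The log above only reports that ShellMain exited with code 1; it does not say which command inside the script failed. One way to narrow this down is to have the script announce each failing command on stderr, which should then appear in the launcher job's logs (an assumption about this cluster's log setup). The following is a minimal sketch; `run_checked` is a hypothetical helper name, not something Oozie or Hadoop provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command and, if it fails, print which command
# failed and with what exit code to stderr, then pass the status on.
run_checked() {
    "$@"
    local status=$?
    if [ "$status" -ne 0 ]; then
        echo "validate: '$*' failed with exit code ${status}" >&2
    fi
    return "$status"
}

# Example use inside the validation script (path is a placeholder):
#   run_checked hadoop fs -rmr "$some_path" || exit 1
```

Wrapping the `sqoop eval`, `hadoop fs -cat`, and `hadoop fs -rmr` calls this way would turn a bare "exit code [1]" into a message naming the command that failed.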

My validation script uses these commands:

  • sqoop eval ...
  • hadoop fs -cat ...

Is there a compatibility issue I am overlooking here? Do I need to configure something differently?

My validation script (work in progress):

for table in ${tables[*]} 
do 

    #Get the number of records from DB Exadata 
    verifiedCount=$(sqoop eval --connect $3 --query "SELECT COUNT(*) FROM $4.${tables[table]} WHERE INTRVL_DT = To_Date('$5')" | awk '/([0-9]+)/{print $2}') 
    #echo "Total Number of Records " $verifiedCount 

    #Count the number of rows imported 


    totalRows=0 
    for ((i=0;i<$mapJobs;i++)) { 
     count[$i]=$(hadoop fs -cat $6$7/${tables[table]}/$8/$9/$10/part-m-0000$i | wc -l) 
     totalRows=$((totalRows + ${count[$i]})) 
     #if value has 0 lines, remove the file from edgenode to limit overhead 
     if [ ${count[$i]} -eq "0" ] 
      then 
       hadoop fs -rmr $6$7/${tables[table]}/$8/$9/$10/part-m-0000$i 
       echo "Removing..." 
      fi 
    } 
    #echo values 
    if [ "$totalRows" -eq "$verifiedCount" ] 
    then 
     echo "evaluation=true" 
     evaluation=true 
    else 
     echo "evaluation=false" 
     evaluation=false 
     exit 40 
    fi 
done 
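A side note on the script above, separate from the failure being asked about: bash parses `$10` as `$1` followed by a literal `0`, so the tenth positional parameter has to be written `${10}`. The small demonstration below uses a hypothetical function name, `show_args`, purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical demo function: shows how $10 and ${10} expand differently.
show_args() {
    echo "unbraced: $10"    # expands to "$1" followed by the character 0
    echo "braced: ${10}"    # expands to the tenth argument
}

show_args a b c d e f g h i j
# unbraced: a0
# braced: j
```

If the script is really called with ten or more arguments, the HDFS paths built from `$10` would end in the first argument plus `0` rather than the tenth argument.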

Answer

Brute-force debugging led me to the following line as the source of the error:

hadoop fs -rmr $6$7/${tables[table]}/$8/$9/$10/part-m-0000$i 

Specifically, something about -rmr does not sit well with the environment and caused the script to crash.
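The answer does not say why -rmr failed in this environment. One relevant fact is that `hadoop fs -rmr` is deprecated in Hadoop 2 in favor of `hadoop fs -rm -r`, and a plain `hadoop fs -rm` is enough for a single file; whether that deprecation is the actual cause here is an assumption. A possible rewrite of the counting loop along those lines is sketched below. The function names `count_rows` and `prune_and_count` are hypothetical, and the sketch keeps the question's assumption that part files are named part-m-0000N with fewer than ten map tasks.

```shell
#!/usr/bin/env bash
# Sketch only: count_rows and prune_and_count are hypothetical helper names.

# Print the number of lines in one HDFS file.
count_rows() {
    hadoop fs -cat "$1" | wc -l
}

# Sum the rows of the part-m-0000<i> files under a directory and delete
# the files that turn out to be empty. Prints the total on stdout.
#   $1 = HDFS directory, $2 = number of map tasks (assumed < 10)
prune_and_count() {
    local dir=$1 mappers=$2 total=0 i rows part
    for ((i = 0; i < mappers; i++)); do
        part="${dir}/part-m-0000${i}"
        rows=$(count_rows "$part") || return 1
        total=$((total + rows))
        if (( rows == 0 )); then
            # Plain -rm replaces the deprecated -rmr for a single file;
            # its output goes to stderr so stdout carries only the total.
            hadoop fs -rm "$part" >&2 || return 1
        fi
    done
    echo "$total"
}

# Example use (directory and count are placeholders):
#   totalRows=$(prune_and_count "$partition_dir" "$mapJobs") || exit 40
```

Quoting the path and building it once per iteration also avoids repeating the long `$6$7/...` expression in both the `-cat` and the delete call.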