
Spark job container exited with exitCode: -1000

I have been trying to run a sample job with Spark 2.0.0 in YARN cluster mode. The job exits with exitCode: -1000 without giving any other clue; the same job runs fine in local mode.

Spark command:

spark-submit \ 
--conf "spark.yarn.stagingDir=/xyz/warehouse/spark" \ 
--queue xyz \ 
--class com.xyz.TestJob \ 
--master yarn \ 
--deploy-mode cluster \ 
--conf "spark.local.dir=/xyz/warehouse/tmp" \ 
/xyzpath/java-test-1.0-SNAPSHOT.jar [email protected] 

TestJob class:

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class TestJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf();
        JavaSparkContext jsc = new JavaSparkContext(conf);
        System.out.println(
            "Total count: " +
                jsc.parallelize(Arrays.asList(new Integer[]{1, 2, 3, 4})).count());
        jsc.stop();
    }
}

Error log:

17/10/04 22:26:52 INFO Client: Application report for application_1506717704791_130756 (state: ACCEPTED) 
17/10/04 22:26:52 INFO Client: 
     client token: N/A 
     diagnostics: N/A 
     ApplicationMaster host: N/A 
     ApplicationMaster RPC port: -1 
     queue: root.xyz 
     start time: 1507181210893 
     final status: UNDEFINED 
     tracking URL: http://xyzserver:8088/proxy/application_1506717704791_130756/ 
     user: xyz 
17/10/04 22:26:53 INFO Client: Application report for application_1506717704791_130756 (state: ACCEPTED) 
17/10/04 22:26:54 INFO Client: Application report for application_1506717704791_130756 (state: ACCEPTED) 
17/10/04 22:26:55 INFO Client: Application report for application_1506717704791_130756 (state: ACCEPTED) 
17/10/04 22:26:56 INFO Client: Application report for application_1506717704791_130756 (state: FAILED) 
17/10/04 22:26:56 INFO Client: 
     client token: N/A 
     diagnostics: Application application_1506717704791_130756 failed 5 times due to AM Container for appattempt_1506717704791_130756_000005 exited with exitCode: -1000 
For more detailed output, check application tracking page:http://xyzserver:8088/cluster/app/application_1506717704791_130756Then, click on links to logs of each attempt. 
Diagnostics: Failing this attempt. Failing the application. 
     ApplicationMaster host: N/A 
     ApplicationMaster RPC port: -1 
     queue: root.xyz 
     start time: 1507181210893 
     final status: FAILED 
     tracking URL: http://xyzserver:8088/cluster/app/application_1506717704791_130756 
     user: xyz 
17/10/04 22:26:56 INFO Client: Deleted staging directory /xyz/spark/.sparkStaging/application_1506717704791_130756 
Exception in thread "main" org.apache.spark.SparkException: Application application_1506717704791_130756 finished with failed status 
     at org.apache.spark.deploy.yarn.Client.run(Client.scala:1167) 
     at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1213) 

When I browse to the page http://xyzserver:8088/cluster/app/application_1506717704791_130756 it does not exist.

Nothing was found in the YARN application logs:

$yarn logs -applicationId application_1506717704791_130756 
/apps/yarn/logs/xyz/logs/application_1506717704791_130756 does not have any log files. 

What could be the root cause of this error, and how can I get detailed error logs?


Your application never started running. Most likely it is a misconfiguration of Spark on YARN. Have you gone through this: https://spark.apache.org/docs/latest/running-on-yarn.html? – philantrovert


The problem was with one configuration parameter. When I removed it, the job started working. By the way, thanks for your comment. –

Answer


After spending almost an entire day on this, I found the root cause: when I removed spark.yarn.stagingDir the job started working. I still don't know why Spark was complaining about it.

Previous spark-submit:

spark-submit \ 
--conf "spark.yarn.stagingDir=/xyz/warehouse/spark" \ 
--queue xyz \ 
--class com.xyz.TestJob \ 
--master yarn \ 
--deploy-mode cluster \ 
--conf "spark.local.dir=/xyz/warehouse/tmp" \ 
/xyzpath/java-test-1.0-SNAPSHOT.jar [email protected] 

New spark-submit:

spark-submit \ 
--queue xyz \ 
--class com.xyz.TestJob \ 
--master yarn \ 
--deploy-mode cluster \ 
--conf "spark.local.dir=/xyz/warehouse/tmp" \ 
/xyzpath/java-test-1.0-SNAPSHOT.jar [email protected]
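
In YARN, exit code -1000 is generally reported for failures that happen before the container's process even starts, typically during resource localization. A plausible explanation (my assumption, not something confirmed by the logs above) is that the custom staging directory either did not exist on HDFS or was not writable by the submitting user, so the AM container could never be localized. A minimal sketch of how one might check that, assuming the staging path is meant to live on HDFS and reusing the paths and user from the question:

hdfs dfs -ls -d /xyz/warehouse/spark      # does the directory exist?
hdfs dfs -ls /xyz/warehouse/spark         # are its contents accessible to user xyz?
hdfs dfs -mkdir -p /xyz/warehouse/spark   # create it if it is missing
hdfs dfs -chown xyz /xyz/warehouse/spark  # make sure the submitting user owns it

# If the setting is kept, a fully qualified URI removes any ambiguity about which
# filesystem the path refers to (again an assumption, not a verified fix):
#   --conf "spark.yarn.stagingDir=hdfs:///xyz/warehouse/spark"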