
I am trying to run a simple Spring Batch task on Spring Cloud Data Flow for YARN. Unfortunately, when I launch it I get the following error in the ResourceManager UI ("spring data flow yarn - unable to access jarfile"):

Application application_1473838120587_5156 failed 1 times due to AM Container for appattempt_1473838120587_5156_000001 exited with exitCode: 1 
For more detailed output, check application tracking page:http://ip-10-249-9-50.gc.stepstone.com:8088/cluster/app/application_1473838120587_5156Then, click on links to logs of each attempt. 
Diagnostics: Exception from container-launch. 
Container id: container_1473838120587_5156_01_000001 
Exit code: 1 
Stack trace: ExitCodeException exitCode=1: 
at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) 
at org.apache.hadoop.util.Shell.run(Shell.java:456) 
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) 
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212) 
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745) 
Container exited with a non-zero exit code 1 
Failing this attempt. Failing the application. 

More detail from Appmaster.stderr states:

Log Type: Appmaster.stderr 
Log Upload Time: Mon Nov 07 12:59:57 +0000 2016 
Log Length: 106 
Error: Unable to access jarfile spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.BUILD-SNAPSHOT.jar 

As far as Spring Cloud Data Flow is concerned, this is what I am trying to run in the dataflow-shell:

app register --type task --name simple_batch_job --uri https://github.com/spring-cloud/spring-cloud-dataflow-samples/raw/master/tasks/simple-batch-job/batch-job-1.0.0.BUILD-SNAPSHOT.jar 
task create foo --definition "simple_batch_job" 
task launch foo 
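
Before launching, it can also help to confirm from the same shell what actually got registered and defined; a quick sanity check using the standard shell commands (assuming a 1.0.x dataflow-shell):

app list
task list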

It is actually hard to tell why this error appears. I am sure the connection from the Data Flow server to YARN works, because during the standard HDFS localization some files (servers.yml, the jars with the job and utilities) are copied to /dataflow, yet somehow they cannot be accessed.
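
To see exactly what ends up staged there, a recursive listing of that directory is a quick check; a minimal sketch, assuming the hdfs client is available on the EMR master node:

# list everything the Data Flow server has copied for YARN localization
hdfs dfs -ls -R /dataflow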

My servers.yml configuration:

logging:
    level:
        org.apache.hadoop: DEBUG
        org.springframework.yarn: DEBUG
maven:
    remoteRepositories:
        springRepo:
            url: https://repo.spring.io/libs-snapshot
spring:
    main:
        show_banner: false
    hadoop:
        fsUri: hdfs://HOST:8020
        resourceManagerHost: HOST
        resourceManagerPort: 8032
        resourceManagerSchedulerAddress: HOST:8030
datasource:
    url: jdbc:h2:tcp://localhost:19092/mem:dataflow
    username: sa
    password:
    driverClassName: org.h2.Driver

I would be happy to hear any information or Spring YARN tips & tricks that would make this work.

PS: As the Hadoop environment I am using Amazon EMR 5.0.

EDIT: recursive listing from HDFS:

drwxrwxrwx - user hadoop   0 2016-11-07 15:02 /dataflow/apps 
drwxrwxrwx - user hadoop   0 2016-11-07 15:02 /dataflow/apps/stream 
drwxrwxrwx - user hadoop   0 2016-11-07 15:04 /dataflow/apps/stream/app 
-rwxrwxrwx 3 user hadoop  121 2016-11-07 15:05 /dataflow/apps/stream/app/application.properties 
-rwxrwxrwx 3 user hadoop  1177 2016-11-07 15:04 /dataflow/apps/stream/app/servers.yml 
-rwxrwxrwx 3 user hadoop 60202852 2016-11-07 15:04 /dataflow/apps/stream/app/spring-cloud-deployer-yarn-appdeployerappmaster-1.0.0.RELEASE.jar 
drwxrwxrwx - user hadoop   0 2016-11-04 14:22 /dataflow/apps/task 
drwxrwxrwx - user hadoop   0 2016-11-04 14:24 /dataflow/apps/task/app 
-rwxrwxrwx 3 user hadoop  121 2016-11-04 14:25 /dataflow/apps/task/app/application.properties 
-rwxrwxrwx 3 user hadoop  2101 2016-11-04 14:24 /dataflow/apps/task/app/servers.yml 
-rwxrwxrwx 3 user hadoop 60198804 2016-11-04 14:24 /dataflow/apps/task/app/spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.RELEASE.jar 
drwxrwxrwx - user hadoop   0 2016-11-04 14:25 /dataflow/artifacts 
drwxrwxrwx - user hadoop   0 2016-11-07 15:06 /dataflow/artifacts/cache 
-rwxrwxrwx 3 user hadoop 12323493 2016-11-04 14:25 /dataflow/artifacts/cache/https-c84ea9dc0103a4754aeb9a28bbc7a4f33c835854-batch-job-1.0.0.BUILD-SNAPSHOT.jar 
-rwxrwxrwx 3 user hadoop 22139318 2016-11-07 15:07 /dataflow/artifacts/cache/log-sink-rabbit-1.0.0.BUILD-SNAPSHOT.jar 
-rwxrwxrwx 3 user hadoop 12590921 2016-11-07 12:59 /dataflow/artifacts/cache/timestamp-task-1.0.0.BUILD-SNAPSHOT.jar 

Well, the first thing you could check is whether the /dataflow dir exists in hdfs and, if it does, list recursively which files are in there. If not, whether the user has permission to create that directory. –


There is a /dataflow directory in HDFS and all settings and jar files get copied there (including spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.BUILD-SNAPSHOT.jar). The dataflow directory has full access permissions (777). – Ragnar


Could you add a full recursive listing of the '/dataflow' directory to the question? It looks like the appmaster jar is not getting localized into a container, so something is wrong there, and those files in hdfs are the first suspect. –

Answer


There is clearly a wrong combination of versions: HDFS holds spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.RELEASE.jar while the error complains about spring-cloud-deployer-yarn-tasklauncherappmaster-1.0.0.BUILD-SNAPSHOT.jar.

Not sure how you would end up with a snapshot, unless you built the distribution manually?

I would recommend picking 1.0.2 from http://cloud.spring.io/spring-cloud-dataflow-server-yarn. See "Download and Extract Distribution" in the reference documentation. Also remove the old /dataflow directory from hdfs.
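
For the cleanup step, a minimal sketch, assuming the hdfs client is available on the EMR master (downloading and extracting the 1.0.2 distribution itself is covered by the "Download and Extract Distribution" section of the reference guide):

# remove the stale staging directory so the server re-copies appmaster jars matching its own version
hdfs dfs -rm -r -skipTrash /dataflow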