2016-08-19

spark-submit on YARN did not distribute the jars to nm-local-dir

1. Versions: Spark 2.0.0, Scala 2.11.8, Java 1.8.0_91, Hadoop 2.7.2

2. Problem: when I submit a Scala program to Spark on YARN, it throws an exception:

Caused by: java.lang.IllegalStateException: Library directory '/opt/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1471514504287_0021/container_1471514504287_0021_01_000002/assembly/target/scala-2.11/jars' does not exist; make sure Spark is built. 

3. Command:

spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark collaborativeFilteringSpark.jar 

4. Full log:

16/08/19 11:07:35 INFO SparkContext: Running Spark version 2.0.0 
16/08/19 11:07:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
16/08/19 11:07:36 INFO SecurityManager: Changing view acls to: hadoop 
16/08/19 11:07:36 INFO SecurityManager: Changing modify acls to: hadoop 
16/08/19 11:07:36 INFO SecurityManager: Changing view acls groups to: 
16/08/19 11:07:36 INFO SecurityManager: Changing modify acls groups to: 
16/08/19 11:07:36 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set() 
16/08/19 11:07:36 INFO Utils: Successfully started service 'sparkDriver' on port 43981. 
16/08/19 11:07:36 INFO SparkEnv: Registering MapOutputTracker 
16/08/19 11:07:36 INFO SparkEnv: Registering BlockManagerMaster 
16/08/19 11:07:36 INFO DiskBlockManager: Created local directory at /opt/spark/blockmgr-57cf9a28-536c-4f03-83cc-c6a59cdeb825 
16/08/19 11:07:36 INFO MemoryStore: MemoryStore started with capacity 413.9 MB 
16/08/19 11:07:36 INFO SparkEnv: Registering OutputCommitCoordinator 
16/08/19 11:07:37 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
16/08/19 11:07:37 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.137.101:4040 
16/08/19 11:07:37 INFO SparkContext: Added JAR file:/home/hadoop/spark_program/scala/collaborativeFilteringSpark.jar at spark://192.168.137.101:43981/jars/collaborativeFilteringSpark.jar with timestamp 1471576057423 
16/08/19 11:07:38 INFO RMProxy: Connecting to ResourceManager at dev-01/192.168.137.101:8032 
16/08/19 11:07:38 INFO Client: Requesting a new application from cluster with 1 NodeManagers 
16/08/19 11:07:38 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container) 
16/08/19 11:07:38 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead 
16/08/19 11:07:38 INFO Client: Setting up container launch context for our AM 
16/08/19 11:07:38 INFO Client: Setting up the launch environment for our AM container 
16/08/19 11:07:38 INFO Client: Preparing resources for our AM container 
16/08/19 11:07:39 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME. 
16/08/19 11:07:40 INFO Client: Uploading resource file:/opt/spark/spark-e7da4489-d07e-4c42-aa50-be789ad1943e/__spark_libs__7265506257548877328.zip -> hdfs://dev-01:9000/user/hadoop/.sparkStaging/application_1471514504287_0021/__spark_libs__7265506257548877328.zip 
16/08/19 11:07:44 INFO Client: Uploading resource file:/opt/spark/spark-e7da4489-d07e-4c42-aa50-be789ad1943e/__spark_conf__3473502575984181564.zip -> hdfs://dev-01:9000/user/hadoop/.sparkStaging/application_1471514504287_0021/__spark_conf__.zip 
16/08/19 11:07:44 INFO SecurityManager: Changing view acls to: hadoop 
16/08/19 11:07:44 INFO SecurityManager: Changing modify acls to: hadoop 
16/08/19 11:07:44 INFO SecurityManager: Changing view acls groups to: 
16/08/19 11:07:44 INFO SecurityManager: Changing modify acls groups to: 
16/08/19 11:07:44 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set() 
16/08/19 11:07:44 INFO Client: Submitting application application_1471514504287_0021 to ResourceManager 
16/08/19 11:07:44 INFO YarnClientImpl: Submitted application application_1471514504287_0021 
16/08/19 11:07:44 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1471514504287_0021 and attemptId None 
16/08/19 11:07:45 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:45 INFO Client: 
    client token: N/A 
    diagnostics: N/A 
    ApplicationMaster host: N/A 
    ApplicationMaster RPC port: -1 
    queue: default 
    start time: 1471576064764 
    final status: UNDEFINED 
    tracking URL: http://dev-01:8088/proxy/application_1471514504287_0021/ 
    user: hadoop 
16/08/19 11:07:46 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:47 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:48 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:49 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:50 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:51 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:52 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:53 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:54 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:55 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null) 
16/08/19 11:07:55 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> dev-01, PROXY_URI_BASES -> http://dev-01:8088/proxy/application_1471514504287_0021), /proxy/application_1471514504287_0021 
16/08/19 11:07:55 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter 
16/08/19 11:07:55 INFO Client: Application report for application_1471514504287_0021 (state: ACCEPTED) 
16/08/19 11:07:56 INFO Client: Application report for application_1471514504287_0021 (state: RUNNING) 
16/08/19 11:07:56 INFO Client: 
    client token: N/A 
    diagnostics: N/A 
    ApplicationMaster host: 192.168.137.102 
    ApplicationMaster RPC port: 0 
    queue: default 
    start time: 1471576064764 
    final status: UNDEFINED 
    tracking URL: http://dev-01:8088/proxy/application_1471514504287_0021/ 
    user: hadoop 
16/08/19 11:07:56 INFO YarnClientSchedulerBackend: Application application_1471514504287_0021 has started running. 
16/08/19 11:07:56 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46171. 
16/08/19 11:07:56 INFO NettyBlockTransferService: Server created on 192.168.137.101:46171 
16/08/19 11:07:56 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.137.101, 46171) 
16/08/19 11:07:56 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.137.101:46171 with 413.9 MB RAM, BlockManagerId(driver, 192.168.137.101, 46171) 
16/08/19 11:07:56 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.137.101, 46171) 
16/08/19 11:08:03 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.137.102:42406) with ID 1 
16/08/19 11:08:03 INFO BlockManagerMasterEndpoint: Registering block manager dev-02:35791 with 413.9 MB RAM, BlockManagerId(1, dev-02, 35791) 
16/08/19 11:08:05 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.137.102:42410) with ID 2 
16/08/19 11:08:05 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8 
16/08/19 11:08:05 INFO BlockManagerMasterEndpoint: Registering block manager dev-02:37169 with 413.9 MB RAM, BlockManagerId(2, dev-02, 37169) 
16/08/19 11:08:06 INFO SparkContext: Starting job: foreach at CollaborativeFilteringSpark.scala:62 
16/08/19 11:08:06 INFO DAGScheduler: Got job 0 (foreach at CollaborativeFilteringSpark.scala:62) with 2 output partitions 
16/08/19 11:08:06 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at CollaborativeFilteringSpark.scala:62) 
16/08/19 11:08:06 INFO DAGScheduler: Parents of final stage: List() 
16/08/19 11:08:06 INFO DAGScheduler: Missing parents: List() 
16/08/19 11:08:06 INFO DAGScheduler: Submitting ResultStage 0 (ParallelCollectionRDD[0] at parallelize at CollaborativeFilteringSpark.scala:18), which has no missing parents 
16/08/19 11:08:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1432.0 B, free 413.9 MB) 
16/08/19 11:08:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1035.0 B, free 413.9 MB) 
16/08/19 11:08:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.137.101:46171 (size: 1035.0 B, free: 413.9 MB) 
16/08/19 11:08:06 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012 
16/08/19 11:08:06 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (ParallelCollectionRDD[0] at parallelize at CollaborativeFilteringSpark.scala:18) 
16/08/19 11:08:06 INFO YarnScheduler: Adding task set 0.0 with 2 tasks 
16/08/19 11:08:06 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, dev-02, partition 0, PROCESS_LOCAL, 5417 bytes) 
16/08/19 11:08:06 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, dev-02, partition 1, PROCESS_LOCAL, 5423 bytes) 
16/08/19 11:08:06 INFO YarnSchedulerBackend$YarnDriverEndpoint: Launching task 0 on executor id: 2 hostname: dev-02. 
16/08/19 11:08:06 INFO YarnSchedulerBackend$YarnDriverEndpoint: Launching task 1 on executor id: 1 hostname: dev-02. 
16/08/19 11:08:07 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on dev-02:37169 (size: 1035.0 B, free: 413.9 MB) 
16/08/19 11:08:07 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on dev-02:35791 (size: 1035.0 B, free: 413.9 MB) 
16/08/19 11:08:13 WARN TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1, dev-02): java.lang.ExceptionInInitializerError 
    at org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark$$anonfun$main$1.apply(CollaborativeFilteringSpark.scala:64) 
    at org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark$$anonfun$main$1.apply(CollaborativeFilteringSpark.scala:62) 
    at scala.collection.Iterator$class.foreach(Iterator.scala:893) 
    at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28) 
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:875) 
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1$$anonfun$apply$27.apply(RDD.scala:875) 
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1897) 
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1897) 
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70) 
    at org.apache.spark.scheduler.Task.run(Task.scala:85) 
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.IllegalStateException: Library directory '/opt/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1471514504287_0021/container_1471514504287_0021_01_000002/assembly/target/scala-2.11/jars' does not exist; make sure Spark is built. 
    at org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:248) 
    at org.apache.spark.launcher.CommandBuilderUtils.findJarsDir(CommandBuilderUtils.java:368) 
    at org.apache.spark.launcher.YarnCommandBuilderUtils$.findJarsDir(YarnCommandBuilderUtils.scala:38) 
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:500) 
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:834) 
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:167) 
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56) 
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:149) 
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:500) 
    at org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark$.<init>(CollaborativeFilteringSpark.scala:16) 
    at org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark$.<clinit>(CollaborativeFilteringSpark.scala) 
    ... 14 more 

16/08/19 11:08:13 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 2, dev-02, partition 1, PROCESS_LOCAL, 5423 bytes) 
16/08/19 11:08:13 INFO YarnSchedulerBackend$YarnDriverEndpoint: Launching task 2 on executor id: 1 hostname: dev-02. 
16/08/19 11:08:13 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on executor dev-02: java.lang.ExceptionInInitializerError (null) [duplicate 1] 
16/08/19 11:08:13 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 3, dev-02, partition 0, PROCESS_LOCAL, 5417 bytes) 
16/08/19 11:08:13 INFO YarnSchedulerBackend$YarnDriverEndpoint: Launching task 3 on executor id: 2 hostname: dev-02. 
16/08/19 11:08:14 WARN TransportChannelHandler: Exception in connection from /192.168.137.102:42406 
java.io.IOException: Connection reset by peer 
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) 
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) 
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) 
    at sun.nio.ch.IOUtil.read(IOUtil.java:192) 
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) 
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313) 
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881) 
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242) 
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) 
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) 
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) 
    at java.lang.Thread.run(Thread.java:745) 
16/08/19 11:08:14 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 1. 
16/08/19 11:08:14 INFO DAGScheduler: Executor lost: 1 (epoch 0) 
16/08/19 11:08:14 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster. 
16/08/19 11:08:14 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(1, dev-02, 35791) 
16/08/19 11:08:14 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor 
16/08/19 11:08:14 WARN TransportChannelHandler: Exception in connection from /192.168.137.102:42410 
java.io.IOException: Connection reset by peer 
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) 
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) 
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) 
    at sun.nio.ch.IOUtil.read(IOUtil.java:192) 
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) 
    at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:313) 
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881) 
    at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:242) 
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) 
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) 
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) 
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) 
    at java.lang.Thread.run(Thread.java:745) 
16/08/19 11:08:14 INFO YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 2. 
16/08/19 11:08:14 INFO DAGScheduler: Executor lost: 2 (epoch 1) 
16/08/19 11:08:14 INFO BlockManagerMasterEndpoint: Trying to remove executor 2 from BlockManagerMaster. 
16/08/19 11:08:14 INFO BlockManagerMasterEndpoint: Removing block manager BlockManagerId(2, dev-02, 37169) 
16/08/19 11:08:14 INFO BlockManagerMaster: Removed 2 successfully in removeExecutor 
16/08/19 11:08:14 WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_1471514504287_0021_01_000002 on host: dev-02. Exit status: 50. Diagnostics: Exception from container-launch. 
Container id: container_1471514504287_0021_01_000002 
Exit code: 50 
Stack trace: ExitCodeException exitCode=50: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) 
    at org.apache.hadoop.util.Shell.run(Shell.java:456) 
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) 
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212) 
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 


Container exited with a non-zero exit code 50 

16/08/19 11:08:14 ERROR YarnScheduler: Lost executor 1 on dev-02: Container marked as failed: container_1471514504287_0021_01_000002 on host: dev-02. Exit status: 50. Diagnostics: Exception from container-launch. 
Container id: container_1471514504287_0021_01_000002 
Exit code: 50 
Stack trace: ExitCodeException exitCode=50: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) 
    at org.apache.hadoop.util.Shell.run(Shell.java:456) 
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) 
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212) 
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 


Container exited with a non-zero exit code 50 

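Check that your 'spark-submit.sh' file references the jar files correctly, and check where the jar files are located. The error looks like it is caused by missing jar files.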

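All the jars are in the $SPARK_HOME/jars folder. I think the real question is why the jars were not distributed from HDFS to the NodeManager. The log shows that all the jars had already been uploaded to HDFS.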
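The log above warns that neither spark.yarn.jars nor spark.yarn.archive is set, so Spark falls back to zipping and uploading the libraries under SPARK_HOME on every submit. A minimal sketch of setting spark.yarn.jars explicitly is shown below. The HDFS directory `/spark/jars` and the helper name `build_submit_cmd` are assumptions made for illustration, and the jars would have to be copied to that location beforehand. This may not remove the exception by itself; it only takes the fallback upload out of the picture.

```shell
# Hypothetical HDFS location holding a copy of $SPARK_HOME/jars; adjust to your cluster.
SPARK_JARS_URI="hdfs://dev-01:9000/spark/jars/*"

# Print (rather than run) the submit command so it can be reviewed first.
# $1 is the value to use for spark.yarn.jars; it is quoted so the glob is not expanded locally.
build_submit_cmd() {
  printf '%s ' spark-submit --master yarn --deploy-mode cluster \
    --conf "spark.yarn.jars=$1" \
    --class org.apache.spark.mllib.learning.recommend.CollaborativeFilteringSpark \
    collaborativeFilteringSpark.jar
  printf '\n'
}

build_submit_cmd "$SPARK_JARS_URI"
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without Spark installed.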

Answers

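Make sure the SPARK_HOME environment variable is picked up correctly on the cluster. This kind of error occurs when spark-shell tries to locate the Spark libraries but cannot find them because SPARK_HOME is not set.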
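A quick way to check this on each node is a small shell helper like the sketch below. It is an illustration under stated assumptions, not part of the original answer: the function name `check_spark_home` is hypothetical, and it only verifies that the value is non-empty and that a `jars` directory exists under it, which is where the asker reports the jars are kept.

```shell
# Hypothetical helper: report whether a Spark home directory looks usable.
# $1 is the candidate Spark home (may be empty).
check_spark_home() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "SPARK_HOME is not set"
    return 1
  fi
  if [ ! -d "$dir/jars" ]; then
    echo "no jars directory under $dir"
    return 1
  fi
  echo "ok: $dir/jars"
}

# Check the current environment; "|| true" keeps a failed check from aborting the shell.
check_spark_home "${SPARK_HOME:-}" || true
```

Running it as the same user that launches the YARN containers (here `hadoop`) is probably the most useful, since a variable exported only in an interactive login shell may not be visible to the NodeManager.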