4

Why do I need to add `fork in run := true` when running a Spark application with sbt?

I have built a simple Spark application using sbt. Here is my code:

import org.apache.spark.sql.SparkSession 

object HelloWorld { 
    def main(args: Array[String]): Unit = { 
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate() 

    import spark.implicits._ 

    val ds = Seq(1, 2, 3).toDS() 
    ds.map(_ + 1).foreach(x => println(x)) 
    } 
} 

Here is my build.sbt:

name := """sbt-sample-app""" 

version := "1.0" 

scalaVersion := "2.11.7" 

libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test" 
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.1" 

Now, when I try to do `sbt run`, it gives me the following error:

$ sbt run 
[info] Loading global plugins from /home/user/.sbt/0.13/plugins 
[info] Loading project definition from /home/user/Projects/sample-app/project 
[info] Set current project to sbt-sample-app (in build file:/home/user/Projects/sample-app/) 
[info] Running HelloWorld 
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
17/06/01 10:09:10 INFO SparkContext: Running Spark version 2.1.1 
17/06/01 10:09:11 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
17/06/01 10:09:11 WARN Utils: Your hostname, user-Vostro-15-3568 resolves to a loopback address: 127.0.1.1; using 127.0.0.1 instead (on interface enp3s0) 
17/06/01 10:09:11 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 
17/06/01 10:09:11 INFO SecurityManager: Changing view acls to: user 
17/06/01 10:09:11 INFO SecurityManager: Changing modify acls to: user 
17/06/01 10:09:11 INFO SecurityManager: Changing view acls groups to: 
17/06/01 10:09:11 INFO SecurityManager: Changing modify acls groups to: 
17/06/01 10:09:11 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user); groups with view permissions: Set(); users with modify permissions: Set(user); groups with modify permissions: Set() 
17/06/01 10:09:12 INFO Utils: Successfully started service 'sparkDriver' on port 39662. 
17/06/01 10:09:12 INFO SparkEnv: Registering MapOutputTracker 
17/06/01 10:09:12 INFO SparkEnv: Registering BlockManagerMaster 
17/06/01 10:09:12 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 
17/06/01 10:09:12 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 
17/06/01 10:09:12 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-c6db1535-6a00-4760-93dc-968722e3d596 
17/06/01 10:09:12 INFO MemoryStore: MemoryStore started with capacity 408.9 MB 
17/06/01 10:09:13 INFO SparkEnv: Registering OutputCommitCoordinator 
17/06/01 10:09:13 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
17/06/01 10:09:13 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://127.0.0.1:4040 
17/06/01 10:09:13 INFO Executor: Starting executor ID driver on host localhost 
17/06/01 10:09:13 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34488. 
17/06/01 10:09:13 INFO NettyBlockTransferService: Server created on 127.0.0.1:34488 
17/06/01 10:09:13 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 
17/06/01 10:09:13 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 34488, None) 
17/06/01 10:09:13 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:34488 with 408.9 MB RAM, BlockManagerId(driver, 127.0.0.1, 34488, None) 
17/06/01 10:09:13 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 34488, None) 
17/06/01 10:09:13 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 34488, None) 
17/06/01 10:09:14 INFO SharedState: Warehouse path is 'file:/home/user/Projects/sample-app/spark-warehouse'. 
[error] (run-main-0) scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter(
[error] parent = URLClassLoader with NativeCopyLoader with RawResources(
[error] urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ...,/home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar), 
[error] parent = [email protected], 
[error] resourceMap = Set(app.class.path, boot.class.path), 
[error] nativeTemp = /tmp/sbt_c2afce 
[error]) 
[error] root = [email protected] 
[error] cp = Set(/home/user/.ivy2/cache/org.glassfish.jersey.core/jersey-common/jars/jersey-common-2.22.2.jar, ..., /home/user/.ivy2/cache/net.razorvine/pyrolite/jars/pyrolite-4.13.jar) 
[error]) of type class sbt.classpath.ClasspathFilter with classpath [<unknown>] and parent being URLClassLoader with NativeCopyLoader with RawResources(
[error] urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar), 
[error] parent = [email protected], 
[error] resourceMap = Set(app.class.path, boot.class.path), 
[error] nativeTemp = /tmp/sbt_c2afce 
[error]) of type class sbt.classpath.ClasspathUtilities$$anon$1 with classpath [file:/home/user/Projects/sample-app/target/scala-2.11/classes/,...openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes] not found. 
scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter(
    parent = URLClassLoader with NativeCopyLoader with RawResources(
    urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar), 
    parent = [email protected], 
    resourceMap = Set(app.class.path, boot.class.path), 
    nativeTemp = /tmp/sbt_c2afce 
) 
    root = [email protected] 
    cp = Set(/home/user/.ivy2/cache/org.glassfish.jersey.core/jersey-common/jars/jersey-common-2.22.2.jar, ..., /home/user/.ivy2/cache/net.razorvine/pyrolite/jars/pyrolite-4.13.jar) 
) of type class sbt.classpath.ClasspathFilter with classpath [<unknown>] and parent being URLClassLoader with NativeCopyLoader with RawResources(
    urls = List(/home/user/Projects/sample-app/target/scala-2.11/classes, ..., /home/user/.ivy2/cache/org.apache.parquet/parquet-jackson/jars/parquet-jackson-1.8.1.jar), 
    parent = [email protected], 
    resourceMap = Set(app.class.path, boot.class.path), 
    nativeTemp = /tmp/sbt_c2afce 
) of type class sbt.classpath.ClasspathUtilities$$anon$1 with classpath [file:/home/user/Projects/sample-app/target/scala-2.11/classes/,.../jre/lib/charsets.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/jfr.jar:/usr/lib/jvm/java-8-openjdk-amd64/jre/classes] not found. 
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:123) 
    at scala.reflect.internal.Mirrors$RootsBase.staticClass(Mirrors.scala:22) 
    at org.apache.spark.sql.catalyst.ScalaReflection$$typecreator42$1.apply(ScalaReflection.scala:614) 
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe$lzycompute(TypeTags.scala:232) 
    at scala.reflect.api.TypeTags$WeakTypeTagImpl.tpe(TypeTags.scala:232) 
    at org.apache.spark.sql.catalyst.ScalaReflection$class.localTypeOf(ScalaReflection.scala:782) 
    at org.apache.spark.sql.catalyst.ScalaReflection$.localTypeOf(ScalaReflection.scala:39) 
    at org.apache.spark.sql.catalyst.ScalaReflection$.optionOfProductType(ScalaReflection.scala:614) 
    at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:51) 
    at org.apache.spark.sql.Encoders$.scalaInt(Encoders.scala:281) 
    at org.apache.spark.sql.SQLImplicits.newIntEncoder(SQLImplicits.scala:54) 
    at HelloWorld$.main(HelloWorld.scala:9) 
    at HelloWorld.main(HelloWorld.scala) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
[trace] Stack trace suppressed: run last compile:run for the full output. 
17/06/01 10:09:15 ERROR ContextCleaner: Error in cleaning thread 
java.lang.InterruptedException 
    at java.lang.Object.wait(Native Method) 
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143) 
    at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:181) 
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245) 
    at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:178) 
    at org.apache.spark.ContextCleaner$$anon$1.run(ContextCleaner.scala:73) 
17/06/01 10:09:15 ERROR Utils: uncaught error in thread SparkListenerBus, stopping SparkContext 
java.lang.InterruptedException 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998) 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) 
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:80) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79) 
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78) 
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77) 
17/06/01 10:09:15 ERROR Utils: throw uncaught fatal error in thread SparkListenerBus 
java.lang.InterruptedException 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:998) 
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304) 
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:312) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(LiveListenerBus.scala:80) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1$$anonfun$apply$mcV$sp$1.apply(LiveListenerBus.scala:79) 
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(LiveListenerBus.scala:78) 
    at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1245) 
    at org.apache.spark.scheduler.LiveListenerBus$$anon$1.run(LiveListenerBus.scala:77) 
17/06/01 10:09:15 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040 
java.lang.RuntimeException: Nonzero exit code: 1 
    at scala.sys.package$.error(package.scala:27) 
[trace] Stack trace suppressed: run last compile:run for the full output. 
[error] (compile:run) Nonzero exit code: 1 
[error] Total time: 7 s, completed 1 Jun, 2017 10:09:15 AM 

However, when I add `fork in run := true` to my build.sbt, the application runs fine.

build.sbt

name := """sbt-sample-app""" 

version := "1.0" 

scalaVersion := "2.11.7" 

libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test" 
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.1" 

fork in run := true 

Here is the output:

$ sbt run 
[info] Loading global plugins from /home/user/.sbt/0.13/plugins 
[info] Loading project definition from /home/user/Projects/sample-app/project 
[info] Set current project to sbt-sample-app (in build file:/home/user/Projects/sample-app/) 
[success] Total time: 0 s, completed 1 Jun, 2017 10:15:43 AM 
[info] Updating {file:/home/user/Projects/sample-app/}sample-app... 
[info] Resolving jline#jline;2.12.1 ... 
[info] Done updating. 
[warn] Scala version was updated by one of library dependencies: 
[warn] * org.scala-lang:scala-library:(2.11.7, 2.11.0) -> 2.11.8 
[warn] To force scalaVersion, add the following: 
[warn] ivyScala := ivyScala.value map { _.copy(overrideScalaVersion = true) } 
[warn] Run 'evicted' to see detailed eviction warnings 
[info] Compiling 1 Scala source to /home/user/Projects/sample-app/target/scala-2.11/classes... 
[info] Running HelloWorld 
[error] Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 
[error] 17/06/01 10:16:13 INFO SparkContext: Running Spark version 2.1.1 
[error] 17/06/01 10:16:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
[error] 17/06/01 10:16:14 WARN Utils: Your hostname, user-Vostro-15-3568 resolves to a loopback address: 127.0.1.1; using 127.0.0.1 instead (on interface enp3s0) 
[error] 17/06/01 10:16:14 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing view acls to: user 
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing modify acls to: user 
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing view acls groups to: 
[error] 17/06/01 10:16:14 INFO SecurityManager: Changing modify acls groups to: 
[error] 17/06/01 10:16:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user); groups with view permissions: Set(); users with modify permissions: Set(user); groups with modify permissions: Set() 
[error] 17/06/01 10:16:14 INFO Utils: Successfully started service 'sparkDriver' on port 37747. 
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering MapOutputTracker 
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering BlockManagerMaster 
[error] 17/06/01 10:16:14 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information 
[error] 17/06/01 10:16:14 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up 
[error] 17/06/01 10:16:14 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-edf40c39-a13e-4930-8e9a-64135bfa9770 
[error] 17/06/01 10:16:14 INFO MemoryStore: MemoryStore started with capacity 1405.2 MB 
[error] 17/06/01 10:16:14 INFO SparkEnv: Registering OutputCommitCoordinator 
[error] 17/06/01 10:16:14 INFO Utils: Successfully started service 'SparkUI' on port 4040. 
[error] 17/06/01 10:16:15 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://127.0.0.1:4040 
[error] 17/06/01 10:16:15 INFO Executor: Starting executor ID driver on host localhost 
[error] 17/06/01 10:16:15 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 39113. 
[error] 17/06/01 10:16:15 INFO NettyBlockTransferService: Server created on 127.0.0.1:39113 
[error] 17/06/01 10:16:15 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy 
[error] 17/06/01 10:16:15 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 127.0.0.1, 39113, None) 
[error] 17/06/01 10:16:15 INFO BlockManagerMasterEndpoint: Registering block manager 127.0.0.1:39113 with 1405.2 MB RAM, BlockManagerId(driver, 127.0.0.1, 39113, None) 
[error] 17/06/01 10:16:15 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 127.0.0.1, 39113, None) 
[error] 17/06/01 10:16:15 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 127.0.0.1, 39113, None) 
[error] 17/06/01 10:16:15 INFO SharedState: Warehouse path is 'file:/home/user/Projects/sample-app/spark-warehouse/'. 
[error] 17/06/01 10:16:18 INFO CodeGenerator: Code generated in 395.134683 ms 
[error] 17/06/01 10:16:19 INFO CodeGenerator: Code generated in 9.077969 ms 
[error] 17/06/01 10:16:19 INFO CodeGenerator: Code generated in 23.652705 ms 
[error] 17/06/01 10:16:19 INFO SparkContext: Starting job: foreach at HelloWorld.scala:10 
[error] 17/06/01 10:16:19 INFO DAGScheduler: Got job 0 (foreach at HelloWorld.scala:10) with 1 output partitions 
[error] 17/06/01 10:16:19 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at HelloWorld.scala:10) 
[error] 17/06/01 10:16:19 INFO DAGScheduler: Parents of final stage: List() 
[error] 17/06/01 10:16:19 INFO DAGScheduler: Missing parents: List() 
[error] 17/06/01 10:16:19 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at foreach at HelloWorld.scala:10), which has no missing parents 
[error] 17/06/01 10:16:20 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.3 KB, free 1405.2 MB) 
[error] 17/06/01 10:16:20 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.3 KB, free 1405.2 MB) 
[error] 17/06/01 10:16:20 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 127.0.0.1:39113 (size: 3.3 KB, free: 1405.2 MB) 
[error] 17/06/01 10:16:20 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:996 
[error] 17/06/01 10:16:20 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at foreach at HelloWorld.scala:10) 
[error] 17/06/01 10:16:20 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks 
[error] 17/06/01 10:16:20 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 6227 bytes) 
[error] 17/06/01 10:16:20 INFO Executor: Running task 0.0 in stage 0.0 (TID 0) 
[info] 2 
[info] 3 
[info] 4 
[error] 17/06/01 10:16:20 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1231 bytes result sent to driver 
[error] 17/06/01 10:16:20 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 152 ms on localhost (executor driver) (1/1) 
[error] 17/06/01 10:16:20 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
[error] 17/06/01 10:16:20 INFO DAGScheduler: ResultStage 0 (foreach at HelloWorld.scala:10) finished in 0.181 s 
[error] 17/06/01 10:16:20 INFO DAGScheduler: Job 0 finished: foreach at HelloWorld.scala:10, took 0.596960 s 
[error] 17/06/01 10:16:20 INFO SparkContext: Invoking stop() from shutdown hook 
[error] 17/06/01 10:16:20 INFO SparkUI: Stopped Spark web UI at http://127.0.0.1:4040 
[error] 17/06/01 10:16:20 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped! 
[error] 17/06/01 10:16:20 INFO MemoryStore: MemoryStore cleared 
[error] 17/06/01 10:16:20 INFO BlockManager: BlockManager stopped 
[error] 17/06/01 10:16:20 INFO BlockManagerMaster: BlockManagerMaster stopped 
[error] 17/06/01 10:16:20 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped! 
[error] 17/06/01 10:16:20 INFO SparkContext: Successfully stopped SparkContext 
[error] 17/06/01 10:16:20 INFO ShutdownHookManager: Shutdown hook called 
[error] 17/06/01 10:16:20 INFO ShutdownHookManager: Deleting directory /tmp/spark-77d00e78-9f76-4ab2-bc40-0b99940661ac 
[success] Total time: 37 s, completed 1 Jun, 2017 10:16:20 AM 

Can anyone help me understand the reason behind this?

+1

What version of sbt are you using? `sbt sbtVersion` will print the version. – marios

+0

@marios I am using sbt v0.13.13. – himanshuIIITian

Answers

6

An excerpt from "Getting Started with SBT for Scala" by Shiti Saxena:

Why do we need to fork JVM?

When a user runs code using the run or console commands, the code runs on the same virtual machine as SBT. In some cases, running the code may cause SBT to crash, such as a System.exit call or unterminated threads (for example, when running tests on code while simultaneously working on the code).

If a test causes the JVM to shut down, you would need to restart SBT. In order to avoid such scenarios, forking the JVM is important.

You do not need to fork the JVM to run your code if the code follows the constraints listed as follows, else it must be run in a forked JVM:

  • No threads are created or the program ends when user-created threads terminate on their own
  • System.exit is used to end the program and user-created threads terminate when interrupted
  • No deserialization is done or deserialization code ensures that the right class loader is used
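For reference, a minimal sketch of how forking is typically enabled and tuned in an sbt 0.13 build.sbt (the javaOptions line is an illustrative assumption, not something taken from the question):

// Run the application in a separate JVM instead of inside sbt's own JVM
fork in run := true

// Tests can be forked independently of run
fork in Test := true

// Illustrative only: extra JVM options for the forked process
javaOptions in run ++= Seq("-Xmx2G")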
+0

This is by far the best explanation for my query. – himanshuIIITian

+0

This worked to fix my problem with the deeply cryptic error: `[error] (run-main-0) scala.ScalaReflectionException: class scala.Option in JavaMirror with ClasspathFilter`. I *think* that since this solution works, Spark must be doing one of the three bullets, but I can't tell from the answer which one or why. Oddly, this need to fork isn't mentioned in the Spark Scala docs, or maybe I missed it. – FrobberOfBits

+0

Hey @FrobberOfBits, could you provide more context? Perhaps a code sample that causes the problem? It's hard to say what the cause might be from the information provided so far. – ZakukaZ

0

From the documentation given here:

By default, the run task runs in the same JVM as sbt. However, forking is required in some cases. Or, you may want to fork Java processes when implementing new tasks.

By default, a forked process uses the same Java and Scala versions being used for the build, and the working directory and JVM options of the current process. This page discusses how to enable and configure forking for both run and test tasks. Each kind of task may be configured separately by scoping the relevant keys, as described below.

To enable forking in run only, use:

fork in run := true 
+3

I don't feel this answers the question. –

+1

Thanks for the quick response! But I can't understand what forking has to do with Spark. I mean, forking isn't needed when running a usual Scala application. – himanshuIIITian

+0

SBT, Scala, Spark, Java. They are all the same: bytecode running on the JVM. Within a single JVM process, class loading is shared, and sbt does some tricks to make it possible for different versions to share the same classpath. That trick doesn't always work. A single JVM versus a forked JVM also has other issues, which can cause problems with handling IO and so on. – pedrofurla
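To illustrate the class-loader point in the comment above, here is a minimal sketch (an illustration under assumptions, not code from the question; it needs the scala-reflect module on the classpath) of the kind of lookup Spark's ScalaReflection performs. When the current class loader cannot see scala.Option, this is where the ScalaReflectionException comes from:

import scala.reflect.runtime.universe

object MirrorDemo {
  def main(args: Array[String]): Unit = {
    // Build a JavaMirror from a class loader and resolve a class by name,
    // roughly what Spark's encoder derivation does internally. Under sbt's
    // filtering class loader this lookup can fail with ScalaReflectionException.
    val mirror = universe.runtimeMirror(getClass.getClassLoader)
    println(mirror.staticClass("scala.Option"))
  }
}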

0

I couldn't find the exact reason:

But here is their build file with that suggestion:

https://github.com/deanwampler/spark-scala-tutorial/blob/master/project/Build.scala

Hope someone can give a better answer.

Edited code:

import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate()

    import spark.implicits._

    val ds = Seq(1, 2, 3).toDS()
    ds.map(_ + 1).foreach(x => println(x))
  }
}

build.sbt

name := """untitled""" 

version := "1.0" 

scalaVersion := "2.11.7" 

libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.6" % "test" 
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.1" 
+0

Thanks for this example! But in the build file mentioned above, the only comment I found was '// Better to run the examples and tests in separate JVMs. fork := true,' – himanshuIIITian

+0

If I had to take a shot at it, I would assume it's because Spark runs in separate threads which may or may not stop when the code does. So could you try spark.close when you're done? It should work then. –
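A minimal sketch of what that suggestion might look like (an assumption based on the comment above; spark.stop() is used here in place of the close the comment mentions, and whether this removes the need for fork is not confirmed):

import org.apache.spark.sql.SparkSession

object HelloWorld {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local").appName("BigApple").getOrCreate()
    import spark.implicits._

    val ds = Seq(1, 2, 3).toDS()
    ds.map(_ + 1).foreach(x => println(x))

    // Explicitly shut down the SparkSession so its non-daemon threads stop
    // before sbt's run task returns.
    spark.stop()
  }
}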

+0

Working without fork is what I mean. Can you confirm? –
