Your help is greatly appreciated. The closest bug report I could find is https://issues.apache.org/jira/browse/SPARK-7837. Has anyone else seen this problem? If you can spot the error in the stack trace below and know a fix, please let me know.

saveAsParquetFile fails both with and without repartitioning

When I call either df.repartition(1).saveAsParquetFile() or df.saveAsParquetFile(), saving the DataFrame rows to a Parquet file fails; see the stack trace below:
Name: org.apache.spark.SparkException
Message: Job aborted.
StackTrace: org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.insert(commands.scala:166)
org.apache.spark.sql.sources.InsertIntoHadoopFsRelation.run(commands.scala:139)
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:57)
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:57)
org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:68)
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:88)
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:88)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:87)
org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:950)
org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:950)
org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:336)
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:144)
org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:135)
org.apache.spark.sql.DataFrame.saveAsParquetFile(DataFrame.scala:1508)
$line46.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:22)
$line46.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:27)
$line46.$read$$iwC$$iwC$$iwC.<init>(<console>:29)
$line46.$read$$iwC$$iwC.<init>(<console>:31)
$line46.$read$$iwC.<init>(<console>:33)
$line46.$read.<init>(<console>:35)
$line46.$read$.<init>(<console>:39)
$line46.$read$.<clinit>(<console>)
java.lang.J9VMInternals.initializeImpl(Native Method)
java.lang.J9VMInternals.initialize(J9VMInternals.java:235)
$line46.$eval$.<init>(<console>:7)
$line46.$eval$.<clinit>(<console>)
java.lang.J9VMInternals.initializeImpl(Native Method)
java.lang.J9VMInternals.initialize(J9VMInternals.java:235)
$line46.$eval.$print(<console>)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:95)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:56)
java.lang.reflect.Method.invoke(Method.java:620)
org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1338)
org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
com.ibm.spark.interpreter.ScalaInterpreter$$anonfun$interpretAddTask$1$$anonfun$apply$3.apply(ScalaInterpreter.scala:296)
com.ibm.spark.interpreter.ScalaInterpreter$$anonfun$interpretAddTask$1$$anonfun$apply$3.apply(ScalaInterpreter.scala:291)
com.ibm.spark.global.StreamState$.withStreams(StreamState.scala:80)
com.ibm.spark.interpreter.ScalaInterpreter$$anonfun$interpretAddTask$1.apply(ScalaInterpreter.scala:290)
com.ibm.spark.interpreter.ScalaInterpreter$$anonfun$interpretAddTask$1.apply(ScalaInterpreter.scala:290)
com.ibm.spark.utils.TaskManager$$anonfun$add$2$$anon$1.run(TaskManager.scala:123)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1157)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
java.lang.Thread.run(Thread.java:801)
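For context, the failing calls look roughly like this in a Spark 1.4-era spark-shell session. This is a sketch, not taken from the original post: the input path, output paths, and the way the DataFrame is created are placeholders.

```scala
// Hypothetical repro sketch (Spark 1.4.x spark-shell); sqlContext is
// provided by the shell, and all paths here are placeholders.
val df = sqlContext.read.json("/tmp/input.json")

// Both variants from the question hit the same
// "org.apache.spark.SparkException: Job aborted." at write time.
df.repartition(1).saveAsParquetFile("/tmp/out-repartitioned.parquet")
df.saveAsParquetFile("/tmp/out.parquet")

// saveAsParquetFile was deprecated in later 1.x releases in favor of
// the DataFrameWriter API, which takes the same path argument:
df.write.parquet("/tmp/out-writer.parquet")
```

Since the stack trace shows the failure inside InsertIntoHadoopFsRelation rather than in the deprecated wrapper, switching to df.write.parquet alone is unlikely to change the outcome; the sketch is only to pin down exactly which calls fail.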
I am getting the same error on the Bluemix Spark service. Were you able to resolve this?