
Deeplearning4j with Spark: SparkDl4jMultiLayer evaluation with JavaRDD<DataSet>

I'm new to Spark and am currently trying to build a neural network using the deeplearning4j API. Training works fine, but I run into a problem during evaluation. I get the following error message from Spark:

18:16:16,206 ERROR ~ Exception in task 0.0 in stage 14.0 (TID 19) 
java.lang.IllegalStateException: Network did not have same number of parameters as the broadcasted set parameters at org.deeplearning4j.spark.impl.multilayer.evaluation.EvaluateFlatMapFunction.call(EvaluateFlatMapFunction.java:75) 

I can't seem to find the cause of the problem, and information on deeplearning4j with Spark is sparse. I basically took the structure from this example: https://github.com/deeplearning4j/dl4j-spark-cdh5-examples/blob/2de0324076fb422e2bdb926a095adb97c6d0e0ca/src/main/java/org/deeplearning4j/examples/mlp/IrisLocal.java

Here is my code:

public class DeepBeliefNetwork { 

private JavaRDD<DataSet> trainSet; 
private JavaRDD<DataSet> testSet; 

private int inputSize; 
private int numLab; 
private int batchSize; 
private int iterations; 
private int seed; 
private int listenerFreq; 
MultiLayerConfiguration conf; 
MultiLayerNetwork model; 
SparkDl4jMultiLayer sparkmodel; 
JavaSparkContext sc; 

MLLibUtil mllibUtil = new MLLibUtil(); 

public DeepBeliefNetwork(JavaSparkContext sc, JavaRDD<DataSet> trainSet, JavaRDD<DataSet> testSet, int numLab, 
     int batchSize, int iterations, int seed, int listenerFreq) { 

    this.trainSet = trainSet; 
    this.testSet = testSet; 
    this.numLab = numLab; 
    this.batchSize = batchSize; 
    this.iterations = iterations; 
    this.seed = seed; 
    this.listenerFreq = listenerFreq; 
    this.inputSize = testSet.first().numInputs(); 
    this.sc = sc; 



} 

public void build() { 
    System.out.println("input Size: " + inputSize); 
    System.out.println(trainSet.first().toString()); 
    System.out.println(testSet.first().toString()); 

    // Deep belief network: three stacked RBMs pretrained layer-wise, with a softmax output layer
    conf = new NeuralNetConfiguration.Builder().seed(seed) 
      .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue) 
      .gradientNormalizationThreshold(1.0).iterations(iterations).momentum(0.5) 
      .momentumAfter(Collections.singletonMap(3, 0.9)) 
      .optimizationAlgo(OptimizationAlgorithm.CONJUGATE_GRADIENT).list(4) 
      .layer(0, 
        new RBM.Builder().nIn(inputSize).nOut(500).weightInit(WeightInit.XAVIER) 
          .lossFunction(LossFunction.RMSE_XENT).visibleUnit(RBM.VisibleUnit.BINARY) 
          .hiddenUnit(RBM.HiddenUnit.BINARY).build()) 
      .layer(1, 
        new RBM.Builder().nIn(500).nOut(250).weightInit(WeightInit.XAVIER) 
          .lossFunction(LossFunction.RMSE_XENT).visibleUnit(RBM.VisibleUnit.BINARY) 
          .hiddenUnit(RBM.HiddenUnit.BINARY).build()) 
      .layer(2, 
        new RBM.Builder().nIn(250).nOut(200).weightInit(WeightInit.XAVIER) 
          .lossFunction(LossFunction.RMSE_XENT).visibleUnit(RBM.VisibleUnit.BINARY) 
          .hiddenUnit(RBM.HiddenUnit.BINARY).build()) 
      .layer(3, new OutputLayer.Builder(LossFunction.NEGATIVELOGLIKELIHOOD).activation("softmax").nIn(200) 
        .nOut(numLab).build()) 
      .pretrain(true).backprop(false).build(); 

} 

public void trainModel() { 

    model = new MultiLayerNetwork(conf); 
    model.init(); 
    model.setListeners(Collections.singletonList((IterationListener) new ScoreIterationListener(listenerFreq))); 

    // Create Spark multi layer network from configuration 

    sparkmodel = new SparkDl4jMultiLayer(sc.sc(), model); 
    sparkmodel.fitDataSet(trainSet); 

    // Evaluation 
    Evaluation evaluation = sparkmodel.evaluate(testSet); 
    System.out.println(evaluation.stats()); 
}
} 

Does anyone have a suggestion on how to handle my JavaRDD? I believe that's where the problem is.

Thanks a lot!

EDIT1

I'm using deeplearning4j version 0.4 rc.10 and Spark 1.5.0. Here's the stack trace:

11:03:53,088 ERROR ~ Exception in task 0.0 in stage 16.0 (TID 21) java.lang.IllegalStateException: Network did not have same number of parameters as the broadcasted set parameters 
at org.deeplearning4j.spark.impl.multilayer.evaluation.EvaluateFlatMapFunction.call(EvaluateFlatMapFunction.java:75) 
at org.deeplearning4j.spark.impl.multilayer.evaluation.EvaluateFlatMapFunction.call(EvaluateFlatMapFunction.java:41) 
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:156) 
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:156) 
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:706) 
at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:706) 
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) 
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:297) 
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264) 
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66) 
at org.apache.spark.scheduler.Task.run(Task.scala:88) 
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:745) 
Driver stacktrace: 
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1280) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1268) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1267) 
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) 
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) 
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1267) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697) 
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697) 
at scala.Option.foreach(Option.scala:236) 
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697) 
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1493) 
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1455) 
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1444) 
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) 
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567) 
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1813) 
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1933) 
at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:1003) 
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147) 
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108) 
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306) 
at org.apache.spark.rdd.RDD.reduce(RDD.scala:985) 
at org.apache.spark.api.java.JavaRDDLike$class.reduce(JavaRDDLike.scala:375) 
at org.apache.spark.api.java.AbstractJavaRDDLike.reduce(JavaRDDLike.scala:47) 
at org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer.evaluate(SparkDl4jMultiLayer.java:629) 
at org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer.evaluate(SparkDl4jMultiLayer.java:607) 
at org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer.evaluate(SparkDl4jMultiLayer.java:597) 
at deep.deepbeliefclassifier.DeepBeliefNetwork.trainModel(DeepBeliefNetwork.java:117) 
at deep.deepbeliefclassifier.DataInput.main(DataInput.java:105) 

Can you post the stack trace of where the exception comes from? – SpamBot


Thanks for your reply, I've posted it in the edit. – graffo


Could you try the latest version first? That's 0.5.0 –

Answer


Could you tell us which versions you're using and the like? I wrote some of the internals there. It sounds like the parameters aren't being sent over correctly.

The parameters are one long vector representing the model's coefficients.
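
For illustration, here is a minimal sketch of inspecting that vector through the MultiLayerNetwork API (the `conf` variable is assumed to be a MultiLayerConfiguration like the one built in the question):

    import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
    import org.nd4j.linalg.api.ndarray.INDArray;

    MultiLayerNetwork net = new MultiLayerNetwork(conf);
    net.init();
    // All weights and biases, flattened into one long row vector
    INDArray params = net.params();
    // Two networks built from the same configuration should agree on this
    // count; a mismatch is what triggers the exception above
    System.out.println("parameter count: " + params.length());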

Deeplearning4j scales out deep learning by parameter averaging that vector across the executors. Also make sure you're using Kryo; in the latest version we print a warning if it isn't enabled.
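
Enabling Kryo is a matter of two SparkConf settings plus the nd4j-kryo dependency. A minimal sketch, assuming the registrator class name documented on the deeplearning4j.org Spark page (it may differ between versions):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    SparkConf sparkConf = new SparkConf()
        .setAppName("dl4j-spark")
        // Kryo handles ND4J's off-heap arrays far more efficiently than
        // default Java serialization
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        // Registrator from the nd4j-kryo artifact; the class name here is
        // an assumption and may vary by version
        .set("spark.kryo.registrator", "org.nd4j.Nd4jRegistrator");
    JavaSparkContext sc = new JavaSparkContext(sparkConf);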


Thanks for the answer. I'll try Kryo, I hadn't heard of it before. So should I specify the number of parameters somehow? – graffo


No, you can't do that. FWIW, if you're still having problems, see our Spark page on deeplearning4j.org: https://deeplearning4j.org/spark –


Thanks Adam, I tried the new version and the problem no longer occurs. The new API also makes training with JavaRDDs easier. – graffo