2016-12-24

I'm trying to use ADAM and Zeppelin for genome analysis. I'm not sure I'm doing this correctly, but I ran into the following problem with Spark, ADAM, and Zeppelin.

%dep 
z.reset() 
z.addRepo("Spark Packages Repo").url("http://dl.bintray.com/spark-packages/maven") 
z.load("com.databricks:spark-csv_2.10:1.2.0") 
z.load("mysql:mysql-connector-java:5.1.35") 
z.load("org.bdgenomics.adam:adam-core_2.10:0.20.0") 
z.load("org.bdgenomics.adam:adam-cli_2.10:0.20.0") 
z.load("org.bdgenomics.adam:adam-apis_2.10:0.20.0") 

%spark 

import org.bdgenomics.adam.rdd.ADAMContext._ 
import org.bdgenomics.adam.rdd.ADAMContext 
import org.bdgenomics.adam.projections.{ AlignmentRecordField, Projection } 
import org.apache.spark.SparkContext 
import org.apache.spark.SparkConf 
import scala.io.Source 
import org.apache.spark.rdd.RDD 
import org.bdgenomics.formats.avro.Genotype 
import scala.collection.JavaConverters._ 
import org.bdgenomics.formats.avro._ 
import org.apache.spark.SparkContext._ 
import org.apache.spark.mllib.linalg.{ Vector => MLVector, Vectors } 
import org.apache.spark.mllib.clustering.{ KMeans, KMeansModel } 

val ac = new ADAMContext(sc) 

When I run this, I get an error:

import org.bdgenomics.adam.rdd.ADAMContext._ 
import org.bdgenomics.adam.rdd.ADAMContext 
import org.bdgenomics.adam.projections.{AlignmentRecordField, Projection} 
import org.apache.spark.SparkContext 
import org.apache.spark.SparkConf 
import org.bdgenomics.adam.rdd.ADAMContext 
import org.bdgenomics.adam.rdd.ADAMContext._ 
import org.bdgenomics.adam.projections.Projection 
import org.bdgenomics.adam.projections.AlignmentRecordField 
import scala.io.Source 
import org.apache.spark.rdd.RDD 
import org.bdgenomics.formats.avro.Genotype 
import scala.collection.JavaConverters._ 
import org.bdgenomics.formats.avro._ 
import org.apache.spark.SparkContext._ 
import org.apache.spark.mllib.linalg.{Vector=>MLVector, Vectors} 
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel} 
res7: org.apache.spark.SparkContext = [email protected] 
<console>:188: error: constructor ADAMContext in class ADAMContext cannot be accessed in class $iwC 
       new ADAMContext(sc) 
           ^

Any ideas, based on the output above? Am I missing a dependency?

Answers


According to ADAMContext.scala in the version you are using, the constructor is private:

class ADAMContext private (@transient val sc: SparkContext) 
    extends Serializable with Logging { 
    ... 
} 

You can instead use it like this:

import org.bdgenomics.adam.rdd.ADAMContext._ 

val adamContext: ADAMContext = z.sc 

This uses the implicit conversion defined in the ADAMContext companion object:

object ADAMContext { 
    implicit def sparkContextToADAMContext(sc: SparkContext): ADAMContext = 
     new ADAMContext(sc) 
} 
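The pattern can be seen in isolation with a small, Spark-free sketch (the `Ctx` class and `stringToCtx` conversion here are hypothetical, purely for illustration): a class whose constructor is private can still be instantiated by its companion object, and an implicit conversion in the companion lets callers obtain an instance via a simple type ascription.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for ADAMContext: the constructor is private,
// so `new Ctx(...)` does not compile outside this file's companion.
class Ctx private (val name: String)

object Ctx {
  // Mirrors ADAMContext.sparkContextToADAMContext: the companion object
  // may call the private constructor even though outside code cannot.
  implicit def stringToCtx(s: String): Ctx = new Ctx(s)
}

object Demo extends App {
  // new Ctx("local")       // would not compile: constructor is private
  val ctx: Ctx = "local"    // the type ascription triggers stringToCtx
  println(ctx.name)         // prints "local"
}
```

This is why `val adamContext: ADAMContext = sc` works in the answer above: the expected type `ADAMContext` makes the compiler apply the implicit conversion, while `new ADAMContext(sc)` fails with the constructor-access error.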

I tried that, but the object seems to be null: `%spark val ac:ADAMContext = sc` gives `ac: org.bdgenomics.adam.rdd.ADAMContext = null` –


I got it working without going through `z`:

val ac:ADAMContext = sc 
val genotypes: RDD[Genotype] = ac.loadGenotypes("/tmp/ADAM2").rdd 

Output:

ac: org.bdgenomics.adam.rdd.ADAMContext = [email protected] 

genotypes: 
org.apache.spark.rdd.RDD[org.bdgenomics.formats.avro.Genotype] = MapPartitionsRDD[3] at map at ADAMContext.scala:207 

I had tried doing this at the adam-shell prompt and don't remember needing the implicit conversion, but that was with ADAM version 0.19.
