Ingesting data into Solr using Spark and Scala

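I want to ingest data into Solr using Scala and Spark, but my code is missing something. I took the code below from a Hortonworks tutorial. I am using Spark 1.6.2, Solr 5.2.1, and Scala 2.10.5.

Can anyone provide a working code snippet that successfully inserts data into Solr?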

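import com.lucidworks.spark.SolrSupport

val input_file = "hdfs:///tmp/your_text_file"
case class Person(id: Int, name: String)
// Parse each "id,name" line into a Person and convert the RDD to a DataFrame.
val people_df1 = sc.textFile(input_file).map(_.split(",")).map(p => Person(p(0).trim.toInt, p(1))).toDF()
// Map every row to a SolrInputDocument keyed by its id.
val docs = people_df1.map { doc =>
  val docx = SolrSupport.autoMapToSolrInputDoc(doc.getAs[Int]("id").toString, doc, null)
  docx.setField("scala_s", "supercool")
  docx.setField("name_s", doc.getAs[String]("name"))
  docx // return the document; without this line the block evaluates to Unit
}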

// The code below has some compilation problems, even though the jar file contains these functions.

SolrSupport.indexDocs("sandbox.hortonworks.com:2181","testsparksolr",10,docs)
val solrServer = com.lucidworks.spark.SolrSupport.getSolrServer("http://ambari.asiacell.com:2181")
solrServer.setDefaultCollection("testsparksolr")
solrServer.commit(false, false)

Thanks in advance.


2017-05-07 omer

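Have you tried spark-solr?

The main focus of that library is to provide a clean API for indexing documents into a Solr server, which is exactly your use case.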
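
A minimal sketch of what indexing through spark-solr's DataFrame writer might look like for the data in the question. It assumes a spark-solr release built for your Spark version is on the classpath, that the `solr` data source and its `zkhost` and `collection` options behave as described in the spark-solr README, and that the collection's schema accepts `id` and `name` fields. The ZooKeeper address and collection name are copied from the question; `IndexPeopleToSolr` and `parsePerson` are hypothetical names used only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SaveMode}
import scala.util.Try

// Top-level case class so Spark can derive the DataFrame schema from it.
case class Person(id: Int, name: String)

object IndexPeopleToSolr {

  // Hypothetical helper: parse one "id,name" line.
  // Returns None for malformed lines instead of throwing.
  def parsePerson(line: String): Option[Person] = {
    val parts = line.split(",", 2)
    if (parts.length == 2) Try(Person(parts(0).trim.toInt, parts(1).trim)).toOption
    else None
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("IndexPeopleToSolr"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // enables .toDF() on RDDs of case classes

    // Same input path as in the question; malformed lines are skipped.
    val people = sc.textFile("hdfs:///tmp/your_text_file")
      .flatMap(line => parsePerson(line).toList)
      .toDF()

    // Assumption: option names as documented in the spark-solr README.
    val options = Map(
      "zkhost" -> "sandbox.hortonworks.com:2181",
      "collection" -> "testsparksolr"
    )

    // Write the DataFrame to Solr through the spark-solr data source.
    people.write.format("solr").options(options).mode(SaveMode.Overwrite).save()

    sc.stop()
  }
}
```

One way to run this would be `spark-submit` with the spark-solr jar passed via `--jars`. Depending on the spark-solr version and the collection's commit settings, the documents may not be searchable until a commit happens, so the explicit commit from the question might still be needed.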


2017-05-07 20:36:47 Zouzias
