2016-04-28 180 views

Spark SQL and Cassandra: deleting records

Is there a way to delete some records based on a select query?

I have this query:

Select min(id) from ID having count(*)>1

It shows the duplicates. I need to get these IDs and delete them. How can I do this in Spark SQL?

Answers


Spark SQL does not support DELETE.

import scala.collection.JavaConverters._
import com.datastax.driver.core.{BatchStatement, Cluster}
import com.datastax.driver.core.querybuilder.QueryBuilder

// host_ip, keyspace and table are placeholders for your own values
val cluster = Cluster.builder().addContactPoint(host_ip).build()
val session = cluster.connect(keyspace)

val idsToDelete = ... // perform your query and collect the ids

// Build one DELETE statement per id
val queries = idsToDelete.map { id =>
  QueryBuilder.delete().from(keyspace, table).where(QueryBuilder.eq("id", id))
}
// Group the deletes into a single batch and execute it
val batch = new BatchStatement().addAll(queries.asJava)
session.execute(batch)

cluster.close()

If the number of IDs to delete is small, you can use the Cassandra driver directly, as in the code above, instead of going through Spark.
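
For the step of collecting the IDs, here is a rough sketch of one way it might be done with the DataFrame API. It assumes the Spark Cassandra Connector is on the classpath and a Spark version that provides SparkSession. The names `my_keyspace`, `my_table` and `dup_key` are hypothetical: the question does not say which column defines a duplicate, so `dup_key` stands in for it.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, lit, min}

val spark = SparkSession.builder().appName("find-duplicates").getOrCreate()

// Read the Cassandra table as a DataFrame (keyspace and table names are hypothetical)
val df = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_keyspace", "table" -> "my_table"))
  .load()

// For every value of the (hypothetical) dup_key column that occurs more than once,
// take the smallest id, mirroring min(id) ... having count(*) > 1 from the question
val idsToDelete: Seq[Any] = df
  .groupBy("dup_key")
  .agg(min("id").as("min_id"), count(lit(1)).as("n"))
  .filter("n > 1")
  .select("min_id")
  .collect()          // brings the ids to the driver; only sensible for a small result
  .map(_.get(0))
  .toSeq
```

Note that this removes only one row per duplicate group per run, since only the minimum id of each group is collected, exactly as in the original query.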