Spark Kafka receiver does not pick up data from all partitions

I created a Kafka topic with 5 partitions, and I am consuming it with the createStream receiver API as shown below. Somehow only one receiver is getting the input data; the remaining receivers are not processing anything. Can you help?
JavaPairDStream<String, String> messages = null;
if (sparkStreamCount > 0) {
    // We create an input DStream for each partition of the topic, unify those streams, and then repartition the unified stream.
    List<JavaPairDStream<String, String>> kafkaStreams = new ArrayList<JavaPairDStream<String, String>>(sparkStreamCount);
    for (int i = 0; i < sparkStreamCount; i++) {
        kafkaStreams.add(KafkaUtils.createStream(jssc, contextVal.getString(KAFKA_ZOOKEEPER), contextVal.getString(KAFKA_GROUP_ID), kafkaTopicMap));
    }
    messages = jssc.union(kafkaStreams.get(0), kafkaStreams.subList(1, kafkaStreams.size()));
} else {
    messages = KafkaUtils.createStream(jssc, contextVal.getString(KAFKA_ZOOKEEPER), contextVal.getString(KAFKA_GROUP_ID), kafkaTopicMap);
}
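One frequent cause of a single receiver getting all the data is that the same kafkaTopicMap (topic name mapped to the full thread count) is passed to every createStream call. A common alternative is to give each of the N receivers its own topic map requesting exactly one consumer thread. Below is a minimal sketch of just that map-building step (pure Java, no Spark dependency; the class name, method name, and topic string are illustrative, not from the original code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerReceiverMaps {
    // Build one topic map per receiver, each requesting exactly one
    // consumer thread, instead of one shared map with all threads.
    static List<Map<String, Integer>> build(String topic, int receivers) {
        List<Map<String, Integer>> maps = new ArrayList<Map<String, Integer>>(receivers);
        for (int i = 0; i < receivers; i++) {
            Map<String, Integer> m = new HashMap<String, Integer>();
            m.put(topic, 1); // one consumer thread per receiver
            maps.add(m);
        }
        return maps;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> maps = build("myTopic", 5);
        System.out.println(maps.size() + " receiver maps, each with "
                + maps.get(0).get("myTopic") + " thread");
    }
}
```

Each map would then be passed to its own KafkaUtils.createStream call inside the loop, and the resulting DStreams unioned as in the code above.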
After making the changes, I am getting the following exception:
INFO : org.apache.spark.streaming.kafka.KafkaReceiver - Connected to localhost:2181
INFO : org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Stopping receiver with message: Error starting receiver 0: java.lang.AssertionError: assertion failed
INFO : org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Called receiver onStop
INFO : org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Deregistering receiver 0
ERROR: org.apache.spark.streaming.scheduler.ReceiverTracker - Deregistered receiver for stream 0: Error starting receiver 0 - java.lang.AssertionError: assertion failed
at scala.Predef$.assert(Predef.scala:165)
at kafka.consumer.TopicCount$$anonfun$makeConsumerThreadIdsPerTopic$2.apply(TopicCount.scala:36)
at kafka.consumer.TopicCount$$anonfun$makeConsumerThreadIdsPerTopic$2.apply(TopicCount.scala:34)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.consumer.TopicCount$class.makeConsumerThreadIdsPerTopic(TopicCount.scala:34)
at kafka.consumer.StaticTopicCount.makeConsumerThreadIdsPerTopic(TopicCount.scala:100)
at kafka.consumer.StaticTopicCount.getConsumerThreadIdsPerTopic(TopicCount.scala:104)
at kafka.consumer.ZookeeperConsumerConnector.consume(ZookeeperConsumerConnector.scala:198)
at kafka.consumer.ZookeeperConsumerConnector.createMessageStreams(ZookeeperConsumerConnector.scala:138)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:111)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:542)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:532)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1986)
at org.apache.spark.SparkContext$$anonfun$37.apply(SparkContext.scala:1986)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
INFO : org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Stopped receiver 0
INFO : org.apache.spark.streaming.receiver.BlockGenerator - Stopping BlockGenerator
INFO : org.apache.spark.streaming.util.RecurringTimer - Stopped timer for BlockGenerator after time 1473964037200
INFO : org.apache.spark.streaming.receiver.BlockGenerator - Waiting for block pushing thread to terminate
INFO : org.apache.spark.streaming.receiver.BlockGenerator - Pushing out the last 0 blocks
INFO : org.apache.spark.streaming.receiver.BlockGenerator - Stopped block pushing thread
INFO : org.apache.spark.streaming.receiver.BlockGenerator - Stopped BlockGenerator
INFO : org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Waiting for receiver to be stopped
ERROR: org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Stopped receiver with error: java.lang.AssertionError: assertion failed
ERROR: org.apache.spark.executor.Executor - Exception in task 0.0 in stage 29.0
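The stack trace points at Kafka's TopicCount.makeConsumerThreadIdsPerTopic, which asserts that every topic in the consumer's topic map requests a positive number of threads. So if the map handed to createStream ends up containing a zero or negative count for the topic, the receiver fails at startup with exactly this "assertion failed" error. A minimal sketch of that check (the class and method names here are illustrative, mirroring the assertion rather than calling Kafka itself):

```java
import java.util.HashMap;
import java.util.Map;

public class TopicMapCheck {
    // Mirrors the assertion in kafka.consumer.TopicCount.makeConsumerThreadIdsPerTopic:
    // every topic must map to a positive consumer-thread count.
    static void validate(Map<String, Integer> topicCountMap) {
        for (Map.Entry<String, Integer> e : topicCountMap.entrySet()) {
            if (e.getValue() == null || e.getValue() <= 0) {
                throw new AssertionError("assertion failed: topic " + e.getKey()
                        + " has non-positive thread count " + e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> bad = new HashMap<String, Integer>();
        bad.put("myTopic", 0); // e.g. a thread count accidentally set to zero
        try {
            validate(bad);
        } catch (AssertionError ae) {
            System.out.println("rejected: " + ae.getMessage());
        }

        Map<String, Integer> good = new HashMap<String, Integer>();
        good.put("myTopic", 1); // one consumer thread: passes the check
        validate(good);
        System.out.println("accepted");
    }
}
```

Checking what value actually goes into kafkaTopicMap before each createStream call is therefore a good first debugging step for this error.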
Thanks a lot, Hokam, for your response. I made the changes you suggested, but now it does not process anything and just gives the following messages: – Alchemist
INFO: org.apache.spark.streaming.receiver.BlockGenerator - Pushed block input-2-1473954491600 INFO: org.apache.spark.storage.MemoryStore - ensureFreeSpace(213) called with curMem=487780, maxMem=556038881 INFO: org.apache.spark.storage.MemoryStore - Block input-2-1473954496800 stored as bytes in memory (estimated size 213.0 B, free 529.8 MB) INFO: org.apache.spark.storage.BlockManagerInfo - Added input-2-1473954496800 in memory on localhost:53678 (size: 213.0 B, free: 530.2 MB) WARN: org.apache.spark.storage.BlockManager - Block input-2-1473954496800 replicated to only 0 peer(s) – Alchemist
Map kafkaTopicMap = new HashMap(); kafkaTopicMap.put(contextVal.getString(KAFKA_HPD_TOPICS), sparkStreamCount); List<JavaPairDStream<String, String>> kafkaStreams = new ArrayList<JavaPairDStream<String, String>>(sparkStreamCount); for (int i = 0; i