Running stream and batch environments in parallel with Flink

Does it make sense to run stream and batch processing in parallel in Flink?

//calculate median using DataSet (Batch Environment) 
BatchFunctions batch = new BatchFunctions(); 
DataSet<Tuple2<Double, Integer>> dataSet1 = batch.loadDataSetOfOctober2016(); 
double median = batch.getMedianReactionTime(dataSet1); 

// now use the calculated median in the DataStream (stream environment) 
StreamFunctions stream = new StreamFunctions(); 
DataStream<Tuple7<String, String, Integer, String, Date, String, List<Double>>> dataStream1 = stream.getKafkaStream(); 
stream.printPredictionForNextReactionTimeByMedians(dataStream1, median, Time.seconds(10)); 
stream.execute();

2016-11-23 lidox

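I would rather not do it this way. If your streaming job depends on the result of your batch job, you can compute the batch result ahead of time and put it in a queue or a database table, and the streaming job can read it from there. That way the streaming job does not need to be restarted when the batch result changes. A streaming job is more or less unbounded, whereas the batch result may change, because you may run it on different input.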
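A minimal sketch of that hand-off, in plain Java and without any Flink API, might look like the following. Everything named here (MedianHandoffSketch, MedianStore, median, predict) is hypothetical and only for illustration. The in-memory MedianStore stands in for the queue or database table, and the "prediction" simply echoes the latest median, which is an assumption about what the question's code does.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

public class MedianHandoffSketch {

    // Hypothetical stand-in for the queue or database table from the answer.
    // A real setup would use an external store that both jobs can reach.
    static final class MedianStore {
        private final AtomicReference<Double> current = new AtomicReference<>(Double.NaN);

        // Called by the batch side whenever a new median has been computed.
        void publish(double median) {
            current.set(median);
        }

        // Called by the stream side; returns NaN until something was published.
        double latest() {
            return current.get();
        }
    }

    // Batch side: median of a bounded set of reaction times.
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // Stream side: looks up the current median for every incoming event,
    // so a newly published value is picked up without a restart.
    static String predict(String eventId, MedianStore store) {
        return eventId + " -> predicted reaction time " + store.latest();
    }

    public static void main(String[] args) {
        MedianStore store = new MedianStore();

        // First batch run publishes its result.
        store.publish(median(new double[] {3.0, 1.0, 2.0}));
        System.out.println(predict("event-1", store));

        // A later batch run on different input replaces the value;
        // the stream side keeps running and sees the new median.
        store.publish(median(new double[] {4.0, 1.0, 3.0, 2.0}));
        System.out.println(predict("event-2", store));
    }
}
```

The point of the design is that the stream side reads the value on every event instead of receiving it once as a constant at start-up, as the question's code does with `median`.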

2016-11-24 01:55:11

OK, thanks for your input – lidox
