2017-09-25 139 views

Python Spark Streaming output

I am trying to run this program, but I do not see any output from the pprint statement on the console.

from __future__ import print_function

import sys

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: hdfs_wordcount.py <directory>", file=sys.stderr)
        sys.exit(-1)

    sc = SparkContext(appName="PythonStreamingHDFSWordCount")
    # Process the stream in 1-second batches.
    ssc = StreamingContext(sc, 1)

    # Watch the given directory; only files that appear after the
    # stream has started are read.
    lines = ssc.textFileStream(sys.argv[1])
    counts = lines.flatMap(lambda line: line.split(" "))\
        .map(lambda x: (x, 1))\
        .reduceByKey(lambda a, b: a + b)
    # Print the first elements of each batch to the console.
    counts.pprint()

    ssc.start()
    ssc.awaitTermination()

https://github.com/apache/spark/blob/master/examples/src/main/python/streaming/hdfs_wordcount.py

Answers
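One possible cause, offered as an assumption because the post does not say how files reach the directory: textFileStream only picks up files that show up in the watched directory after ssc.start(), and files that are moved or renamed into it in one step are detected more reliably than files written in place. The sketch below shows that pattern for a local directory. The helper name drop_file_atomically and its parameters are hypothetical and not part of Spark. For HDFS the same idea would be to upload to a staging path and then move the file, for example with hdfs dfs -put followed by hdfs dfs -mv.

```python
import os
import tempfile


def drop_file_atomically(watched_dir, name, text):
    """Write text to a staging file, then rename it into watched_dir.

    Hypothetical helper for illustration; assumes a local filesystem.
    """
    os.makedirs(watched_dir, exist_ok=True)
    # Stage next to the watched directory so both paths are on the same
    # filesystem, which lets os.replace perform a single rename.
    parent = os.path.dirname(os.path.abspath(watched_dir))
    staging_dir = tempfile.mkdtemp(dir=parent)
    staged = os.path.join(staging_dir, name)
    with open(staged, "w") as f:
        f.write(text)
    # The file becomes visible in watched_dir only once it is complete.
    target = os.path.join(watched_dir, name)
    os.replace(staged, target)
    os.rmdir(staging_dir)
    return target


if __name__ == "__main__":
    # Run this while the streaming job is already watching ./input
    # (the directory name is an assumption for this example).
    print(drop_file_atomically("input", "words-1.txt", "a b a\n"))
```

With the streaming job started first on the same directory, running this script should make the word counts appear in a later batch. Files that were already in the directory before the job started are not expected to be processed.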