2017-06-13

Spark: error when compressing and saving to a text file

I have a Scala Spark job. I want to compress the output with Gzip and then write it with saveAsTextFile:

compressedEvents.saveAsTextFile(outputDirectory, org.apache.hadoop.io.compress.GzipCodec) 

But I got the following error:

[error] /var/lib/jenkins/workspace/producer-data-test/producer-data-test-build/src/main/scala/IpFromLogs.scala:46: object org.apache.hadoop.io.compress.GzipCodec is not a value 
[error]  compressedEvents.saveAsTextFile(outputDirectory, org.apache.hadoop.io.compress.GzipCodec) 
[error]                      ^
[error] one error found 
[error] (compile:compileIncremental) Compilation failed 

I tried several variations of this, but none of them worked. Please help!

Answers

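GzipCodec is a class, not a value, so its bare name cannot be passed as an argument. The correct way to compress is to pass its Class object with classOf (with org.apache.hadoop.io.compress.GzipCodec imported):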

compressedEvents.saveAsTextFile(outputDirectory, classOf[GzipCodec]) 
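
For context, here is a minimal, self-contained sketch of that call, assuming Spark and Hadoop are on the classpath and the job runs in local mode. The object name GzipSaveSketch, the helper saveGzipped, the sample lines, and the temporary output path are all hypothetical and only serve the illustration.

```scala
import java.nio.file.Files

import org.apache.hadoop.io.compress.GzipCodec
import org.apache.spark.{SparkConf, SparkContext}

object GzipSaveSketch {
  // Writes the given lines as Gzip-compressed text under outputDirectory.
  // The directory must not exist yet: saveAsTextFile refuses to overwrite.
  def saveGzipped(sc: SparkContext, lines: Seq[String], outputDirectory: String): Unit = {
    val compressedEvents = sc.parallelize(lines)
    // classOf[GzipCodec] is a Class[GzipCodec] value, which is what the
    // two-argument overload of saveAsTextFile expects.
    compressedEvents.saveAsTextFile(outputDirectory, classOf[GzipCodec])
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("gzip-save-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical output location: a fresh subdirectory of a temp directory.
      val outputDirectory =
        Files.createTempDirectory("gzip-save-sketch").resolve("out").toString
      saveGzipped(sc, Seq("event-1", "event-2", "event-3"), outputDirectory)
      // textFile decompresses .gz part files transparently when reading back.
      sc.textFile(outputDirectory).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The part files in the output directory should carry a .gz extension, and reading them back with sc.textFile needs no extra configuration.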

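Alternatively, set the codec on the Hadoop configuration before saving (output compression may also have to be enabled by setting mapreduce.output.fileoutputformat.compress to true):

sc.hadoopConfiguration.setClass(FileOutputFormat.COMPRESS_CODEC, classOf[GzipCodec], classOf[CompressionCodec]) 

and then save without passing a codec: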

compressedEvents.saveAsTextFile(outputDirectory) 
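
The configuration-based variant can be sketched the same way. This is an illustration under assumptions rather than a definitive setup: it assumes Hadoop 2 or later, where FileOutputFormat is org.apache.hadoop.mapreduce.lib.output.FileOutputFormat, and it also sets FileOutputFormat.COMPRESS to true on the assumption that the codec is only consulted when compression is enabled. The names GzipConfigSketch and enableGzipOutput, the sample lines, and the temporary path are hypothetical.

```scala
import java.nio.file.Files

import org.apache.hadoop.io.compress.{CompressionCodec, GzipCodec}
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import org.apache.spark.{SparkConf, SparkContext}

object GzipConfigSketch {
  // Turns on Gzip compression for Hadoop file output written through this context.
  def enableGzipOutput(sc: SparkContext): Unit = {
    val hadoopConf = sc.hadoopConfiguration
    // Assumption: the codec is only consulted when compression is enabled,
    // so the boolean flag is set alongside the codec class.
    hadoopConf.setBoolean(FileOutputFormat.COMPRESS, true)
    hadoopConf.setClass(
      FileOutputFormat.COMPRESS_CODEC, classOf[GzipCodec], classOf[CompressionCodec])
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("gzip-config-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      enableGzipOutput(sc)
      val compressedEvents = sc.parallelize(Seq("event-1", "event-2", "event-3"))
      // Hypothetical output location: a fresh subdirectory of a temp directory.
      val outputDirectory =
        Files.createTempDirectory("gzip-config-sketch").resolve("out").toString
      // No codec argument here: the Hadoop configuration decides the compression.
      compressedEvents.saveAsTextFile(outputDirectory)
      println(s"Wrote part files under $outputDirectory")
    } finally {
      sc.stop()
    }
  }
}
```

Because the setting lives on the context's Hadoop configuration, it applies to every later save made through the same SparkContext, whereas passing classOf[GzipCodec] directly affects only that one call.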