任何人在法蘭克福使用s3使用hadoop/spark 1.6.0?使用S3(法蘭克福)和Spark
我試圖存儲在S3工作的結果,我的依賴性聲明如下:
"org.apache.spark" %% "spark-core" % "1.6.0" exclude("org.apache.hadoop", "hadoop-client"),
"org.apache.spark" %% "spark-sql" % "1.6.0",
"org.apache.hadoop" % "hadoop-client" % "2.7.2",
"org.apache.hadoop" % "hadoop-aws" % "2.7.2"
我已經設置了以下配置:
System.setProperty("com.amazonaws.services.s3.enableV4", "true")
sc.hadoopConfiguration.set("fs.s3a.endpoint", ""s3.eu-central-1.amazonaws.com")
當調用saveAsTextFile
我它啓動RDD,將所有內容保存在S3上。但是一段時間後,當它從_temporary
轉移到最終輸出後導致它產生的錯誤:
Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 403, AWS Service: Amazon S3, AWS Request ID: XXXXXXXXXXXXXXXX, AWS Error Code: SignatureDoesNotMatch, AWS Error Message: The request signature we calculated does not match the signature you provided. Check your key and signing method., S3 Extended Request ID: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at com.amazonaws.services.s3.AmazonS3Client.copyObject(AmazonS3Client.java:1507)
at com.amazonaws.services.s3.transfer.internal.CopyCallable.copyInOneChunk(CopyCallable.java:143)
at com.amazonaws.services.s3.transfer.internal.CopyCallable.call(CopyCallable.java:131)
at com.amazonaws.services.s3.transfer.internal.CopyMonitor.copy(CopyMonitor.java:189)
at com.amazonaws.services.s3.transfer.internal.CopyMonitor.call(CopyMonitor.java:134)
at com.amazonaws.services.s3.transfer.internal.CopyMonitor.call(CopyMonitor.java:46)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
如果我使用hadoop-client
從火花包它甚至沒有開始傳輸。錯誤隨機發生,有時會起作用,有時不起作用。
似乎與你的SSH密鑰的問題。你能檢查你使用的是正確的鑰匙嗎? – user1314742
數據開始在s3上保存,並在一段時間後出現錯誤。 – flaviotruzzi
@flaviotruzzi你解決了這個問題嗎? – pangpang