2016-03-18

I want to understand how to connect to HDFS (running on AWS EMR) from Java code on my local machine.

My sample program:

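import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EMRConnection {

    public static void main(String[] args) throws IOException, URISyntaxException {
        Configuration config = new Configuration();
        // Connect to the EMR master node (host masked); port 50070 is what was tried here.
        FileSystem hdfs = FileSystem.get(new URI("hdfs://***-**-**-***-***.compute-1.amazonaws.com:50070"), config);
        // Create a test directory on HDFS.
        hdfs.mkdirs(new Path("/user/test/"));
    }
}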

I have verified that EMR is authorized to accept connections from my IP. I get the following exception:

Exception in thread "main" java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "{xyz}"; destination host is: "ec2-(xyz...).amazonaws.com":50070; 
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) 
at org.apache.hadoop.ipc.Client.call(Client.java:1351) 
at org.apache.hadoop.ipc.Client.call(Client.java:1300) 
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) 
at com.sun.proxy.$Proxy7.mkdirs(Unknown Source) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) 
at java.lang.reflect.Method.invoke(Unknown Source) 
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) 
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) 
at com.sun.proxy.$Proxy7.mkdirs(Unknown Source) 
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:467) 
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2394) 
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2365) 
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:817) 
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:813) 
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) 
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:813) 
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:806) 
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1933) 
at EMRConnection.main(EMRConnection.java:16) 
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag. 
    at com.google.protobuf.InvalidProtocolBufferException.invalidEndTag(InvalidProtocolBufferException.java:94) 
    at com.google.protobuf.CodedInputStream.checkLastTagWas(CodedInputStream.java:124) 
    at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:202) 
    at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241) 
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253) 
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259) 
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49) 
    at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364) 
    at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:996) 
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891) 

Could someone please tell me how I should connect? Am I using the wrong IP or port?
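A note on the exception above: in Hadoop 2.x, port 50070 is by default the NameNode's HTTP web UI rather than its RPC endpoint, and an RPC client that receives an HTTP reply typically fails with this kind of protobuf parsing error. One way to check what is listening on a port is to send a plain HTTP request and see whether the reply starts with `HTTP/`. The sketch below does that with the JDK alone; the class name `HttpPortCheck`, the method name, and the timeout are illustrative assumptions and not part of Hadoop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Hadoop): checks whether a port answers like an HTTP server.
final class HttpPortCheck {

    // Returns true if the service at host:port replies to a plain HTTP request
    // with data starting with "HTTP/"; returns false on any I/O error or timeout.
    static boolean speaksHttp(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis);

            OutputStream out = socket.getOutputStream();
            out.write("GET / HTTP/1.0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // Read the first five bytes of the reply, if there are that many.
            byte[] prefix = new byte[5];
            InputStream in = socket.getInputStream();
            int total = 0;
            while (total < prefix.length) {
                int n = in.read(prefix, total, prefix.length - total);
                if (n < 0) {
                    break; // peer closed the connection early
                }
                total += n;
            }
            return total == prefix.length
                    && new String(prefix, StandardCharsets.US_ASCII).equals("HTTP/");
        } catch (IOException e) {
            return false;
        }
    }

    // Expects two arguments: host and port.
    public static void main(String[] args) {
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        System.out.println(host + ":" + port + " speaks HTTP: " + speaksHttp(host, port, 3000));
    }
}
```

If the port used in an `hdfs://` URI answers as HTTP, it is probably the web UI port and not the one the HDFS client needs.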

I found that the port should be 8020. With that change I can create folders, but when I try to write a file it throws this exception:

could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation. 
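A commonly reported cause of this error, when the client runs outside the cluster, is that the client can reach the NameNode but not the DataNode that the NameNode hands back (for example, because the DataNode's data-transfer port is not open to the client's IP). That is only a guess about the setup described here. The sketch below tests raw TCP reachability with the JDK alone; `PortProbe` is a hypothetical helper, the hosts are supplied by the caller, and 50010 is assumed to be the DataNode transfer port (the Hadoop 2.x default), which may differ on a given cluster.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hadoop): plain TCP reachability check.
final class PortProbe {

    // Returns true if a TCP connection to host:port can be opened within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    // Expects two arguments: the NameNode host and a DataNode host.
    public static void main(String[] args) {
        String nameNodeHost = args[0];
        String dataNodeHost = args[1];
        // 8020 is the port reported to work in this thread.
        System.out.println("NameNode RPC (8020): " + canConnect(nameNodeHost, 8020, 3000));
        // 50010 is an assumption (Hadoop 2.x default); check the cluster's configuration.
        System.out.println("DataNode transfer (50010, assumed): " + canConnect(dataNodeHost, 50010, 3000));
    }
}
```

If the first probe succeeds and the second fails, the security group rule for the DataNode port, or the address the NameNode advertises for the DataNode, would be the next thing to look at. The HDFS client setting `dfs.client.use.datanode.hostname` is often suggested for the latter case.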

The EMR connection is working now, but if I try to write a file it says – Sam

Answer


It worked when I used port 8020. Sharing this in case others run into the same situation.
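As far as this thread shows, the fix is only a change of port in the `hdfs://` URI, from 50070 to 8020. A minimal sketch of that change using only `java.net.URI` follows; `NameNodeUri`, `withRpcPort`, and the example host are hypothetical names used for illustration, and 8020 is taken from this thread and not from any particular cluster's configuration.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper (not part of Hadoop): swaps the port of an hdfs:// URI.
final class NameNodeUri {

    // Port reported to work in this thread; a cluster may be configured differently.
    static final int RPC_PORT = 8020;

    // Returns a copy of the given URI with its port replaced by RPC_PORT.
    static URI withRpcPort(URI original) throws URISyntaxException {
        return new URI(original.getScheme(), original.getUserInfo(), original.getHost(),
                RPC_PORT, original.getPath(), original.getQuery(), original.getFragment());
    }

    public static void main(String[] args) throws URISyntaxException {
        // Placeholder host; the real EMR master host name goes here.
        URI fromQuestion = new URI("hdfs://master.example.invalid:50070");
        System.out.println(withRpcPort(fromQuestion));
    }
}
```

The resulting URI would then be passed to `FileSystem.get(uri, config)` as in the question's program.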