2017-09-25

HDFS not formatted, but no errors

I am setting up a Hadoop cluster on four nodes (three slaves), all separate EC2 instances inside a VPC. I roughly followed the steps here (but installed Hadoop 2.8.1 instead): http://arturmkrtchyan.com/how-to-setup-multi-node-hadoop-2-yarn-cluster

I formatted the namenode, which gave the following response:

$ hdfs namenode -format 
17/09/26 07:05:34 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************ 
STARTUP_MSG: Starting NameNode 
STARTUP_MSG: user = hduser 
STARTUP_MSG: host = ec2-xx-xx-xx-01.eu-central-1.compute.amazonaws.com/10.0.0.190 
STARTUP_MSG: args = [-format] 
STARTUP_MSG: version = 2.8.1 
STARTUP_MSG: classpath = /usr/... 

STARTUP_MSG: build = Unknown -r Unknown; compiled by 'hduser' on 2017-09-22T14:53Z 
STARTUP_MSG: java = 1.8.0_144 
************************************************************/ 
17/09/26 07:07:33 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT] 
17/09/26 07:07:33 INFO namenode.NameNode: createNameNode [-format] 
Formatting using clusterid: CID-15524170-7dfa-481b-add9-4c2542a55ca5 
17/09/26 07:07:33 INFO namenode.FSEditLog: Edit logging is async:false 
17/09/26 07:07:33 INFO namenode.FSNamesystem: KeyProvider: null 
17/09/26 07:07:33 INFO namenode.FSNamesystem: fsLock is fair: true 
17/09/26 07:07:33 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false 
17/09/26 07:07:33 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000 
17/09/26 07:07:33 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=false 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: The block deletion will start around 2017 Sep 26 07:07:33 
17/09/26 07:07:33 INFO util.GSet: Computing capacity for map BlocksMap 
17/09/26 07:07:33 INFO util.GSet: VM type  = 64-bit 
17/09/26 07:07:33 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB 
17/09/26 07:07:33 INFO util.GSet: capacity  = 2^21 = 2097152 entries 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: defaultReplication   = 3 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: maxReplication    = 512 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: minReplication    = 1 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: maxReplicationStreams  = 2 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: encryptDataTransfer  = false 
17/09/26 07:07:33 INFO blockmanagement.BlockManager: maxNumBlocksToLog   = 1000 
17/09/26 07:07:33 INFO namenode.FSNamesystem: fsOwner    = hduser (auth:SIMPLE) 
17/09/26 07:07:33 INFO namenode.FSNamesystem: supergroup   = supergroup 
17/09/26 07:07:33 INFO namenode.FSNamesystem: isPermissionEnabled = false 
17/09/26 07:07:33 INFO namenode.FSNamesystem: HA Enabled: false 
17/09/26 07:07:33 INFO namenode.FSNamesystem: Append Enabled: true 
17/09/26 07:07:34 INFO util.GSet: Computing capacity for map INodeMap 
17/09/26 07:07:34 INFO util.GSet: VM type  = 64-bit 
17/09/26 07:07:34 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB 
17/09/26 07:07:34 INFO util.GSet: capacity  = 2^20 = 1048576 entries 
17/09/26 07:07:34 INFO namenode.FSDirectory: ACLs enabled? false 
17/09/26 07:07:34 INFO namenode.FSDirectory: XAttrs enabled? true 
17/09/26 07:07:34 INFO namenode.NameNode: Caching file names occurring more than 10 times 
17/09/26 07:07:34 INFO util.GSet: Computing capacity for map cachedBlocks 
17/09/26 07:07:34 INFO util.GSet: VM type  = 64-bit 
17/09/26 07:07:34 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB 
17/09/26 07:07:34 INFO util.GSet: capacity  = 2^18 = 262144 entries 
17/09/26 07:07:34 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033 
17/09/26 07:07:34 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0 
17/09/26 07:07:34 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension  = 30000 
17/09/26 07:07:34 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10 
17/09/26 07:07:34 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10 
17/09/26 07:07:34 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25 
17/09/26 07:07:34 INFO namenode.FSNamesystem: Retry cache on namenode is enabled 
17/09/26 07:07:34 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis 
17/09/26 07:07:34 INFO util.GSet: Computing capacity for map NameNodeRetryCache 
17/09/26 07:07:34 INFO util.GSet: VM type  = 64-bit 
17/09/26 07:07:34 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB 
17/09/26 07:07:34 INFO util.GSet: capacity  = 2^15 = 32768 entries 
Re-format filesystem in Storage Directory /usr/local/hadoop/data/namenode ? (Y or N) 
$ Y 
17/09/26 07:09:21 INFO namenode.FSImage: Allocated new BlockPoolId: BP-793961451-10.0.0.190-1506409761821 
17/09/26 07:09:21 INFO common.Storage: Storage directory /usr/local/hadoop/data/namenode has been successfully formatted. 
17/09/26 07:09:21 INFO namenode.FSImageFormatProtobuf: Saving image file /usr/local/hadoop/data/namenode/current/fsimage.ckpt_0000000000000000000 using no compression 
17/09/26 07:09:21 INFO namenode.FSImageFormatProtobuf: Image file /usr/local/hadoop/data/namenode/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds. 
17/09/26 07:09:21 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0 
17/09/26 07:09:21 INFO util.ExitUtil: Exiting with status 0 
17/09/26 07:09:21 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************ 
SHUTDOWN_MSG: Shutting down NameNode at ec2-xx-xx-xx-01.eu-central-1.compute.amazonaws.com/10.0.0.190 
************************************************************/ 

When I start DFS and YARN, they appear to start correctly:

$ start-dfs.sh 
Starting namenodes on [ec2-xx-xx-xx-01.eu-central-1.compute.amazonaws.com] 
ec2-xx-xx-xx-01.eu-central-1.compute.amazonaws.com: starting namenode, logging to ... 
10.0.0.185: starting datanode, logging to ... 
10.0.0.244: starting datanode, logging to ... 
10.0.0.83: starting datanode, logging to ... 
Starting secondary namenodes [ec2-xx-xx-xx-01.eu-central-1.compute.amazonaws.com] 
ec2-xx-xx-xx-01.eu-central-1.compute.amazonaws.com: starting secondarynamenode, logging to ... 


$ start-yarn.sh 
starting yarn daemons 
starting resourcemanager, logging to ... 
10.0.0.185: starting nodemanager, logging to ... 
10.0.0.83: starting nodemanager, logging to ... 
10.0.0.244: starting nodemanager, logging to ... 

$ jps 
14326 NameNode 
14998 Jps 
14552 SecondaryNameNode 
14729 ResourceManager 

And on the other nodes it looks like this:

15880 Jps 
15563 DataNode 
15693 NodeManager 

However, when I try to write data to HDFS, it tells me that no nodes are actually available. This appears to be a very generic error, and I cannot find where the problem lies.

$ hdfs dfs -put pg1661.txt /samples/input 
WARN hdfs.DataStreamer: DataStreamer Exception 
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /samples/input/pg1661.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation. 
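When the client reports 0 running datanodes, the datanode logs on the slave machines usually say why registration failed. A minimal grep sketch (the log path and the excerpt below are fabricated for illustration; on a real node the logs live under `$HADOOP_HOME/logs/`):

```shell
# Fabricated excerpt standing in for a real datanode log such as
# $HADOOP_HOME/logs/hadoop-hduser-datanode-<host>.log
cat > /tmp/datanode.log <<'EOF'
2017-09-26 07:15:02 INFO  datanode.DataNode: Starting DataNode
2017-09-26 07:15:04 WARN  datanode.DataNode: Problem connecting to server: ec2-xx-xx-xx-01.eu-central-1.compute.amazonaws.com:9000
2017-09-26 07:15:06 FATAL datanode.DataNode: Initialization failed. java.io.IOException: Incompatible clusterIDs
EOF

# Pull out the lines that explain why the datanode never registered.
grep -E 'WARN|ERROR|FATAL' /tmp/datanode.log
```

A `Problem connecting to server` warning would point at networking (security groups, hostname resolution); an `Incompatible clusterIDs` fatal would point at stale datanode storage.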

Then, when I check the status, it does not seem to be working properly:

$ hdfs dfsadmin -report 
Configured Capacity: 0 (0 B) 
Present Capacity: 0 (0 B) 
DFS Remaining: 0 (0 B) 
DFS Used: 0 (0 B) 
DFS Used%: NaN% 
Under replicated blocks: 0 
Blocks with corrupt replicas: 0 
Missing blocks: 0 
Missing blocks (with replication factor 1): 0 
Pending deletion blocks: 0 

I checked the log files, and they do not show any (fatal) errors, except when trying to upload the file.

Since none of the above produces any errors at startup, and the error message itself is very generic, I am finding it hard to locate the fault.
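Not confirmed by the output above, but since the format run answered Y to a "Re-format filesystem" prompt, one common cause of "There are 0 datanode(s) running" is a clusterID mismatch: re-formatting gives the namenode a new clusterID, and datanodes whose storage directory still carries the old one refuse to register. A sketch of the check, using throwaway copies of the two VERSION files so it can run anywhere (the datanode path is an assumption; check `dfs.datanode.data.dir` on each slave):

```shell
# Compare the clusterID recorded by the namenode with the one a datanode
# kept from before the re-format. The real files would be e.g.
#   /usr/local/hadoop/data/namenode/current/VERSION   (on the master)
#   <dfs.datanode.data.dir>/current/VERSION           (on each datanode)
# The two files below are fabricated so the check can be shown end to end.
printf 'clusterID=CID-15524170-7dfa-481b-add9-4c2542a55ca5\n' > /tmp/nn_VERSION
printf 'clusterID=CID-0000-stale-id-from-before-reformat\n' > /tmp/dn_VERSION

nn_cid=$(grep -o 'clusterID=.*' /tmp/nn_VERSION)
dn_cid=$(grep -o 'clusterID=.*' /tmp/dn_VERSION)

if [ "$nn_cid" = "$dn_cid" ]; then
  echo "clusterIDs match"
else
  echo "clusterID mismatch: wipe the datanode data dir and restart it"
fi
```

The namenode-side value here is the one printed during the format above (`CID-15524170-...`); on a mismatch, clearing the datanode's data directory and restarting it lets it adopt the new clusterID.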

Answer


The output of your `hdfs dfsadmin -report` command shows a capacity of 0. It looks like you may have forgotten to format the namenode. You need to run the command below before starting HDFS.

hdfs namenode -format 

After this, the `hdfs dfsadmin -report` output should look similar to this:

Configured Capacity: 32195477504 (29.98 GB) 
Present Capacity: 29190479872 (27.19 GB) 
DFS Remaining: 29190471680 (27.19 GB) 
DFS Used: 8192 (8 KB) 
DFS Used%: 0.00% 
Under replicated blocks: 0 
Blocks with corrupt replicas: 0 
Missing blocks: 0 
Missing blocks (with replication factor 1): 0 
Pending deletion blocks: 0 

I have a video tutorial for a single-node setup at the link below; hope it helps you. It is for Hadoop version 2.8.1:

http://hadooptutorials.info/2017/09/14/hadoop-installation-on-signle-node-cluster/


Thx for your reply. I did run that command. The response ends with 'SHUTDOWN_MSG: Shutting down NameNode at ec2-xx-xx-xx-01.eu-central-1.compute.amazonaws.com/10.0.0.190'. Does that indicate the format command failed? It does not give any error message, other than telling me it shut down. I will update the question. – Dendrobates


I have included the response I get when I (try to) format the namenode. – Dendrobates


I think the shutdown message is normal when formatting the namenode. My guess is that the namenode may not be able to SSH into the data nodes. Have you defined the data nodes as separate servers, or on the same server? Maybe you could first try a single-node setup, i.e. namenode and datanode on the same server; once that works, try adding the other data nodes. That would isolate some of the problems. Could you also share your masters and slaves files, along with core-site.xml and hdfs-site.xml? –
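For reference, the core-site.xml the comment asks about might look like the sketch below (written as a shell heredoc; the port 9000 and the conf path are assumptions, and 10.0.0.190 is the master's IP from the logs above). The key point for a multi-node EC2 setup is that `fs.defaultFS` must name an address the datanodes can resolve and reach, never localhost:

```shell
# Stand-in for $HADOOP_HOME/etc/hadoop; adjust to your installation.
HADOOP_CONF=/tmp/hadoop-conf
mkdir -p "$HADOOP_CONF"

# fs.defaultFS must point at the master's address as seen *from the
# datanodes* -- with localhost here, each datanode would try to register
# with itself and the namenode would report 0 live datanodes.
cat > "$HADOOP_CONF/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://10.0.0.190:9000</value>
  </property>
</configuration>
EOF

grep 'hdfs://' "$HADOOP_CONF/core-site.xml"
```

Every datanode needs the same value in its own copy of core-site.xml, and the EC2 security group must allow the slaves to reach that port on the master.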
