2017-09-23

Word count job in Hadoop hangs: it compiles, submits, gets accepted, and never terminates

After successfully configuring a Hadoop cluster on AWS EC2, issuing the jps command on each type of node produces the expected output. For example, on a worker node:

2753 NodeManager 
2614 DataNode 
3051 Jps 
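
For reference, a master wired up this way would be expected to show something along these lines (PIDs will differ; this assumes the NameNode, SecondaryNameNode and ResourceManager all run on the master):

2247 NameNode 
2405 SecondaryNameNode 
2583 ResourceManager 
2906 Jps 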

Following the standard Apache tutorial, I created a word-count program and completed all the required steps, compiling the Java class and building the .jar as described here.
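
For completeness, the compile and packaging steps from that tutorial look like this (relying on HADOOP_CLASSPATH pointing at tools.jar, as set in my ~/.bashrc below):

$HADOOP_HOME/bin/hadoop com.sun.tools.javac.Main WordCount.java 
jar cf wc.jar WordCount*.class 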

However, when I execute the program with the following command:

$HADOOP_HOME/bin/hadoop jar wc.jar WordCount /user/wordcount /user/output2 

the job simply hangs, with my console showing the following output:

(screenshot of the console output)

The administration web interface shows the following:

(screenshot of the administration web UI)

Could this be related to my YARN configuration?

In setting up this environment I mostly followed this tutorial.

Here is how my configuration files are arranged:

yarn-site.xml

<configuration> 
    <property> 
     <name>yarn.scheduler.minimum-allocation-mb</name> 
     <value>128</value> 
     <description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description> 
    </property> 
    <property> 
     <name>yarn.scheduler.maximum-allocation-mb</name> 
     <value>2048</value> 
     <description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description> 
    </property> 
    <property> 
     <name>yarn.scheduler.minimum-allocation-vcores</name> 
     <value>1</value> 
     <description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description> 
    </property> 
    <property> 
     <name>yarn.scheduler.maximum-allocation-vcores</name> 
     <value>2</value> 
     <description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description> 
    </property> 
    <property> 
     <name>yarn.nodemanager.resource.memory-mb</name> 
     <value>4096</value> 
     <description>Physical memory, in MB, to be made available to running containers</description> 
    </property> 
    <property> 
     <name>yarn.nodemanager.resource.cpu-vcores</name> 
     <value>4</value> 
     <description>Number of CPU cores that can be allocated for containers.</description> 
    </property> 
</configuration> 
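
As a sanity check on these limits (not part of the original post), you can list the resources the ResourceManager actually sees:

# shows the NodeManagers registered with the ResourceManager; if none
# report as RUNNING, no container can ever be allocated and submitted
# jobs sit in the ACCEPTED state forever
yarn node -list 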

mapred-site.xml

<configuration> 
    <property> 
    <name>mapreduce.framework.name</name> 
    <value>yarn</value> 
    </property> 
    <property> 
    <name>mapreduce.jobhistory.address</name> 
    <value>master:10020</value> 
    </property> 
    <property> 
    <name>mapreduce.jobhistory.webapp.address</name> 
    <value>master:19888</value> 
    </property> 
    <property> 
    <name>yarn.app.mapreduce.am.staging-dir</name> 
    <value>/user/app</value> 
    </property> 
    <property> 
    <name>mapred.child.java.opts</name> 
    <value>-Djava.security.egd=file:/dev/../dev/urandom</value> 
    </property> 
</configuration> 
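
Since mapreduce.jobhistory.address points at master:10020, the JobHistory server must actually be running there; on Hadoop 2.x it is started with:

$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver 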

hdfs-site.xml

<configuration> 
    <property> 
    <name>dfs.namenode.name.dir</name> 
    <value>file:/usr/local/hadoop_work/hdfs/namenode</value> 
    </property> 
    <property> 
    <name>dfs.datanode.data.dir</name> 
    <value>file:/usr/local/hadoop_work/hdfs/datanode</value> 
    </property> 
    <property> 
    <name>dfs.namenode.checkpoint.dir</name> 
    <value>file:/usr/local/hadoop_work/hdfs/namesecondary</value> 
    </property> 
    <property> 
    <name>dfs.replication</name> 
    <value>2</value> 
    </property> 
    <property> 
    <name>dfs.secondary.http.address</name> 
    <value>172.31.46.85:50090</value> 
    </property> 
</configuration> 
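
With dfs.replication set to 2, both DataNodes need to be alive; a standard way to verify:

# prints cluster capacity and per-DataNode status; "Live datanodes (2)"
# is what you want to see here
hdfs dfsadmin -report 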

core-site.xml

<configuration> 
    <property> 
    <name>fs.defaultFS</name> 
    <value>hdfs://master:8020/</value> 
    </property> 
    <property> 
    <name>fs.default.name</name> 
    <value>hdfs://master:9000/</value> 
    </property> 
    <property> 
    <name>hadoop.tmp.dir</name> 
    <value>/tmp</value> 
    <description>A base for other temporary directories.</description> 
    </property> 
</configuration> 

It may also be important to see how my ~/.bashrc is configured; apart from the boilerplate, it looks like this:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 
export PATH=${JAVA_HOME}/jre/lib:${PATH} 
export HADOOP_CLASSPATH=${JAVA_HOME}/lib/tools.jar 

# export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 
# adding support for jre 
export PATH=$PATH:$JAVA_HOME/jre/bin 
export HADOOP_HOME=/usr/local/hadoop 
export PATH=$PATH:$HADOOP_HOME/bin 
export PATH=$PATH:$HADOOP_HOME/sbin 
export HADOOP_MAPRED_HOME=$HADOOP_HOME 
export HADOOP_COMMON_HOME=$HADOOP_HOME 
export HADOOP_HDFS_HOME=$HADOOP_HOME 
export YARN_HOME=$HADOOP_HOME 
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native 
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib" 
export CLASSPATH=$CLASSPATH:/usr/local/hadoop/lib/*:. 

#trying to get datanode to work :/ 
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop 

export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.egd=file:/dev/../dev/urandom" 
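
After editing ~/.bashrc, reload it and check that the paths resolve (a routine check, not from the original post):

source ~/.bashrc 
hadoop version   # should print the installed Hadoop build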

Check the logs! – owaishanif786
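
A sketch of what that can look like with YARN's own tooling (the application ID is a placeholder; yarn logs requires log aggregation to be enabled):

# list submitted applications together with their state (ACCEPTED, RUNNING, ...)
yarn application -list 
# fetch the logs of one of them; substitute a real ID from the list above
yarn logs -applicationId <application_id> 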

Answer


Make sure you delete everything inside these directories:

/usr/local/hadoop_work/hdfs/namenode/ 
/usr/local/hadoop_work/hdfs/datanode 
/usr/local/hadoop_work/hdfs/namesecondary 

Usually something along the lines of rm -rf current/ in each of them does the trick.
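
A sketch of that cleanup across all nodes (assuming passwordless SSH to the hosts listed in /etc/hosts below; the paths are the ones from hdfs-site.xml):

# wipe stale HDFS state on every node; the brace expansion happens
# locally and the three resulting paths are passed to the remote rm
for host in master slave1 slave2; do 
  ssh "$host" rm -rf /usr/local/hadoop_work/hdfs/{namenode,datanode,namesecondary}/current 
done 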

The corresponding working configuration:

yarn-site.xml

<configuration> 
    <property> 
    <name>yarn.nodemanager.aux-services</name> 
    <value>mapreduce_shuffle</value> 
    </property> 
    <property> 
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name> 
    <value>org.apache.hadoop.mapred.ShuffleHandler</value> 
    </property> 
    <property> 
    <name>yarn.resourcemanager.hostname</name> 
    <value>master</value> 
    </property> 
</configuration> 

It turns out that setting yarn.resourcemanager.hostname is really important; this is the piece I was stuck on for a while :/
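
After changing yarn-site.xml, YARN needs a restart on the master for the new hostname to take effect (standard Hadoop 2.x scripts assumed):

$HADOOP_HOME/sbin/stop-yarn.sh 
$HADOOP_HOME/sbin/start-yarn.sh 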

core-site.xml

<configuration> 
    <property> 
    <name>fs.defaultFS</name> 
    <value>hdfs://master:9000</value> 
    </property> 
</configuration> 
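
Note that only fs.defaultFS is kept here: fs.default.name is its deprecated alias, and the original core-site.xml set both with disagreeing ports (8020 vs 9000), so clients and daemons could not even agree on where the NameNode lives.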

mapred-site.xml

<configuration> 
    <property> 
    <name>mapreduce.framework.name</name> 
    <value>yarn</value> 
    </property> 
</configuration> 

hdfs-site.xml

<configuration> 
    <property> 
    <name>dfs.replication</name> 
    <value>1</value> 
    </property> 
    <property> 
    <name>dfs.namenode.name.dir</name> 
    <value>file:/usr/local/hadoop_work/hdfs/namenode</value> 
    </property> 
    <property> 
    <name>dfs.namenode.checkpoint.dir</name> 
    <value>file:/usr/local/hadoop_work/hdfs/namesecondary</value> 
    </property> 
    <property> 
    <name>dfs.datanode.data.dir</name> 
    <value>file:/usr/local/hadoop_work/hdfs/datanode</value> 
    </property> 
    <property> 
    <name>dfs.secondary.http.address</name> 
    <value>172.31.46.85:50090</value> 
    </property> 
</configuration> 
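
With the old storage directories wiped (see above), HDFS has to be reformatted before the daemons come back up (standard commands, run on the master):

hdfs namenode -format 
$HADOOP_HOME/sbin/start-dfs.sh 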

/etc/hosts

666.13.46.70 master 
666.13.35.80 slave1 
666.13.43.131 slave2 
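
A quick check that these names resolve and that the worker daemons actually came up (slave1 as an example):

ssh slave1 jps   # should list DataNode and NodeManager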

Essentially, this is what you want to be seeing:

(screenshot of the expected state)

Executing the commands...

For the plain tutorial example:

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /input /output 

And for this example:

$HADOOP_HOME/bin/hadoop jar wc.jar WordCount /input /output
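
Either variant assumes /input already exists in HDFS; a minimal staging and verification sketch (localfile.txt is a placeholder for your own text file):

hdfs dfs -mkdir -p /input 
hdfs dfs -put localfile.txt /input/ 
# once the job succeeds, the word counts land in /output
hdfs dfs -cat /output/part-r-00000 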