Hadoop word count example

I am new to Hadoop and have just installed Hadoop 2.6.

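The system seems to start OK. I tried to run the word count example, and the problem is that everything appears to run and the output folder is created with 2 files: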

-rw-r--r-- 1 yoni supergroup 0 2016-04-30 02:11 /user/yoni/output100/_SUCCESS
-rw-r--r-- 1 yoni supergroup 0 2016-04-30 02:11 /user/yoni/output100/part-r-00000

But the file part-r-00000 is empty, and I don't know where to start looking for the cause.

Here is the job log:

16/04/30 20:30:33 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 
16/04/30 20:30:34 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String). 
16/04/30 20:30:34 INFO input.FileInputFormat: Total input paths to process : 1 
16/04/30 20:30:34 INFO mapreduce.JobSubmitter: number of splits:1 
16/04/30 20:30:34 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461971181442_0005 
16/04/30 20:30:34 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources. 
16/04/30 20:30:34 INFO impl.YarnClientImpl: Submitted application application_1461971181442_0005 
16/04/30 20:30:34 INFO mapreduce.Job: The url to track the job: http://yoni-Lenovo-Z40-70:8088/proxy/application_1461971181442_0005/ 
16/04/30 20:30:34 INFO mapreduce.Job: Running job: job_1461971181442_0005 
16/04/30 20:30:41 INFO mapreduce.Job: Job job_1461971181442_0005 running in uber mode : false 
16/04/30 20:30:41 INFO mapreduce.Job: map 0% reduce 0% 
16/04/30 20:30:46 INFO mapreduce.Job: map 100% reduce 0% 
16/04/30 20:30:51 INFO mapreduce.Job: map 100% reduce 100% 
16/04/30 20:30:52 INFO mapreduce.Job: Job job_1461971181442_0005 completed successfully 
16/04/30 20:30:52 INFO mapreduce.Job: Counters: 49 
    File System Counters 
     FILE: Number of bytes read=6 
     FILE: Number of bytes written=211511 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=170 
     HDFS: Number of bytes written=86 
     HDFS: Number of read operations=6 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=2 
    Job Counters 
     Launched map tasks=1 
     Launched reduce tasks=1 
     Data-local map tasks=1 
     Total time spent by all maps in occupied slots (ms)=2923 
     Total time spent by all reduces in occupied slots (ms)=2526 
     Total time spent by all map tasks (ms)=2923 
     Total time spent by all reduce tasks (ms)=2526 
     Total vcore-seconds taken by all map tasks=2923 
     Total vcore-seconds taken by all reduce tasks=2526 
     Total megabyte-seconds taken by all map tasks=2993152 
     Total megabyte-seconds taken by all reduce tasks=2586624 
    Map-Reduce Framework 
     Map input records=1 
     Map output records=0 
     Map output bytes=0 
     Map output materialized bytes=6 
     Input split bytes=116 
     Combine input records=0 
     Combine output records=0 
     Reduce input groups=0 
     Reduce shuffle bytes=6 
     Reduce input records=0 
     Reduce output records=0 
     Spilled Records=0 
     Shuffled Maps =1 
     Failed Shuffles=0 
     Merged Map outputs=1 
     GC time elapsed (ms)=166 
     CPU time spent (ms)=1620 
     Physical memory (bytes) snapshot=426713088 
     Virtual memory (bytes) snapshot=3818450944 
     Total committed heap usage (bytes)=324009984 
    Shuffle Errors 
     BAD_ID=0 
     CONNECTION=0 
     IO_ERROR=0 
     WRONG_LENGTH=0 
     WRONG_MAP=0 
     WRONG_REDUCE=0 
    File Input Format Counters 
     Bytes Read=54 
    File Output Format Counters 
     Bytes Written=86 
16/04/30 20:30:52 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032 
16/04/30 20:30:52 WARN mapreduce.JobSubmitter: No job jar file set. User classes may not be found. See Job or Job#setJar(String). 
16/04/30 20:30:52 INFO input.FileInputFormat: Total input paths to process : 1 
16/04/30 20:30:52 INFO mapreduce.JobSubmitter: number of splits:1 
16/04/30 20:30:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1461971181442_0006 
16/04/30 20:30:52 INFO mapred.YARNRunner: Job jar is not present. Not adding any jar to the list of resources. 
16/04/30 20:30:52 INFO impl.YarnClientImpl: Submitted application application_1461971181442_0006 
16/04/30 20:30:52 INFO mapreduce.Job: The url to track the job: http://yoni-Lenovo-Z40-70:8088/proxy/application_1461971181442_0006/ 
16/04/30 20:30:52 INFO mapreduce.Job: Running job: job_1461971181442_0006 
16/04/30 20:31:01 INFO mapreduce.Job: Job job_1461971181442_0006 running in uber mode : false 
16/04/30 20:31:01 INFO mapreduce.Job: map 0% reduce 0% 
16/04/30 20:31:07 INFO mapreduce.Job: map 100% reduce 0% 
16/04/30 20:31:12 INFO mapreduce.Job: map 100% reduce 100% 
16/04/30 20:31:13 INFO mapreduce.Job: Job job_1461971181442_0006 completed successfully 
16/04/30 20:31:13 INFO mapreduce.Job: Counters: 49 
    File System Counters 
     FILE: Number of bytes read=6 
     FILE: Number of bytes written=210495 
     FILE: Number of read operations=0 
     FILE: Number of large read operations=0 
     FILE: Number of write operations=0 
     HDFS: Number of bytes read=216 
     HDFS: Number of bytes written=0 
     HDFS: Number of read operations=7 
     HDFS: Number of large read operations=0 
     HDFS: Number of write operations=2 
    Job Counters 
     Launched map tasks=1 
     Launched reduce tasks=1 
     Data-local map tasks=1 
     Total time spent by all maps in occupied slots (ms)=3739 
     Total time spent by all reduces in occupied slots (ms)=3133 
     Total time spent by all map tasks (ms)=3739 
     Total time spent by all reduce tasks (ms)=3133 
     Total vcore-seconds taken by all map tasks=3739 
     Total vcore-seconds taken by all reduce tasks=3133 
     Total megabyte-seconds taken by all map tasks=3828736 
     Total megabyte-seconds taken by all reduce tasks=3208192 
    Map-Reduce Framework 
     Map input records=0 
     Map output records=0 
     Map output bytes=0 
     Map output materialized bytes=6 
     Input split bytes=130 
     Combine input records=0 
     Combine output records=0 
     Reduce input groups=0 
     Reduce shuffle bytes=6 
     Reduce input records=0 
     Reduce output records=0 
     Spilled Records=0 
     Shuffled Maps =1 
     Failed Shuffles=0 
     Merged Map outputs=1 
     GC time elapsed (ms)=125 
     CPU time spent (ms)=1010 
     Physical memory (bytes) snapshot=427823104 
     Virtual memory (bytes) snapshot=3819626496 
     Total committed heap usage (bytes)=324534272 
    Shuffle Errors 
     BAD_ID=0 
     CONNECTION=0 
     IO_ERROR=0 
     WRONG_LENGTH=0 
     WRONG_MAP=0 
     WRONG_REDUCE=0 
    File Input Format Counters 
     Bytes Read=86 
    File Output Format Counters 
     Bytes Written=0

I am running the word count example that ships in the Hadoop installation directory:

hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep /user/yoni/input /user/yoni/output101 'dfs[a-z.]+'

The setup is in pseudo-distributed mode, as in all the basic setups.


2016-04-29 Yonihu

I don't think 'grep /user/yoni/input /user/yoni/output101 'dfs[a-z.]+'' are valid arguments for your jar. If they are, and grep returned nothing, then yes, you will get an empty result.

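According to the counters, your job received a single input record ('Map input records=1') and found nothing matching the given pattern ('Map output records=0'). That is why you get empty output ('Reduce output records=0'). '_SUCCESS' only means that the Hadoop framework was able to complete your job, nothing more. The number of 'part-xxxxx' files equals the number of reducers. Each of them may be empty if the corresponding reducer produced no output records. – gudok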
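To see why a non-matching input leads to an empty part file, the search can be approximated locally with ordinary grep. This is only a rough sketch, not what the Hadoop example runs internally: the function name count_matches and both sample files are made up for illustration, and it assumes a grep that supports the -o option (GNU and BSD grep do).

```shell
#!/bin/sh
# Local approximation of the grep example: extract every match of the
# pattern and count how often each distinct match occurs.
# The function name and the sample inputs are made up for illustration.
count_matches() {
    # $1 = file to scan, $2 = extended regular expression
    grep -oE "$2" "$1" | sort | uniq -c | sort -rn
}

workdir=$(mktemp -d)

# Input resembling a Hadoop config file: contains text matching the pattern.
printf '<name>dfs.replication</name>\n<name>dfs.replication</name>\n' > "$workdir/with_match.xml"
# Input with no "dfs" followed by lowercase letters or dots.
printf 'hello world, this is a plain text line\n' > "$workdir/no_match.txt"

matched=$(count_matches "$workdir/with_match.xml" 'dfs[a-z.]+')
unmatched=$(count_matches "$workdir/no_match.txt" 'dfs[a-z.]+')

echo "with match: $matched"
echo "without match: [$unmatched]"

rm -rf "$workdir"
```

The first input yields a count of 2 for dfs.replication, while the second yields nothing at all, which corresponds to 'Map output records=0' and an empty part file in the job above.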

For this example, you should put all the XML files from hadoop-2.6.4/etc/hadoop into an HDFS folder named 'input' under the correct user home directory, which here is yoni.

So first, check the status of your HDFS daemons by browsing to http://localhost:50070 (the default).

Second, check the state of your files with bin/hdfs dfs -ls /user/yoni/input or bin/hdfs fsck / -files -blocks.

If all of that looks fine, it should work.
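The steps in this answer could be scripted roughly as follows. Treat it as a sketch under assumptions: the variables HADOOP_HOME, HDFS_USER and DRY_RUN and the helper functions are illustrative and not part of the answer, and the script assumes it is run from the Hadoop installation directory unless HADOOP_HOME is set. By default it only prints the commands; set DRY_RUN=0 to execute them against a running cluster.

```shell
#!/bin/sh
# Sketch: stage the Hadoop config XML files as job input, then list them.
# HADOOP_HOME, HDFS_USER and DRY_RUN are illustrative variables.
HADOOP_HOME=${HADOOP_HOME:-.}
HDFS_USER=${HDFS_USER:-yoni}
DRY_RUN=${DRY_RUN:-1}
INPUT_DIR="/user/$HDFS_USER/input"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stage_input() {
    # Create the input directory under the user's HDFS home directory.
    run "$HADOOP_HOME/bin/hdfs" dfs -mkdir -p "$INPUT_DIR"
    # Copy the local XML config files into it.
    run "$HADOOP_HOME/bin/hdfs" dfs -put "$HADOOP_HOME"/etc/hadoop/*.xml "$INPUT_DIR"
    # List the directory to confirm the files arrived and are non-empty.
    run "$HADOOP_HOME/bin/hdfs" dfs -ls "$INPUT_DIR"
}

stage_input
```

Once the listing shows the XML files with non-zero sizes, rerunning the grep example with a fresh output directory should give the mappers something to match.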

Hadoop MapReduce Next Generation - Setting up a Single Node Cluster


2016-05-01 03:15:30
