2013-04-03 79 views
3

Nutch - Job failed - ERROR mapred.FileOutputCommitter - Mkdirs failed to create file

I am trying to follow the steps in the simple Nutch tutorial. This is my first time using Nutch.

Everything went fine until I ran the following command:

bin/nutch crawl bin/urls -dir crawl -depth 3 -topN 5 -threads 1 

It gave me the following error:

log4j:ERROR setFile(null,true) call failed 
java.io.FileNotFoundException: /usr/local/nutch/framework/apache-nutch-1.6/logs/hadoop.log (No such file or directory) 
    at java.io.FileOutputStream.open(Native Method) 
    at java.io.FileOutputStream.<init>(FileOutputStream.java:212) 
    at java.io.FileOutputStream.<init>(FileOutputStream.java:136) 
    at org.apache.log4j.FileAppender.setFile(FileAppender.java:290) 
    at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164) 
    at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216) 
    at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257) 
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133) 
    at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:97) 
    at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:689) 
    at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:647) 
    at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:544) 
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:440) 
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:476) 
    at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:471) 
    at org.apache.log4j.LogManager.<clinit>(LogManager.java:125) 
    at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73) 
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242) 
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:254) 
    at org.apache.nutch.crawl.Crawl.<clinit>(Crawl.java:43) 
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA]. 
solrUrl is not set, indexing will be skipped... 
crawl started in: crawl 
rootUrlDir = bin/urls 
threads = 1 
depth = 3 
solrUrl=null 
topN = 5 
Injector: starting at 2013-04-02 19:08:03 
Injector: crawlDb: crawl/crawldb 
Injector: urlDir: bin/urls 
Injector: Converting injected urls to crawl db entries. 
Injector: total number of urls rejected by filters: 0 
Injector: total number of urls injected after normalization and filtering: 1 
Injector: Merging injected urls into crawl db. 
Exception in thread "main" java.io.IOException: Job failed! 
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265) 
    at org.apache.nutch.crawl.Injector.inject(Injector.java:296) 
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:127) 
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) 
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55) 

My bin directory contains:

  1. nutch

  2. crawl

  3. urls/seeds.txt

I don't know where the problem is.

hadoop.log contains the following errors:

2013-04-03 17:33:18,370 ERROR mapred.FileOutputCommitter - Mkdirs failed to create file:/usr/local/nutch/framework/apache-nutch-1.6/bin/crawl/crawldb/1971189408/_temporary 

2013-04-03 17:33:21,394 WARN mapred.LocalJobRunner - job_local_0002 

java.io.IOException: The temporary job-output directory file:/usr/local/nutch/framework/apache-nutch-1.6/bin/crawl/crawldb/1971189408/_temporary doesn't exist! 
+0

Does your user have permission to create that folder/file? – nimeshjm

+0

Thanks nimeshjm! I ran the command with 'sudo -E'. The FileNotFoundException is gone, but `Exception in thread "main" java.io.IOException: Job failed!` is still there. – change

+0

The problem was with my crawl results directory. – change

Answers

0

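The problem is with -dir crawl

You need to specify the correct directory path/name, one that the user running Nutch is able to create and write to.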
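One way to check this before starting the crawl is sketched below. It is only an illustration, not something that ships with Nutch: `can_create_dir` is a hypothetical helper, and the sketch assumes that relative paths such as `crawl` and `logs` resolve against the directory you run the command from. It walks up to the nearest existing parent of the target and tests whether the current user can write there.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): succeed if the current user could
# create the given directory, or write into it if it already exists.
can_create_dir() {
    target=$1
    # Assumption: relative paths resolve against the current working directory.
    case $target in
        /*) ;;
        *) target=$PWD/$target ;;
    esac
    # Walk up until we reach a path component that already exists.
    probe=$target
    while [ ! -e "$probe" ]; do
        probe=$(dirname "$probe")
    done
    # The nearest existing ancestor must be a writable, searchable directory.
    [ -d "$probe" ] && [ -w "$probe" ] && [ -x "$probe" ]
}

# Check the crawl output directory and the log directory from the question.
for dir in crawl logs; do
    if can_create_dir "$dir"; then
        echo "$dir: writable or creatable by $(id -un)"
    else
        echo "$dir: NOT writable or creatable by $(id -un)"
    fi
done
```

If either line reports a failure, change the ownership of the parent directory or pass `-dir` a path under a directory you own, rather than running the whole crawl with sudo.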

+1

Where is it expected to point? I'm running into a similar problem. – Havnar
