2016-09-22

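Pig cannot read a local CSV, causing the job to fail

I'm relatively new to the Pig/Hadoop ecosystem and have hit a frustrating problem while trying to run a simple DUMP. I'm trying to run the Pig script below (the file is local, not on HDFS, so I start the Grunt shell with pig -x local).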

REGISTER utils.py USING jython AS utils; 
events = LOAD '../test/events.csv' USING PigStorage(',') AS (patientid:int, eventid:chararray, eventdesc:chararray, timestamp:chararray, value:float); 
events = FOREACH events GENERATE patientid, eventid, ToDate(timestamp, 'yyyy-MM-dd') AS etimestamp, value; 
DUMP events; 

However, when I do this I get the following error message (failed job summary first, full Pig stack trace below it):

Input(s): Failed to read data from "file:///bootcamp/test/events.csv" 
Output(s): Failed to produce result in "file/tmp/temp/305054006/tmp-908064458" 

Pig stack trace:

ERROR 1066: Unable to open iterator for alias events. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING 

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias events. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING 
at org.apache.pig.PigServer.openIterator(PigServer.java:925) 
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:746) 
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372) 
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230) 
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205) 
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66) 
at org.apache.pig.Main.run(Main.java:558) 
at org.apache.pig.Main.main(Main.java:170) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
at java.lang.reflect.Method.invoke(Method.java:606) 
at org.apache.hadoop.util.RunJar.run(RunJar.java:221) 
at org.apache.hadoop.util.RunJar.main(RunJar.java:136) 
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING 
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:822) 
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:452) 
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280) 
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390) 
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375) 
at org.apache.pig.PigServer.storeEx(PigServer.java:1034) 
at org.apache.pig.PigServer.store(PigServer.java:997) 
at org.apache.pig.PigServer.openIterator(PigServer.java:910) 
... 13 more 
Caused by: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING 
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:294) 
at org.apache.hadoop.mapreduce.Job.getTaskReports(Job.java:540) 
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getTaskReports(HadoopShims.java:235) 
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:801) 
...20 more 

I've seen similar questions about failed jobs, but unfortunately I haven't managed to find a solution so far.

EDIT: I should mention that I run into the same problem when following the Pig tutorial at the link below.

http://www.sunlab.org/teaching/cse8803/fall2016/lab/hadoop-pig/


See the answer below; I accidentally posted it as a comment. – mongolol

Answer

So, I found that I was able to "dump" the file as follows:

tmp = LIMIT events 100000; -- any int larger than the number of rows
dump tmp; 

I had seen a similar problem here and was able to resolve it by running as root.
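
If running as root makes the job succeed, one plausible explanation (an assumption, not something the logs above confirm) is that the current user cannot read the input file or cannot write to the local temp directory Pig uses for DUMP output. A minimal Python sketch to check both from the same account that launches pig -x local; the helper name check_local_access is hypothetical, and the paths are taken from the job summary and may differ on your machine:

```python
import os
import tempfile


def check_local_access(input_path, temp_dir):
    """Report whether the current user can read input_path and write to temp_dir."""
    return {
        "input_exists": os.path.isfile(input_path),
        "input_readable": os.access(input_path, os.R_OK),
        # Creating files in a directory needs both write and execute permission.
        "temp_writable": os.path.isdir(temp_dir)
        and os.access(temp_dir, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    # Assumed paths: the input path comes from the "Failed to read data" line,
    # and the system temp directory stands in for Pig's local temp location.
    report = check_local_access("/bootcamp/test/events.csv", tempfile.gettempdir())
    for name, ok in report.items():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

If any line prints FAILED, fixing the ownership or permissions on that path is likely a safer fix than running Pig as root.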
