2015-02-11

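Error in Pig's ToDate function

I have datetime data in my input and want to load it correctly in Pig. I searched and found that the recommended approach is to load it as chararray and then convert it to datetime with the ToDate function. However, the same script works for one input but fails for another input that has the same data format. My Pig version is 0.12.1. The script I use: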

A = load '/user/ss/debug/debug' using PigStorage(',') as (AUDIT:chararray,JOB:chararray,TYPE:chararray,ID:long,STATUS_ID:long,POOL_NAME:chararray,SLA_PRIORITY:long,STATUS:chararray,RUN_ID:long,TASK:chararray,SCENARIO_ID:long,CREDIT_CNT:long,COMM_CNT:long,BONUS_CNT:long,PAYMENT_CNT:long,RUN_TIME:long,START_TIME:chararray,END_TIME:chararray,ITEM_COUNT:long); 

B = foreach A generate JOB, TYPE, ID, CREDIT_CNT, COMM_CNT, BONUS_CNT, PAYMENT_CNT, ToDate(START_TIME, 'yyyy-MM-dd HH:mm:ss') as (START_TIME_DT:datetime), ToDate(END_TIME, 'yyyy-MM-dd HH:mm:ss') as (END_TIME_DT:datetime), START_TIME, END_TIME, ITEM_COUNT; 

dump B; 

The data looks like the following.

Input that reports the error:

D789FD70FE9E3ABBE0432165880A09E1,D789FD70FE9D3ABBE0432165880A09E1,VA,123,4946586,DEFAULT,1,Completed,,DD13,,0,0,0,0,0,2013-03-10 02:41:14,2013-03-10 02:41:16,0 

Input that runs correctly:

C888E618A7740A71E0432165880ABCA3,C888E618A7730A71E0432165880ABCA3,VA,123,4680120,DEFAULT,1,Completed,,DD12,,0,0,0,0,0,2012-08-31 04:16:56,2012-08-31 04:17:02,0 
C888FC5DA4B212F3E0432165880A3C34,C888FC5DA4B112F3E0432165880A3C34,VA,123,4680125,DEFAULT,1,Completed,,DD12,,0,0,0,0,0,2012-08-31 04:17:51,2012-08-31 04:17:57,0 
C888FC5DA4B912F3E0432165880A3C34,C888FC5DA4B812F3E0432165880A3C34,VA,123,4680127,DEFAULT,1,Completed,,DD14,,0,0,0,0,0,2012-08-31 04:18:17,2012-08-31 04:18:22,0 

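I don't understand why the same input schema and script can produce different results. The error says: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles).

The error log looks like the following: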

Backend error message
---------------------
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.ToDate2ARGS)[datetime] - scope-120 Operator Key: scope-120) children: null at []]: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:707)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:352)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
    at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:336)
    at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:672)
    at org.apache.pig.builtin.ToDate2ARGS.exec(ToDate2ARGS.java:45)
    at org.apache.pig.builtin.ToDate2ARGS.exec(ToDate2ARGS.java:33)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDateTime(POUserFunc.java:422)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:329)
    ... 13 more

Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias C. Backend error : Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.ToDate2ARGS)[datetime] - scope-120 Operator Key: scope-120) children: null at []]: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias C. Backend error : Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.ToDate2ARGS)[datetime] - scope-120 Operator Key: scope-120) children: null at []]: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
    at org.apache.pig.PigServer.openIterator(PigServer.java:870)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:198)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:541)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.ToDate2ARGS)[datetime] - scope-120 Operator Key: scope-120) children: null at []]: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:707)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:352)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.IllegalArgumentException: Cannot parse "2013-03-10 02:41:14": Illegal instant due to time zone offset transition (America/Los_Angeles)
    at org.joda.time.format.DateTimeParserBucket.computeMillis(DateTimeParserBucket.java:336)
    at org.joda.time.format.DateTimeFormatter.parseDateTime(DateTimeFormatter.java:672)
    at org.apache.pig.builtin.ToDate2ARGS.exec(ToDate2ARGS.java:45)
    at org.apache.pig.builtin.ToDate2ARGS.exec(ToDate2ARGS.java:33)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDateTime(POUserFunc.java:422)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:329)

Any help or suggestions would be greatly appreciated. Thanks a lot!

For anyone who finds this post while looking for [ERROR 1066: Unable to open iterator for alias](http://stackoverflow.com/questions/34495085/), here is a [generic solution](http://stackoverflow.com/a/34495086/983722). – 2015-12-28 14:32:50

Answer

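It looks like the datetime "2013-03-10 02:41:14" does not exist in the time zone 'America/Los_Angeles'. This is probably because of daylight saving time in the US: clocks moved forward on that date, so that wall-clock time was skipped. The same input works fine in my time zone. To solve this issue, you need to specify the time zone 'America/Los_Angeles' as the third argument of the ToDate function.

Can you change the ToDate call like this?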

ToDate(START_TIME, 'yyyy-MM-dd HH:mm:ss','America/Los_Angeles')
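
To see why only the 2013-03-10 row fails, the timestamps can be checked outside Pig. The following is a minimal Python sketch, not part of the Pig script: it assumes Python 3.9+ with the system time zone database available, and the helper name `exists_in_zone` is hypothetical. It attaches the zone to the wall-clock time, converts to UTC and back, and reports whether the value survives the round trip. A time that falls into a daylight saving gap does not.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Python equivalent of the Pig/Joda pattern 'yyyy-MM-dd HH:mm:ss'
FORMAT = "%Y-%m-%d %H:%M:%S"


def exists_in_zone(text: str, zone_name: str) -> bool:
    """Return True if the wall-clock time `text` actually occurs in `zone_name`.

    Hypothetical helper for illustration only; it is not a Pig or Joda-Time API.
    """
    zone = ZoneInfo(zone_name)
    wall = datetime.strptime(text, FORMAT)
    # Attach the zone, convert to UTC, then convert back to the same zone.
    # A wall-clock time inside a DST gap comes back shifted, so it no longer
    # matches the original value.
    round_trip = wall.replace(tzinfo=zone).astimezone(timezone.utc).astimezone(zone)
    return round_trip.replace(tzinfo=None) == wall


if __name__ == "__main__":
    # One START_TIME from the failing input, one from the working input.
    for stamp in ("2013-03-10 02:41:14", "2012-08-31 04:16:56"):
        print(stamp, exists_in_zone(stamp, "America/Los_Angeles"))
```

With 'America/Los_Angeles', the first timestamp is reported as non-existent and the second as valid, which matches the behaviour described in the question. Running the same check over the START_TIME and END_TIME columns before loading is one way to find rows that would trigger the error.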