2017-08-13

I am facing a problem with Hive on Tez: I cannot run MapReduce jobs from Hive on the Tez engine.

I can SELECT from a table that exists in Hive without any problem:

SELECT * FROM Transactions;

But when I try to use an aggregate function such as COUNT(*) on the same table:

SELECT COUNT(*) FROM Transactions;

I get the following in the Hive.log file:

2017-08-13T10:04:27,892 INFO [4a5b6a0c-9edb-45ea-8d49-b2f4b0d2b636 main] conf.HiveConf: Using the default value passed in for log id: 4a5b6a0c-9edb-45ea-8d49-b2f4b0d2b636
2017-08-13T10:04:27,910 INFO [4a5b6a0c-9edb-45ea-8d49-b2f4b0d2b636 main] session.SessionState: Error closing tez session
java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. appattempt_1498057873641_0017_000002 failed 2 times due to AM Container exited with exitCode: -1000
Failing this attempt. Diagnostics: java.io.FileNotFoundException: File /tmp/hadoop-hadoop/nm-local-dir/filecache does not exist
For more detailed output, check the application tracking page: http://hadoop-master:8090/cluster/app/application_1498057873641_0017 Then click on links to logs of each attempt.
. Failing the application.
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.isOpen(TezSessionState.java:173) ~[hive-exec-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.toString(TezSessionState.java:135) ~[hive-exec-2.1.1.jar:2.1.1]
	at java.lang.String.valueOf(String.java:2994) ~[?:1.8.0_131]
	at java.lang.StringBuilder.append(StringBuilder.java:131) ~[?:1.8.0_131]
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.closeIfNotDefault(TezSessionPoolManager.java:346) ~[hive-exec-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.ql.session.SessionState.close(SessionState.java:1524) [hive-exec-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.cli.CliSessionState.close(CliSessionState.java:66) [hive-cli-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:133) [hive-cli-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) [hive-cli-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) [hive-cli-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) [hive-cli-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) [hive-cli-2.1.1.jar:2.1.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_131]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_131]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_131]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_131]
	at org.apache.hadoop.util.RunJar.run(RunJar.java:234) [hadoop-common-2.8.0.jar:?]
	at org.apache.hadoop.util.RunJar.main(RunJar.java:148) [hadoop-common-2.8.0.jar:?]
Caused by: java.util.concurrent.ExecutionException: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. Application application_1498057873641_0017 failed 2 times because AM Container for appattempt_1498057873641_0017_000002 exited with exitCode: -1000
Failing this attempt. Diagnostics: java.io.FileNotFoundException: File /tmp/hadoop-hadoop/nm-local-dir/filecache does not exist
For more detailed output, check the application tracking page: http://hadoop-master:8090/cluster/app/application_1498057873641_0017 Then click on links to logs of each attempt.
. Failing the application.
	at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_131]
	at java.util.concurrent.FutureTask.get(FutureTask.java:206) ~[?:1.8.0_131]
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.isOpen(TezSessionState.java:168) ~[hive-exec-2.1.1.jar:2.1.1]
	... 17 more
Caused by: org.apache.tez.dag.api.SessionNotRunning: TezSession has already shutdown. appattempt_1498057873641_0017_000002 failed 2 times due to AM Container exited with exitCode: -1000
Failing this attempt. Diagnostics: java.io.FileNotFoundException: File /tmp/hadoop-hadoop/nm-local-dir/filecache does not exist
For more detailed output, check the application tracking page: http://hadoop-master:8090/cluster/app/application_1498057873641_0017 Then click on links to logs of each attempt.
. Failing the application.
	at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:914) ~[tez-api-0.8.4.jar:0.8.4]
	at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:883) ~[tez-api-0.8.4.jar:0.8.4]
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezSessionState.java:416) ~[hive-exec-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.access$000(TezSessionState.java:97) ~[hive-exec-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState$1.call(TezSessionState.java:333) ~[hive-exec-2.1.1.jar:2.1.1]
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState$1.call(TezSessionState.java:329) ~[hive-exec-2.1.1.jar:2.1.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_131]
	at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]

I solved this problem by creating the missing directory "/tmp/hadoop-hadoop/nm-local-dir/filecache" on all of my cluster nodes.
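That fix can be scripted. A minimal sketch, assuming passwordless SSH; the slave host names are examples only, substitute your own node list:

```shell
# Recreate the NodeManager local filecache directory that the
# FileNotFoundException above complains about.
NM_FILECACHE="/tmp/hadoop-hadoop/nm-local-dir/filecache"
mkdir -p "$NM_FILECACHE"    # on the local node

# Repeat on every other NodeManager (host names are hypothetical examples):
for host in hadoop-slave1 hadoop-slave2; do
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" \
    "mkdir -p '$NM_FILECACHE'" || echo "warning: could not reach $host"
done
```

The directory must exist (and be writable by the YARN user) on every node that runs a NodeManager, because container localization happens on whichever node the AM lands on.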

Then, when trying to do SELECT COUNT(*) FROM Transactions; again, I get an error, as below in Hive.log:

2017-08-13T10:06:35,567 INFO [main] optimizer.ColumnPrunerProcFactory: RS 3 oldColExprMap: {VALUE._col0=Column[_col0]}
2017-08-13T10:06:35,568 INFO [main] optimizer.ColumnPrunerProcFactory: RS 3 newColExprMap: {VALUE._col0=Column[_col0]}
2017-08-13T10:06:35,604 INFO [213ea036-8245-4042-a5a1-ccd686ea2465 main] Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
2017-08-13T10:06:35,658 INFO [main] annotation.StatsRulesProcFactory: STATS-GBY[2]: Equals 0 in number of rows. 0 rows will be set to 1
2017-08-13T10:06:35,679 INFO [main] optimizer.SetReducerParallelism: Number of reducers determined to be: 1
2017-08-13T10:06:35,680 INFO [main] parse.TezCompiler: Cycle free: true
2017-08-13T10:06:35,689 INFO [213ea036-8245-4042-a5a1-ccd686ea2465 main] Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
2017-08-13T10:06:35,741 INFO [main] parse.CalcitePlanner: Completed plan generation
2017-08-13T10:06:35,742 INFO [main] ql.Driver: Semantic Analysis Completed
2017-08-13T10:06:35,742 INFO [main] ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:c0, type:bigint, comment:null)], properties:null)
2017-08-13T10:06:35,744 INFO [main] exec.ListSinkOperator: Initializing operator LIST_SINK[7]
2017-08-13T10:06:35,745 INFO [main] ql.Driver: Completed compiling command(queryId=hadoop_20170813100633_31ca0425-6aca-434c-8039-48bc0e761095); Time taken: 2.131 seconds
2017-08-13T10:06:35,768 INFO [main] ql.Driver: Executing command(queryId=hadoop_20170813100633_31ca0425-6aca-434c-8039-48bc0e761095): select count(*) from Transactions
2017-08-13T10:06:35,768 INFO [main] ql.Driver: Query ID = hadoop_20170813100633_31ca0425-6aca-434c-8039-48bc0e761095
2017-08-13T10:06:35,768 INFO [main] ql.Driver: Total jobs = 1
2017-08-13T10:06:35,784 INFO [main] ql.Driver: Launching Job 1 out of 1
2017-08-13T10:06:35,784 INFO [main] ql.Driver: Starting task [Stage-1:MAPRED] in serial mode
2017-08-13T10:06:35,789 INFO [main] tez.TezSessionPoolManager: Current user: hadoop, session user: hadoop
2017-08-13T10:06:35,789 INFO [main] tez.TezSessionPoolManager: Current queue name is null incoming queue name is null
2017-08-13T10:06:35,838 INFO [213ea036-8245-4042-a5a1-ccd686ea2465 main] Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
2017-08-13T10:06:35,840 INFO [main] ql.Context: New scratch dir is hdfs://hadoop-master:8020/tmp/hive/hadoop/213ea036-8245-4042-a5a1-ccd686ea2465/hive_2017-08-13_10-06-33_614_5648783469307420794-1
2017-08-13T10:06:35,845 INFO [main] exec.Task: Session is already open
2017-08-13T10:06:35,847 INFO [main] tez.DagUtils: Localizing resource because it does not exist: file:/opt/apache-tez-0.8.4-bin to dest: hdfs://hadoop-master:8020/tmp/hive/hadoop/_tez_session_dir/213ea036-8245-4042-a5a1-ccd686ea2465/apache-tez-0.8.4-bin
2017-08-13T10:06:35,850 INFO [main] tez.DagUtils: Looks like another thread or process is writing the same file
2017-08-13T10:06:35,851 INFO [main] tez.DagUtils: Waiting for the file hdfs://hadoop-master:8020/tmp/hive/hadoop/_tez_session_dir/213ea036-8245-4042-a5a1-ccd686ea2465/apache-tez-0.8.4-bin (5 attempts, with 5000ms interval)
2017-08-13T10:07:00,860 ERROR [main] tez.DagUtils: Could not find the jar that was being uploaded
2017-08-13T10:07:00,861 ERROR [main] exec.Task: Failed to execute tez graph.
java.io.IOException: Previous writer likely failed to write hdfs://hadoop-master:8020/tmp/hive/hadoop/_tez_session_dir/213ea036-8245-4042-a5a1-ccd686ea2465/apache-tez-0.8.4-bin. Failing because I am unlikely to write too.
	at org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1022)
	at org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:902)
	at org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:845)
	at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.refreshLocalResourcesFromConf(TezSessionState.java:466)
	at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:294)
	at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2073)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1744)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1453)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
2017-08-13T10:07:00,880 ERROR [main] ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask

I also went through this Hive JIRA issue, "https://issues.apache.org/jira/browse/AMBARI-9821", but I still face this error when trying to do COUNT(*) on this table.

The Tez conf file:

<configuration> 
    <property> 
     <name>tez.lib.uris</name> 
     <value>hdfs://hadoop-master:8020/user/tez/apache-tez-0.8.4-bin/share/tez.tar.gz</value> 
     <type>string</type> 
    </property> 
</configuration> 

The Hive conf file:

<configuration> 
    <property> 
       <name>hive.server2.thrift.http.port</name> 
       <value>10001</value> 
     </property> 
     <property> 
       <name>hive.server2.thrift.http.min.worker.threads</name> 
       <value>5</value> 
     </property> 
     <property> 
       <name>hive.server2.thrift.http.max.worker.threads</name> 
       <value>500</value> 
     </property> 
     <property> 
       <name>hive.server2.thrift.http.path</name> 
       <value>cliservice</value> 
     </property> 
    <property> 
     <name>hive.server2.thrift.min.worker.threads</name> 
     <value>5</value> 
    </property> 
     <property> 
       <name>hive.server2.thrift.max.worker.threads</name> 
       <value>500</value> 
     </property> 
    <property> 
     <name>hive.server2.transport.mode</name> 
     <value>http</value> 
     <description>Server transport mode. "binary" or "http".</description> 
    </property> 
    <property> 
     <name>hive.server2.allow.user.substitution</name> 
     <value>true</value> 
    </property> 
    <property> 
     <name>hive.server2.authentication</name> 
     <value>NONE</value> 
    </property> 
    <property> 
     <name>hive.server2.thrift.bind.host</name> 
     <value>10.100.38.136</value> 
    </property> 
    <property> 
     <name>hive.support.concurrency</name> 
     <description>Enable Hive's Table Lock Manager Service</description> 
     <value>true</value> 
    </property> 
    <property> 
     <name>hive.zookeeper.quorum</name> 
     <description>Zookeeper quorum used by Hive's Table Lock Manager</description> 
     <value>hadoop-master,hadoop-slave1,hadoop-slave2,hadoop-slave3,hadoop-slave4,hadoop-slave5</value> 
    </property> 
    <property> 
     <name>hive.zookeeper.client.port</name> 
     <value>2181</value> 
     <description>The port at which the clients will connect.</description> 
    </property> 
    <property> 
     <name>javax.jdo.option.ConnectionURL</name> 
     <value>jdbc:derby://hadoop-master:1527/metastore_db2</value> 
     <description> 
      JDBC connect string for a JDBC metastore. 
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL. 
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database. 
     </description> 
    </property> 
    <property> 
     <name>hive.metastore.warehouse.dir</name> 
     <value>/user/hive/warehouse</value> 
     <description>location of default database for the warehouse</description> 
    </property> 
    <property> 
       <name>hive.server2.webui.host</name> 
       <value>10.100.38.136</value> 
     </property> 
     <property> 
       <name>hive.server2.webui.port</name> 
       <value>10010</value> 
     </property> 
    <!--<property> 
     <name>hive.metastore.local</name> 
     <value>true</value> 
    </property> 
    <property> 
     <name>hive.metastore.uris</name> 
     <value/> 
     <value>thrift://hadoop-master:9083</value> 
     <value>file:///source/apache-hive-2.1.1-bin/bin/metastore_db/</value> 
     <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description> 
    </property>--> 
    <property> 
     <name>javax.jdo.option.ConnectionDriverName</name> 
     <value>org.apache.derby.jdbc.ClientDriver</value> 
     <description>Driver class name for a JDBC metastore</description> 
    </property> 
    <property> 
     <name>javax.jdo.PersistenceManagerFactoryClass</name> 
     <value>org.datanucleus.api.jdo.JDOPersistenceManagerFactory</value> 
     <description>class implementing the jdo persistence</description> 
    </property> 
    <property> 
     <name>datanucleus.autoStartMechanism</name> 
     <value>SchemaTable</value> 
    </property> 
    <property> 
     <name>hive.execution.engine</name> 
     <value>tez</value> 
    </property> 
    <property> 
     <name>javax.jdo.option.ConnectionUserName</name> 
     <value>APP</value> 
    </property> 
    <property> 
     <name>javax.jdo.option.ConnectionPassword</name> 
     <value>mine</value> 
    </property> 
    <!--<property> 
     <name>datanucleus.autoCreateSchema</name> 
      <value>false</value> 
      <description>Creates necessary schema on a startup if one doesn't exist</description> 
    </property> --> 
</configuration> 

And these are the diagnostics from YARN:

Application application_1498057873641_0018 failed 2 times due to AM Container for appattempt_1498057873641_0018_000002 exited with exitCode: -103
Failing this attempt. Diagnostics: Container [pid=31779,containerID=container_1498057873641_0018_02_000001] is running beyond virtual memory limits. Current usage: 169.3 MB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1498057873641_0018_02_000001:
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 31786 31779 31779 31779 (java) 587 61 2710179840 43031 /opt/jdk-8u131/jdk1.8.0_131/bin/java -Xmx819m -Djava.io.tmpdir=/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1498057873641_0018/container_1498057873641_0018_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/hadoop-2.8.0/logs/userlogs/application_1498057873641_0018/container_1498057873641_0018_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel= org.apache.tez.dag.app.DAGAppMaster --session
|- 31779 31777 31779 31779 (bash) 0 0 115838976 306 /bin/bash -c /opt/jdk-8u131/jdk1.8.0_131/bin/java -Xmx819m -Djava.io.tmpdir=/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/application_1498057873641_0018/container_1498057873641_0018_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/opt/hadoop/hadoop-2.8.0/logs/userlogs/application_1498057873641_0018/container_1498057873641_0018_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster --session 1> /opt/hadoop/hadoop-2.8.0/logs/userlogs/application_1498057873641_0018/container_1498057873641_0018_02_000001/stdout 2> /opt/hadoop/hadoop-2.8.0/logs/userlogs/application_1498057873641_0018/container_1498057873641_0018_02_000001/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
For more detailed output, check the application tracking page: http://hadoop-master:8090/cluster/app/application_1498057873641_0018 Then click on links to logs of each attempt.
. Failing the application.
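The numbers in those diagnostics line up with YARN's default virtual-memory policy. As a sanity check (the 2.1 multiplier is the default value of yarn.nodemanager.vmem-pmem-ratio, assumed here rather than shown in the post):

```shell
# The AM container was granted 1 GB (1024 MB) of physical memory. YARN
# multiplies that by yarn.nodemanager.vmem-pmem-ratio (default 2.1) to get
# the virtual-memory ceiling, matching the "2.1 GB of virtual memory" above.
pmem_mb=1024
vmem_limit_mb=$(awk -v p="$pmem_mb" 'BEGIN { printf "%.1f", p * 2.1 }')
echo "virtual memory ceiling: ${vmem_limit_mb} MB"
```

Because the JVM used 2.6 GB of virtual memory, YARN killed the container (exit codes -103/143). Common remedies, offered here as suggestions and not part of the original post, are to raise yarn.nodemanager.vmem-pmem-ratio or to set yarn.nodemanager.vmem-check-enabled to false in yarn-site.xml on every NodeManager.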

Answer


Most likely you are hitting https://issues.apache.org/jira/browse/HIVE-16398. As a workaround, you have to add the following in /usr/hdp//hive/conf/hive-env.sh:

# Folder containing extra libraries required for hive compilation/execution can be controlled by: 
if [ "${HIVE_AUX_JARS_PATH}" != "" ]; then 
  if [ -f "${HIVE_AUX_JARS_PATH}" ]; then 
    export HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH} 
  elif [ -d "/usr/hdp/current/hive-webhcat/share/hcatalog" ]; then 
    export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar 
  fi 
elif [ -d "/usr/hdp/current/hive-webhcat/share/hcatalog" ]; then 
  export HIVE_AUX_JARS_PATH=/usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core.jar 
fi