2016-09-21 115 views

I am getting a LeaseExpiredException in my Hadoop cluster - Why is HDFS throwing LeaseExpiredException in a Hadoop cluster (AWS EMR)?

tail -f /var/log/hadoop-hdfs/hadoop-hdfs-namenode-ip-172-30-2-148.log

2016-09-21 11:54:14,533 INFO BlockStateChange (IPC Server handler 10 on 8020): BLOCK* InvalidateBlocks: add blk_1073747501_6677 to 172.30.2.189:50010
2016-09-21 11:54:14,534 INFO org.apache.hadoop.ipc.Server (IPC Server handler 31 on 8020): IPC Server handler 31 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.complete from 172.30.2.189:37674 Call#34 Retry#0: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /tmp/hive/hadoop/_tez_session_dir/1e4f71f0-9f29-468d-980e-9f19690bf849/.tez/application_1474442135017_0114/recovery/1/summary (inode 26350): File does not exist. Holder DFSClient_NONMAPREDUCE_-143782605_1 does not have any open files.
2016-09-21 11:54:15,557 INFO org.apache.hadoop.hdfs.StateChange (IPC Server handler 0 on 8020): BLOCK* allocate blk_1073747503_6679{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-86592ba7-c51a-431d-8019-9e362d721b28:NORMAL:172.30.2.189:50010|RBW]]} for /var/log/hadoop-yarn/apps/hadoop/logs/application_1474442135017_0114/ip-172-30-2-122.us-west-2.compute.internal_8041.tmp

Also, some Hive queries are failing. I suspect this is because of the issue above.

tail -f /var/log/hive/hive-server2.log

2016-09-21T11:59:35,126 INFO [HiveServer2-Background-Pool: Thread-3883([])]: ql.Driver (Driver.java:execute(1477)) - Executing command(queryId=hive_20160921115934_c56d9c91-640b-4f5d-b490-34549a4258c7): 
INSERT INTO TABLE validation_logs 
SELECT 
"18364", 
"TABLE_VALIDATION", 
error.code, 
error.validator, 
get_json_object(key, '$.table_name'), 
NULL, 
NULL, 
error.failure_msg, 
FROM_UNIXTIME(UNIX_TIMESTAMP('20160921','yyyyMMdd')), 
from_unixtime(unix_timestamp()) 
FROM 
(SELECT 
MAP(concat("{\"table_name\" : \"", table_name , "\"}"), error) AS err_map 
FROM table_level_validation_result 
) AS res 
LATERAL VIEW EXPLODE(res.err_map) tmp AS key, error WHERE error IS NOT NULL AND (error.code="error" OR error.code="warn") 

2016-09-21T11:59:35,126 INFO [HiveServer2-Background-Pool: Thread-3883([])]: ql.Driver (SessionState.java:printInfo(1054)) - Query ID = hive_20160921115934_c56d9c91-640b-4f5d-b490-34549a4258c7 
2016-09-21T11:59:35,126 INFO [HiveServer2-Background-Pool: Thread-3883([])]: ql.Driver (SessionState.java:printInfo(1054)) - Total jobs = 1 
2016-09-21T11:59:35,127 INFO [HiveServer2-Background-Pool: Thread-3883([])]: ql.Driver (SessionState.java:printInfo(1054)) - Launching Job 1 out of 1 
2016-09-21T11:59:35,127 INFO [HiveServer2-Background-Pool: Thread-3883([])]: ql.Driver (Driver.java:launchTask(1856)) - Starting task [Stage-1:MAPRED] in serial mode 
2016-09-21T11:59:35,127 INFO [HiveServer2-Background-Pool: Thread-3883([])]: tez.TezSessionPoolManager (TezSessionPoolManager.java:canWorkWithSameSession(404)) - The current user: hadoop, session user: hadoop 
2016-09-21T11:59:35,127 INFO [HiveServer2-Background-Pool: Thread-3883([])]: tez.TezSessionPoolManager (TezSessionPoolManager.java:canWorkWithSameSession(421)) - Current queue name is null incoming queue name is null 
2016-09-21T11:59:35,173 INFO [HiveServer2-Background-Pool: Thread-3883([])]: ql.Context (Context.java:getMRScratchDir(340)) - New scratch dir is hdfs://ip-172-30-2-148.us-west-2.compute.internal:8020/tmp/hive/hadoop/65cf7f02-a7d3-40ba-a93f-ff5214afbdfc/hive_2016-09-21_11-59-34_474_5003281239065359634-127 
2016-09-21T11:59:35,174 INFO [HiveServer2-Background-Pool: Thread-3883([])]: exec.Task (TezTask.java:updateSession(279)) - Session is already open 
2016-09-21T11:59:35,175 INFO [HiveServer2-Background-Pool: Thread-3883([])]: tez.DagUtils (DagUtils.java:createLocalResource(758)) - Resource modification time: 1474459142291 for hdfs://ip-172-30-2-148.us-west-2.compute.internal:8020/tmp/hive/hadoop/_tez_session_dir/85d36c12-c629-44a8-b23c-c628898a79b7/commons-vfs2-2.0.jar 
2016-09-21T11:59:35,176 INFO [HiveServer2-Background-Pool: Thread-3883([])]: tez.DagUtils (DagUtils.java:createLocalResource(758)) - Resource modification time: 1474459142320 for hdfs://ip-172-30-2-148.us-west-2.compute.internal:8020/tmp/hive/hadoop/_tez_session_dir/85d36c12-c629-44a8-b23c-c628898a79b7/emr-ddb-hive.jar 
2016-09-21T11:59:35,177 INFO [HiveServer2-Background-Pool: Thread-3883([])]: tez.DagUtils (DagUtils.java:createLocalResource(758)) - Resource modification time: 1474459142353 for hdfs://ip-172-30-2-148.us-west-2.compute.internal:8020/tmp/hive/hadoop/_tez_session_dir/85d36c12-c629-44a8-b23c-c628898a79b7/emr-hive-goodies.jar 
2016-09-21T11:59:35,178 INFO [HiveServer2-Background-Pool: Thread-3883([])]: tez.DagUtils (DagUtils.java:createLocalResource(758)) - Resource modification time: 1474459142389 for hdfs://ip-172-30-2-148.us-west-2.compute.internal:8020/tmp/hive/hadoop/_tez_session_dir/85d36c12-c629-44a8-b23c-c628898a79b7/emr-kinesis-hive.jar 
2016-09-21T11:59:35,178 INFO [HiveServer2-Background-Pool: Thread-3883([])]: tez.DagUtils (DagUtils.java:createLocalResource(758)) - Resource modification time: 1474459142423 for hdfs://ip-172-30-2-148.us-west-2.compute.internal:8020/tmp/hive/hadoop/_tez_session_dir/85d36c12-c629-44a8-b23c-c628898a79b7/hive-contrib-2.1.0-amzn-0.jar 
2016-09-21T11:59:35,179 INFO [HiveServer2-Background-Pool: Thread-3883([])]: tez.DagUtils (DagUtils.java:createLocalResource(758)) - Resource modification time: 1474459142496 for hdfs://ip-172-30-2-148.us-west-2.compute.internal:8020/tmp/hive/hadoop/_tez_session_dir/85d36c12-c629-44a8-b23c-c628898a79b7/hive-plugins-0.0.1-emr-upgrade-20160919.070538-1.jar 
2016-09-21T11:59:35,179 INFO [HiveServer2-Background-Pool: Thread-3883([])]: exec.Task (TezTask.java:build(321)) - Dag name: INSERT INTO TABLE valid...error.code="warn")(Stage-1) 
2016-09-21T11:59:35,180 INFO [HiveServer2-Background-Pool: Thread-3883([])]: ql.Context (Context.java:getMRScratchDir(340)) - New scratch dir is hdfs://ip-172-30-2-148.us-west-2.compute.internal:8020/tmp/hive/hadoop/65cf7f02-a7d3-40ba-a93f-ff5214afbdfc/hive_2016-09-21_11-59-34_474_5003281239065359634-127 
2016-09-21T11:59:35,223 INFO [HiveServer2-Background-Pool: Thread-3881([])]: impl.YarnClientImpl (YarnClientImpl.java:submitApplication(273)) - Submitted application application_1474442135017_0147 
2016-09-21T11:59:35,224 INFO [HiveServer2-Background-Pool: Thread-3881([])]: client.TezClient (TezClient.java:start(477)) - The url to track the Tez Session: http://ip-172-30-2-148.us-west-2.compute.internal:20888/proxy/application_1474442135017_0147/ 
2016-09-21T11:59:35,391 INFO [HiveServer2-Background-Pool: Thread-3429([])]: SessionState (SessionState.java:printInfo(1054)) - Map 1: 0(+0,-4)/1 
2016-09-21T11:59:35,446 ERROR [HiveServer2-Background-Pool: Thread-3429([])]: SessionState (SessionState.java:printError(1063)) - Status: Failed 
2016-09-21T11:59:35,447 ERROR [HiveServer2-Background-Pool: Thread-3429([])]: SessionState (SessionState.java:printError(1063)) - Vertex failed, vertexName=Map 1, vertexId=vertex_1474442135017_0134_2_00, diagnostics=[Task failed, taskId=task_1474442135017_0134_2_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task (failure) : attempt_1474442135017_0134_2_00_000000_0:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:198) 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160) 
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370) 
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) 
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) 
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) 
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206) 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152) 
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) 
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) 
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:360) 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) 
    ... 14 more 
Caused by: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) 
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) 
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:299) 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203) 
    ... 19 more 
Caused by: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:818) 
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:1193) 
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:771) 
    at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:168) 
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109) 
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67) 
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:297) 
    ... 20 more 
], TaskAttempt 1 failed, info=[Error: Error while running task (failure) : attempt_1474442135017_0134_2_00_000000_1:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:198) 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160) 
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370) 
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) 
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) 
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) 
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206) 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152) 
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) 
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) 
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:360) 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) 
    ... 14 more 
Caused by: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) 
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) 
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:299) 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203) 
    ... 19 more 
Caused by: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:818) 
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:1193) 
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:771) 
    at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:168) 
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109) 
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67) 
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:297) 
    ... 20 more 
], TaskAttempt 2 failed, info=[Error: Error while running task (failure) : attempt_1474442135017_0134_2_00_000000_2:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:198) 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160) 
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370) 
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) 
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) 
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) 
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206) 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152) 
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) 
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) 
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:360) 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) 
    ... 14 more 
Caused by: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97) 
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57) 
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:299) 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203) 
    ... 19 more 
Caused by: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:818) 
    at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:1193) 
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:771) 
    at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:168) 
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109) 
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67) 
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:297) 
    ... 20 more 
], TaskAttempt 3 failed, info=[Error: Error while running task (failure) : attempt_1474442135017_0134_2_00_000000_3:java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:198) 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:160) 
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:370) 
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73) 
    at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:422) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) 
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61) 
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37) 
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) 
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.RuntimeException: java.io.IOException: java.io.FileNotFoundException: No such file or directory 's3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/logs/validations/table_col_aggregate_validation_result/.hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/_tmp.-ext-10000/000000_0.gz' 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:206) 
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:152) 
    at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) 
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:62) 
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:360) 
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172) 
    ... 14 more 

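Hive logs with debug mode enabled -

The exception is highlighted in green.

As far as I understand, just before the exception Hive renames the file to another name, and all of this happens in S3. Because S3 is eventually consistent, the query sometimes hits this exception and sometimes works.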
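One way to check which file the task attempts are actually missing is to scan the log for the FileNotFoundException lines and collect the distinct paths. The sketch below is only an illustration: the helper names `missing_paths` and `staging_component` are made up for this example, and the sample line is copied from the hive-server2.log excerpt above.

```python
import re

# One "Caused by" line copied from the hive-server2.log excerpt above.
LOG_LINE = (
    "Caused by: java.io.FileNotFoundException: No such file or directory "
    "'s3://data-platform-insights/data-platform/internal_test_automation/2016/09/21/18364/"
    "logs/validations/table_col_aggregate_validation_result/"
    ".hive-staging_hive_2016-09-21_11-57-58_703_5106478639780932144-1/"
    "_tmp.-ext-10000/000000_0.gz'\n"
)

MISSING = re.compile(r"FileNotFoundException: No such file or directory '([^']+)'")


def missing_paths(log_text):
    """Return the distinct paths reported as missing, in first-seen order."""
    seen = []
    for path in MISSING.findall(log_text):
        if path not in seen:
            seen.append(path)
    return seen


def staging_component(path):
    """Return the '.hive-staging...' path component, or None if there is none."""
    for part in path.split("/"):
        if part.startswith(".hive-staging"):
            return part
    return None


# The same line repeats for every task attempt, so it is repeated here as well.
paths = missing_paths(LOG_LINE * 3)
print(len(paths), staging_component(paths[0]))
```

In the excerpt above, every failed attempt names the same file, and that file sits under a `.hive-staging_...` directory inside the table's S3 location.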

https://docs.google.com/document/d/1cwXVqQ3p-xPFcBqU9AuD7C8z8rHjhUIHwPjY-nVpFK0/edit?usp=sharing

Additionally, the following Hive configuration properties are set before executing the query -

set hive.mapjoin.smalltable.filesize = 2000000000 
set mapreduce.map.speculative = false 
set mapreduce.output.fileoutputformat.compress = true 
set hive.exec.compress.output = true 
set mapreduce.task.timeout = 6000000 
set hive.optimize.bucketmapjoin.sortedmerge = true 
set io.compression.codecs = org.apache.hadoop.io.compress.GzipCodec 
set hive.auto.convert.sortmerge.join.noconditionaltask = false 
set hive.optimize.bucketmapjoin = true 
set hive.exec.compress.intermediate = true 
set hive.enforce.bucketmapjoin = true 
set mapred.output.compress = true 
set mapreduce.map.output.compress = true 
set hive.auto.convert.sortmerge.join = false 
set hive.auto.convert.join = false 
set mapreduce.reduce.speculative = false 
set mapred.output.compression.codec = org.apache.hadoop.io.compress.GzipCodec 
set hive.cache.expr.evaluation=false 
set mapred.output.compress=true 
set hive.exec.compress.output=true 
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec 
set io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec 
set hive.exec.compress.intermediate=true 
set mapreduce.map.output.compress=true 
set hive.auto.convert.join=false 
set mapreduce.map.speculative=false 
set mapreduce.reduce.speculative=false 

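Cluster information -

  1. One data node with 32 GB of disk space.
  2. Hive - 2.1.0, execution engine - Tez 0.8.3
  3. Hadoop - 2.7.2

Questions -

  1. Why is it throwing LeaseExpiredException?
  2. Are the Hive query failures related to the LeaseExpiredException?
  3. Is it because of wrong configuration properties?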

Update-1

As per this answer - LeaseExpiredException: No lease error on HDFS (Failed to close file)

I added

SET hive.exec.max.dynamic.partitions=100000; 
SET hive.exec.max.dynamic.partitions.pernode=100000; 

But it still shows the same exception.

Answer

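I resolved the issue. Let me explain in detail.

The exceptions that were occurring -

  1. LeaseExpiredException - from the HDFS side.
  2. FileNotFoundException - from the Hive side (when the Tez execution engine executes the DAG).

Problem scenario -

  1. We had just upgraded Hive from version 0.13.0 to 2.1.0. Everything worked fine with the previous version, with zero runtime exceptions.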

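Different thoughts on how to solve the issue -

  1. The first thought was that two threads were working on the same piece of data because of NameNode (NN) intelligence. But with the settings below

    set mapreduce.map.speculative=false
    set mapreduce.reduce.speculative=false

    that is not possible.

  2. Then I increased the count from 1000 to 100000 for the settings below -

    SET hive.exec.max.dynamic.partitions=100000;
    SET hive.exec.max.dynamic.partitions.pernode=100000;

    That did not work either.

  3. The third thought was that, within the same process, what mapper-1 had created was being deleted by another mapper/reducer. But we did not find any such entries in the HiveServer2 or Tez logs.

  4. Finally, the root cause turned out to be in the application-layer code itself. In hive-exec version 2.1.0, a new configuration property was introduced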

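    "hive.exec.stagingdir": ".hive-staging"

Description of the above property -

Directory name that will be created inside table locations in order to support HDFS encryption. This replaces ${hive.exec.scratchdir} for query results, with the exception of read-only tables. In all cases ${hive.exec.scratchdir} is still used for other temporary files, such as job plans.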
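To make the description concrete, here is a rough model of where the staging directory ends up relative to the table. It is a sketch under assumptions, not Hive's actual code: it assumes a relative `hive.exec.stagingdir` value is resolved against the table location while an absolute value is used as given, and the bucket and table path are hypothetical.

```python
import posixpath


def staging_parent(table_location, stagingdir=".hive-staging"):
    """Toy model: resolve the staging prefix against the table location.

    posixpath.join returns an absolute second argument unchanged, which
    matches the assumption that an absolute stagingdir is not nested
    under the table.
    """
    return posixpath.join(table_location, stagingdir)


def is_inside_table(table_location, stagingdir):
    """True if the resolved staging prefix lies under the table location."""
    prefix = staging_parent(table_location, stagingdir)
    return prefix.startswith(table_location.rstrip("/") + "/")


TABLE = "s3://example-bucket/warehouse/validation_logs"  # hypothetical location

print(staging_parent(TABLE, ".hive-staging"))
print(is_inside_table(TABLE, ".hive-staging"))            # default value
print(is_inside_table(TABLE, "/tmp/hive/.hive-staging"))  # hypothetical absolute value
```

With the default value, staging data lives inside the table directory, which is why operations on the table itself can touch it.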

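So, if the application-layer code (ETL) has any concurrent jobs performing operations (rename/delete/move) on the same table, it may lead to this problem.

In our case, 2 concurrent jobs were running "INSERT OVERWRITE" on the same table, which led to the metadata file of 1 mapper being deleted, and that caused this issue.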
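The race described here can be imitated on a local filesystem. This is a toy model only: the functions `start_insert_overwrite` and `overwrite_cleanup` are invented for illustration and do not reproduce Hive's real move and cleanup logic. It shows two jobs staging output inside the same table directory, with one job's overwrite step clearing the other job's staging files.

```python
import os
import shutil
import tempfile


def start_insert_overwrite(table_dir, job_id):
    """A job writes its output into a staging directory inside the table location."""
    staging = os.path.join(table_dir, ".hive-staging_" + job_id)
    os.makedirs(staging)
    out = os.path.join(staging, "000000_0.gz")
    with open(out, "wb") as f:
        f.write(b"rows from " + job_id.encode())
    return out


def overwrite_cleanup(table_dir, keep_job_id):
    """Toy 'overwrite' step: clear the table directory except the job's own staging dir."""
    keep = ".hive-staging_" + keep_job_id
    for name in os.listdir(table_dir):
        if name == keep:
            continue
        path = os.path.join(table_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)
        else:
            os.remove(path)


def run_race():
    table_dir = tempfile.mkdtemp(prefix="table_")
    try:
        file_a = start_insert_overwrite(table_dir, "job_a")
        start_insert_overwrite(table_dir, "job_b")
        overwrite_cleanup(table_dir, "job_b")  # job B finishes first and clears the table
        try:
            with open(file_a, "rb") as f:      # job A's output is read afterwards
                f.read()
            return "ok"
        except FileNotFoundError:
            return "FileNotFoundError"
    finally:
        shutil.rmtree(table_dir, ignore_errors=True)


print(run_race())
```

The interleaving is fixed in this sketch, so job A's file is always gone by the time it is read; in a real cluster the outcome depends on timing, which would explain why the query only fails some of the time.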

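Resolution -

  1. Move the metadata file location outside the table (the table lives in S3).
  2. Disable HDFS encryption (as mentioned in the description of the stagingdir property).
  3. Change your application-layer code to avoid the concurrency issue.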
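For the third option, one possible shape of the application-layer change is to serialize jobs that write to the same table. The sketch below assumes all jobs are launched from a single Python process. The names `run_exclusive` and `fake_insert_overwrite` are hypothetical, and the sleep stands in for submitting the real query.

```python
import threading
import time
from collections import defaultdict

_table_locks = defaultdict(threading.Lock)  # one lock per table name
_registry_guard = threading.Lock()          # protects the dict itself


def table_lock(table_name):
    """Return the lock for a table, creating it on first use."""
    with _registry_guard:
        return _table_locks[table_name]


def run_exclusive(table_name, job):
    """Run job() while holding the table's lock, so writers never overlap."""
    with table_lock(table_name):
        return job()


# Demo: count how many fake jobs are ever active at the same time.
active = 0
max_active = 0
counter_guard = threading.Lock()


def fake_insert_overwrite():
    global active, max_active
    with counter_guard:
        active += 1
        max_active = max(max_active, active)
    time.sleep(0.01)  # stands in for the real INSERT OVERWRITE
    with counter_guard:
        active -= 1


threads = [
    threading.Thread(target=run_exclusive, args=("validation_logs", fake_insert_overwrite))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(max_active)
```

An in-process lock like this only helps when one scheduler process submits all the jobs. If the jobs are started from different hosts, some shared coordination mechanism would be needed instead.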

Related question - Why hive_staging file is missing in AWS EMR