2017-04-06

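Simple count query exceeds Impala memory limit

Edit: The table contained some corrupted AVRO files. After I removed some of them, everything worked fine. I had used avro-tools to decompress these files to JSON, and the decompressed files were not very large either. So this appears to be a bug in how Impala handles corrupted AVRO files.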


I have an Impala table stored as gzipped AVRO and partitioned by "day". When I run this query:

select count(0) from adhoc_data_fast.log where day='2017-04-05';

it reports:

Query: select count(0) from adhoc_data_fast.log where day='2017-04-05' 
Query submitted at: 2017-04-06 13:35:56 (Coordinator: http://szq7.appadhoc.com:25000) 
Query progress can be monitored at: http://szq7.appadhoc.com:25000/query_plan?query_id=ef4698db870efd4d:739c89ef00000000 
WARNINGS: 
Memory limit exceeded 
GzipDecompressor failed to allocate 109051904000 bytes. 

Each node is configured with 96 GB of memory, and the memory limit of the single pool is set to 300 GB.

No file is larger than 250 MB after compression:

62M  log.2017-04-05.1491321605834.avro 
79M  log.2017-04-05.1491323647211.avro 
62M  log.2017-04-05.1491327241311.avro 
60M  log.2017-04-05.1491330839609.avro 
52M  log.2017-04-05.1491334439092.avro 
59M  log.2017-04-05.1491338038503.avro 
93M  log.2017-04-05.1491341639694.avro 
130M log.2017-04-05.1491345239969.avro 
147M log.2017-04-05.1491348843931.avro 
183M log.2017-04-05.1491352442955.avro 
218M log.2017-04-05.1491359648079.avro 
181M log.2017-04-05.1491363247597.avro 
212M log.2017-04-05.1491366845827.avro 
207M log.2017-04-05.1491370445873.avro 
197M log.2017-04-05.1491374045830.avro 
164M log.2017-04-05.1491377650935.avro 
155M log.2017-04-05.1491381249597.avro 
203M log.2017-04-05.1491384846366.avro 
185M log.2017-04-05.1491388450262.avro 
198M log.2017-04-05.1491392047694.avro 
206M log.2017-04-05.1491395648818.avro 
214M log.2017-04-05.1491399246407.avro 
167M log.2017-04-05.1491402846469.avro 
77M  log.2017-04-05.1491406180615.avro 
3.2M log.2017-04-05.1491409790105.avro 
1.3M log.2017-04-05.1491413385884.avro 
928K log.2017-04-05.1491416981829.avro 
832K log.2017-04-05.1491420581588.avro 
1.1M log.2017-04-05.1491424180191.avro 
2.6M log.2017-04-05.1491427781339.avro 
3.8M log.2017-04-05.1491431382552.avro 
3.3M log.2017-04-05.1491434984679.avro 
5.2M log.2017-04-05.1491438586674.avro 
5.1M log.2017-04-05.1491442192541.avro 
2.3M log.2017-04-05.1491445789230.avro 
884K log.2017-04-05.1491449386630.avro 

I fetched the files from HDFS and used avro-tools to convert them to JSON, which decompresses them. None of the decompressed files exceeds 1 GB:

16M  log.2017-04-05.1491321605834.avro.json 
308M log.2017-04-05.1491323647211.avro.json 
103M log.2017-04-05.1491327241311.avro.json 
150M log.2017-04-05.1491330839609.avro.json 
397M log.2017-04-05.1491334439092.avro.json 
297M log.2017-04-05.1491338038503.avro.json 
160M log.2017-04-05.1491341639694.avro.json 
95M  log.2017-04-05.1491345239969.avro.json 
360M log.2017-04-05.1491348843931.avro.json 
338M log.2017-04-05.1491352442955.avro.json 
71M  log.2017-04-05.1491359648079.avro.json 
161M log.2017-04-05.1491363247597.avro.json 
628M log.2017-04-05.1491366845827.avro.json 
288M log.2017-04-05.1491370445873.avro.json 
162M log.2017-04-05.1491374045830.avro.json 
90M  log.2017-04-05.1491377650935.avro.json 
269M log.2017-04-05.1491381249597.avro.json 
620M log.2017-04-05.1491384846366.avro.json 
70M  log.2017-04-05.1491388450262.avro.json 
30M  log.2017-04-05.1491392047694.avro.json 
114M log.2017-04-05.1491395648818.avro.json 
370M log.2017-04-05.1491399246407.avro.json 
359M log.2017-04-05.1491402846469.avro.json 
218M log.2017-04-05.1491406180615.avro.json 
29M  log.2017-04-05.1491409790105.avro.json 
3.9M log.2017-04-05.1491413385884.avro.json 
9.3M log.2017-04-05.1491416981829.avro.json 
8.3M log.2017-04-05.1491420581588.avro.json 
2.3M log.2017-04-05.1491424180191.avro.json 
25M  log.2017-04-05.1491427781339.avro.json 
24M  log.2017-04-05.1491431382552.avro.json 
5.7M log.2017-04-05.1491434984679.avro.json 
35M  log.2017-04-05.1491438586674.avro.json 
5.8M log.2017-04-05.1491442192541.avro.json 
23M  log.2017-04-05.1491445789230.avro.json 
4.3M log.2017-04-05.1491449386630.avro.json 
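Since the files themselves are small, one way to look for a damaged file is to walk the Avro container framing directly and flag any block whose declared length is implausible. The sketch below is only an illustration under stated assumptions: the names `check_avro` and `max_block_bytes` and the 1 GiB threshold are hypothetical, it checks the container structure described in the Avro specification (magic bytes, metadata map, sync markers, block lengths) rather than decoding records, and the post does not say how these particular files were corrupted, so an oversized block length is just one possibility.

```python
import io
import zlib

MAGIC = b"Obj\x01"   # Avro object container magic bytes
SYNC_SIZE = 16       # length of the sync marker that ends every block


def read_long(stream):
    """Decode one Avro long (zigzag varint); raise EOFError if the file ends."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of file inside a varint")
        accum |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def read_exact(stream, n):
    """Read exactly n bytes or raise EOFError."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError(f"wanted {n} bytes, got {len(data)}")
    return data


def read_header(stream):
    """Return (metadata dict, sync marker) from the container header."""
    if read_exact(stream, 4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:
            break
        if count < 0:            # a negative count is followed by a byte size
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_exact(stream, read_long(stream)).decode("utf-8")
            meta[key] = read_exact(stream, read_long(stream))
    return meta, read_exact(stream, SYNC_SIZE)


def check_avro(stream, max_block_bytes=1 << 30):
    """Walk every block; return a list of problems (empty means none found)."""
    problems = []
    try:
        meta, sync = read_header(stream)
    except (EOFError, ValueError) as exc:
        return [f"bad header: {exc}"]
    codec = meta.get("avro.codec", b"null")
    block = 0
    while True:
        if not stream.read(1):
            break                          # clean end of file
        stream.seek(-1, io.SEEK_CUR)       # un-read the peeked byte
        try:
            rows = read_long(stream)
            size = read_long(stream)
            if rows < 0 or not 0 <= size <= max_block_bytes:
                problems.append(
                    f"block {block}: implausible rows={rows} size={size}")
                break
            data = read_exact(stream, size)
            if read_exact(stream, SYNC_SIZE) != sync:
                problems.append(f"block {block}: sync marker mismatch")
                break
            if codec == b"deflate":
                zlib.decompress(data, -15)  # Avro deflate blocks are raw deflate
        except (EOFError, zlib.error) as exc:
            problems.append(f"block {block}: {exc}")
            break
        block += 1
    return problems
```

It could be run against a local copy of each file, for example `check_avro(open("log.2017-04-05.1491321605834.avro", "rb"))`. An empty list only means the framing looks consistent; it does not prove that every record decodes.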

Here is the Impala profile:

[szq7.appadhoc.com:21000] > profile; 
Query Runtime Profile: 
Query (id=ef4698db870efd4d:739c89ef00000000): 
    Summary: 
    Session ID: f54bb090170bcdb6:621ac5796ef2668c 
    Session Type: BEESWAX 
    Start Time: 2017-04-06 13:35:56.454441000 
    End Time: 2017-04-06 13:35:57.326967000 
    Query Type: QUERY 
    Query State: EXCEPTION 
    Query Status: 
Memory limit exceeded 
GzipDecompressor failed to allocate 109051904000 bytes. 

    Impala Version: impalad version 2.7.0-cdh5.9.1 RELEASE (build 24ad6df788d66e4af9496edb26ac4d1f1d2a1f2c) 
    User: ubuntu 
    Connected User: ubuntu 
    Delegated User: 
    Network Address: ::ffff:192.168.1.7:29026 
    Default Db: default 
    Sql Statement: select count(0) from adhoc_data_fast.log where day='2017-04-05' 
    Coordinator: szq7.appadhoc.com:22000 
    Query Options (non default): 
    Plan: 
---------------- 
Estimated Per-Host Requirements: Memory=410.00MB VCores=1 
WARNING: The following tables are missing relevant table and/or column statistics. 
adhoc_data_fast.log 

03:AGGREGATE [FINALIZE] 
| output: count:merge(0) 
| hosts=13 per-host-mem=unavailable 
| tuple-ids=1 row-size=8B cardinality=1 
| 
02:EXCHANGE [UNPARTITIONED] 
| hosts=13 per-host-mem=unavailable 
| tuple-ids=1 row-size=8B cardinality=1 
| 
01:AGGREGATE 
| output: count(0) 
| hosts=13 per-host-mem=10.00MB 
| tuple-ids=1 row-size=8B cardinality=1 
| 
00:SCAN HDFS [adhoc_data_fast.log, RANDOM] 
    partitions=1/7594 files=38 size=3.45GB 
    table stats: unavailable 
    column stats: all 
    hosts=13 per-host-mem=400.00MB 
    tuple-ids=0 row-size=0B cardinality=unavailable 
---------------- 
    Estimated Per-Host Mem: 429916160 
    Estimated Per-Host VCores: 1 
    Tables Missing Stats: adhoc_data_fast.log 
    Request Pool: default-pool 
    Admission result: Admitted immediately 
    ExecSummary: 
Operator  #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail 
------------------------------------------------------------------------------------------------------------- 
03:AGGREGATE  1 52.298ms 52.298ms  0   1 4.00 KB  -1.00 B FINALIZE 
02:EXCHANGE   1 676.993ms 676.993ms  0   1   0  -1.00 B UNPARTITIONED 
01:AGGREGATE  0 0.000ns 0.000ns  0   1   0  10.00 MB 
00:SCAN HDFS  0 0.000ns 0.000ns  0   -1   0  400.00 MB adhoc_data_fast.log 
    Planner Timeline: 69.589ms 
     - Analysis finished: 6.642ms (6.642ms) 
     - Equivalence classes computed: 6.980ms (337.753us) 
     - Single node plan created: 13.302ms (6.322ms) 
     - Runtime filters computed: 13.368ms (65.984us) 
     - Distributed plan created: 15.131ms (1.763ms) 
     - Lineage info computed: 16.488ms (1.356ms) 
     - Planning finished: 69.589ms (53.101ms) 
    Query Timeline: 874.026ms 
     - Start execution: 63.320us (63.320us) 
     - Planning finished: 72.764ms (72.701ms) 
     - Submit for admission: 73.592ms (827.496us) 
     - Completed admission: 73.775ms (183.088us) 
     - Ready to start 13 remote fragments: 126.950ms (53.175ms) 
     - All 13 remote fragments started: 161.919ms (34.968ms) 
     - Rows available: 856.761ms (694.842ms) 
     - Unregister query: 872.527ms (15.765ms) 
    - ComputeScanRangeAssignmentTimer: 356.136us 
    ImpalaServer: 
    - ClientFetchWaitTimer: 0.000ns 
    - RowMaterializationTimer: 0.000ns 
    Execution Profile ef4698db870efd4d:739c89ef00000000:(Total: 782.712ms, non-child: 0.000ns, % non-child: 0.00%) 
    Number of filters: 0 
    Filter routing table: 
ID Src. Node Tgt. Node(s) Targets Target type Partition filter Pending (Expected) First arrived Completed Enabled 
---------------------------------------------------------------------------------------------------------------------------- 

    Fragment start latencies: Count: 13, 25th %-ile: 1ms, 50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 2ms, 95th %-ile: 2ms, 99.9th %-ile: 35ms 
    Per Node Peak Memory Usage: szq15.appadhoc.com:22000(0) szq1.appadhoc.com:22000(0) szq13.appadhoc.com:22000(0) szq12.appadhoc.com:22000(0) szq11.appadhoc.com:22000(0) szq20.appadhoc.com:22000(0) szq14.appadhoc.com:22000(0) szq8.appadhoc.com:22000(0) szq5.appadhoc.com:22000(0) szq9.appadhoc.com:22000(0) szq4.appadhoc.com:22000(0) szq6.appadhoc.com:22000(0) szq7.appadhoc.com:22000(0) 
    - FiltersReceived: 0 (0) 
    - FinalizationTimer: 0.000ns 
    Coordinator Fragment F01:(Total: 729.811ms, non-child: 0.000ns, % non-child: 0.00%) 
     MemoryUsage(500.000ms): 12.00 KB 
     - AverageThreadTokens: 0.00 
     - BloomFilterBytes: 0 
     - PeakMemoryUsage: 12.00 KB (12288) 
     - PerHostPeakMemUsage: 0 
     - PrepareTime: 52.291ms 
     - RowsProduced: 0 (0) 
     - TotalCpuTime: 0.000ns 
     - TotalNetworkReceiveTime: 676.991ms 
     - TotalNetworkSendTime: 0.000ns 
     - TotalStorageWaitTime: 0.000ns 
     BlockMgr: 
     - BlockWritesOutstanding: 0 (0) 
     - BlocksCreated: 0 (0) 
     - BlocksRecycled: 0 (0) 
     - BufferedPins: 0 (0) 
     - BytesWritten: 0 
     - MaxBlockSize: 8.00 MB (8388608) 
     - MemoryLimit: 102.40 GB (109951164416) 
     - PeakMemoryUsage: 0 
     - TotalBufferWaitTime: 0.000ns 
     - TotalEncryptionTime: 0.000ns 
     - TotalIntegrityCheckTime: 0.000ns 
     - TotalReadBlockTime: 0.000ns 
     CodeGen:(Total: 63.837ms, non-child: 63.837ms, % non-child: 100.00%) 
     - CodegenTime: 828.728us 
     - CompileTime: 2.957ms 
     - LoadTime: 0.000ns 
     - ModuleBitcodeSize: 1.89 MB (1984232) 
     - NumFunctions: 7 (7) 
     - NumInstructions: 96 (96) 
     - OptimizationTime: 8.070ms 
     - PrepareTime: 51.769ms 
     AGGREGATION_NODE (id=3):(Total: 729.291ms, non-child: 52.298ms, % non-child: 7.17%) 
     ExecOption: Codegen Enabled 
     - BuildTime: 0.000ns 
     - GetResultsTime: 0.000ns 
     - HTResizeTime: 0.000ns 
     - HashBuckets: 0 (0) 
     - LargestPartitionPercent: 0 (0) 
     - MaxPartitionLevel: 0 (0) 
     - NumRepartitions: 0 (0) 
     - PartitionsCreated: 0 (0) 
     - PeakMemoryUsage: 4.00 KB (4096) 
     - RowsRepartitioned: 0 (0) 
     - RowsReturned: 0 (0) 
     - RowsReturnedRate: 0 
     - SpilledPartitions: 0 (0) 
     EXCHANGE_NODE (id=2):(Total: 676.993ms, non-child: 676.993ms, % non-child: 100.00%) 
     BytesReceived(500.000ms): 0 
     - BytesReceived: 0 
     - ConvertRowBatchTime: 0.000ns 
     - DeserializeRowBatchTimer: 0.000ns 
     - FirstBatchArrivalWaitTime: 0.000ns 
     - PeakMemoryUsage: 0 
     - RowsReturned: 0 (0) 
     - RowsReturnedRate: 0 
     - SendersBlockedTimer: 0.000ns 
     - SendersBlockedTotalTimer(*): 0.000ns 
    Averaged Fragment F00: 
     split sizes: min: 114.60 MB, max: 451.79 MB, avg: 271.65 MB, stddev: 104.16 MB 
     completion times: min:694.632ms max:728.356ms mean: 725.379ms stddev:8.878ms 
     execution rates: min:157.45 MB/sec max:620.68 MB/sec mean:374.89 MB/sec stddev:144.30 MB/sec 
     num instances: 13 
    Fragment F00: 
     Instance ef4698db870efd4d:739c89ef00000001 (host=szq5.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef00000002 (host=szq8.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef00000003 (host=szq14.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef00000004 (host=szq20.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef00000005 (host=szq11.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef00000006 (host=szq12.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef00000007 (host=szq13.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef00000008 (host=szq1.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef00000009 (host=szq15.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef0000000a (host=szq6.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef0000000b (host=szq4.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef0000000c (host=szq9.appadhoc.com:22000): 
     Instance ef4698db870efd4d:739c89ef0000000d (host=szq7.appadhoc.com:22000): 

So why does Impala need so much memory?
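To put the figures from the error message and the profile side by side, here is a small arithmetic sketch. The byte counts are copied from the output above; treating the "628M" listing entry as 628 MiB is an approximation, and the variable names are only illustrative.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# "GzipDecompressor failed to allocate 109051904000 bytes."
requested = 109_051_904_000
# "MemoryLimit: 102.40 GB (109951164416)" from the BlockMgr section
block_mgr_limit = 109_951_164_416
# Largest decompressed file in the listing (628M), taken as roughly 628 MiB
largest_json = 628 * MIB


def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


print(f"requested allocation : {gib(requested):.2f} GiB")
print(f"BlockMgr memory limit: {gib(block_mgr_limit):.2f} GiB")
print(f"largest JSON file    : {gib(largest_json):.2f} GiB")
print(f"request vs largest   : {requested / largest_json:.0f}x")
```

The single request works out to roughly 101.6 GiB, which is more than the 96 GB of memory on a node and over a hundred times the size of the largest decompressed file. A request that large cannot come from the actual data volume of one file.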

Answer


This may be because Impala is missing statistics for that partition of the table. The explain plan highlights the following:

Estimated Per-Host Requirements: Memory=410.00MB VCores=1 
WARNING: The following tables are missing relevant table and/or column statistics. 
adhoc_data_fast.log 

Try running COMPUTE STATS on the table, or COMPUTE INCREMENTAL STATS on the partition.

For example:

COMPUTE INCREMENTAL STATS adhoc_data_fast.log PARTITION (day='2017-04-05'); 

This will help Impala with resource planning. I would be surprised if it fixes the problem, but it is worth trying first.


Thanks! But COMPUTE STATS also failed to allocate memory. It turned out to be a bug in how Impala handles corrupted AVRO files.