2010-05-12 108 views
1

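Eliminating a full table scan caused by BETWEEN (and GROUP BY)

According to the EXPLAIN command, a range condition is causing a query to perform a full table scan (160k rows). How can I keep the range condition and still reduce the scanning? I suspect the culprit is: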

Y.YEAR BETWEEN 1900 AND 2009 AND 

Code

Here is the code with the range condition (STATION_DISTRICT is probably superfluous).

SELECT                 
    COUNT(1) as MEASUREMENTS,            
    AVG(D.AMOUNT) as AMOUNT,            
    Y.YEAR as YEAR,              
    MAKEDATE(Y.YEAR,1) as AMOUNT_DATE          
FROM                  
    CITY C,                
    STATION S,                
    STATION_DISTRICT SD,             
    YEAR_REF Y FORCE INDEX(YEAR_IDX),          
    MONTH_REF M,               
    DAILY D                
WHERE                 
    -- For a specific city ...            
    --                  
    C.ID = 10663 AND              

    -- Find all the stations within a specific unit radius ... 
    --               
    6371.009 *             
    SQRT(              
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) + 
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL)/2) * 
    POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2))) <= 50 AND 

    -- Get the station district identification for the matching station. 
    --                 
    S.STATION_DISTRICT_ID = SD.ID AND         

    -- Gather all known years for that station ... 
    --            
    Y.STATION_DISTRICT_ID = SD.ID AND    

    -- The data before 1900 is shaky; insufficient after 2009. 
    --               
    Y.YEAR BETWEEN 1900 AND 2009 AND       

    -- Filtered by all known months ... 
    --         
    M.YEAR_REF_ID = Y.ID AND   

    -- Whittled down by category ... 
    -- 
    M.CATEGORY_ID = '003' AND 

    -- Into the valid daily climate data. 
    -- 
    M.ID = D.MONTH_REF_ID AND 
    D.DAILY_FLAG_ID <> 'M' 
GROUP BY 
    Y.YEAR 

Update

The SQL performs a full table scan, which results in MySQL doing a "copy to tmp table", as shown here:

 
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ 
| id | select_type | table | type | possible_keys      | key   | key_len | ref       | rows | Extra  | 
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ 
| 1 | SIMPLE  | C  | const | PRIMARY       | PRIMARY  | 4  | const       |  1 |    | 
| 1 | SIMPLE  | Y  | range | YEAR_IDX       | YEAR_IDX  | 4  | NULL       | 160422 | Using where | 
| 1 | SIMPLE  | SD | eq_ref | PRIMARY       | PRIMARY  | 4  | climate.Y.STATION_DISTRICT_ID |  1 | Using index | 
| 1 | SIMPLE  | S  | eq_ref | PRIMARY       | PRIMARY  | 4  | climate.SD.ID     |  1 | Using where | 
| 1 | SIMPLE  | M  | ref | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8  | climate.Y.ID     |  54 | Using where | 
| 1 | SIMPLE  | D  | ref | INDEX        | INDEX  | 8  | climate.M.ID     |  11 | Using where | 
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+ 

Answer

After using STRAIGHT_JOIN:

 
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+ 
| id | select_type | table | type | possible_keys      | key   | key_len | ref       | rows | Extra       | 
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+ 
| 1 | SIMPLE  | C  | const | PRIMARY       | PRIMARY  | 4  | const       | 1 | Using temporary; Using filesort | 
| 1 | SIMPLE  | S  | ALL | PRIMARY       | NULL   | NULL | NULL       | 7795 | Using where      | 
| 1 | SIMPLE  | SD | eq_ref | PRIMARY       | PRIMARY  | 4  | climate.S.STATION_DISTRICT_ID | 1 | Using index      | 
| 1 | SIMPLE  | Y  | ref | PRIMARY,STAT_YEAR_IDX    | STAT_YEAR_IDX | 4  | climate.S.STATION_DISTRICT_ID | 1650 | Using where      | 
| 1 | SIMPLE  | M  | ref | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8  | climate.Y.ID     | 54 | Using where      | 
| 1 | SIMPLE  | D  | ref | INDEX        | INDEX   | 8  | climate.M.ID     | 11 | Using where      | 
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+ 

Related

Thank you!

Answers

2

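ONE request... it looks like you know your data. Add the keyword "STRAIGHT_JOIN" and see the results...

SELECT STRAIGHT_JOIN ... rest of your query ...

STRAIGHT_JOIN tells MySQL to join the tables in the order I have listed them. Your CITY table is first in the FROM list, which signals that you expect it to be the primary table... In addition, your WHERE clause on CITY is the immediate filter. That said, it will probably fly through the rest of the query...

Hope it helps... it has worked for me with gov't data, querying millions of records joined to 10+ lookup tables, where MySQL was trying to think for me.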

+0

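I'm surprised MySQL optimized it incorrectly. It helped. The query now takes about half the time. I still have to tune the MySQL server, but at least I no longer need to worry about massive full table scans. – 2010-05-12 02:45:35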

0

To run BETWEEN queries efficiently, you will want a B-tree index on the YEAR column you are querying. For example:

CREATE INDEX id_index USING BTREE ON YEAR_REF (YEAR); 

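B-tree indexes support efficient range queries. If the missing index is the cause, having one should remove the root problem of the full table scan and let MySQL scan only the part of the table that falls within the range. Read more about B-trees on Wikipedia.

However, as with any optimization advice, you should measure to make sure you are not doing more harm than good.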
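To illustrate why an ordered index helps a BETWEEN predicate, here is a minimal Python sketch. It is not how MySQL implements B-trees: a sorted list searched with `bisect` stands in for the index, and the sample years are made up for illustration.

```python
from bisect import bisect_left, bisect_right


def range_scan(sorted_years, low, high):
    """Return the entries of sorted_years with low <= year <= high.

    Two binary searches locate the ends of the range, so only the
    matching entries are read, much like a range read on an ordered index.
    """
    start = bisect_left(sorted_years, low)
    end = bisect_right(sorted_years, high)
    return sorted_years[start:end]


def full_scan(years, low, high):
    """Test every entry, as a scan without a usable index would."""
    return [y for y in years if low <= y <= high]


# Hypothetical sample data, not taken from the YEAR_REF table.
years = sorted([1850, 1899, 1900, 1950, 1950, 2009, 2010, 2015])
in_range = range_scan(years, 1900, 2009)
```

Both functions return the same rows; the difference is how many entries they have to look at. Note that in the EXPLAIN output in the question, Y is already read with type `range` via YEAR_IDX and is still estimated at 160422 rows, so an index on YEAR alone only helps as far as the range itself is selective.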

+0

Thanks, Luke. All the tables have indexes. Some of them have indexes on their indexes, because I'm paranoid. ;-) (Just kidding!) – 2010-05-12 02:44:01

0

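Could you change the search within a radius into a search within a bounding box?

You know the city, so you can calculate the bounding box in your application.

Maybe this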

S.LATITUDE_DECIMAL >= latitude_lower and 
S.LATITUDE_DECIMAL <= latitude_upper and 
S.LONGITUDE_DECIMAL >= longitude_lower and 
S.LONGITUDE_DECIMAL <= longitude_upper 

might be a little faster?
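A rough sketch of how the application could compute those four bounds, in Python. The function name and the sample coordinates are hypothetical; the Earth radius is the 6371.009 km constant from the query. It assumes a spherical Earth and does not handle boxes that cross a pole or the 180° meridian.

```python
import math

EARTH_RADIUS_KM = 6371.009  # same mean radius the query uses


def bounding_box(lat_deg, lon_deg, radius_km):
    """Return (latitude_lower, latitude_upper, longitude_lower, longitude_upper).

    The box encloses the circle of radius_km around the given point.
    Assumes the circle does not reach a pole or cross the 180° meridian.
    """
    angular = radius_km / EARTH_RADIUS_KM  # radius as an angle in radians
    lat_delta = math.degrees(angular)
    # Meridians converge away from the equator, so the same distance
    # spans more degrees of longitude at higher latitudes.
    lon_delta = math.degrees(
        math.asin(math.sin(angular) / math.cos(math.radians(lat_deg)))
    )
    return (
        lat_deg - lat_delta,
        lat_deg + lat_delta,
        lon_deg - lon_delta,
        lon_deg + lon_delta,
    )


# Hypothetical city coordinates, for illustration only.
latitude_lower, latitude_upper, longitude_lower, longitude_upper = bounding_box(
    49.0, -123.0, 50.0
)
```

The four values would then be bound to the placeholders in the conditions above. The box is a superset of the circle, so if the exact 50 km radius matters, the original distance expression can stay in the WHERE clause as a second, finer filter.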

+0

I was thinking about doing that, but I like circles. :-) – 2010-05-12 16:16:39