2012-06-06 186 views

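MySQL optimization on joining tables with range criteria

I am joining two tables by matching a single position in one table against a range (represented by two columns) in the other table.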

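However, the query is too slow: it takes about 20 minutes. I have tried adding indexes to the tables and rewriting the query, but performance is still poor.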

So I am asking how to speed up the join.


Below is the MySQL query.

mysql> SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score 
    -> FROM `inVar` 
    -> LEFT JOIN `openChrom_K562` 
    -> ON (
    -> `inVar`.chrom=`openChrom_K562`.chrom AND 
    -> `inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd 
    ->); 

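inVar and openChrom_K562 are the tables I am using.

inVar stores a single position in each row.

openChrom_K562 stores range information given by chromStart and chromEnd.

inVar contains 57902 rows and openChrom_K562 has 137373 rows.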


Fields of the tables.

mysql> DESCRIBE inVar; 
+-------+-------------+------+-----+---------+-------+ 
| Field | Type  | Null | Key | Default | Extra | 
+-------+-------------+------+-----+---------+-------+ 
| chrom | varchar(31) | NO | PRI | NULL |  | 
| pos | int(10)  | NO | PRI | NULL |  | 
+-------+-------------+------+-----+---------+-------+ 

mysql> DESCRIBE openChrom_K562; 
+------------+-------------+------+-----+---------+-------+ 
| Field  | Type  | Null | Key | Default | Extra | 
+------------+-------------+------+-----+---------+-------+ 
| chrom  | varchar(31) | NO | MUL | NULL |  | 
| chromStart | int(10)  | NO | MUL | NULL |  | 
| chromEnd | int(10)  | NO |  | NULL |  | 
| score  | int(10)  | NO |  | NULL |  | 
+------------+-------------+------+-----+---------+-------+ 

Indexes built on the tables.

mysql> SHOW INDEX FROM inVar; 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| inVar |   0 | PRIMARY |   1 | chrom  | A   |  NULL |  NULL | NULL |  | BTREE  |   | 
| inVar |   0 | PRIMARY |   2 | pos   | A   |  57902 |  NULL | NULL |  | BTREE  |   | 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 

mysql> SHOW INDEX FROM openChrom_K562; 
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| Table   | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | 
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 
| openChrom_K562 |   1 | start_end |   1 | chromStart | A   |  137373 |  NULL | NULL |  | BTREE  |   | 
| openChrom_K562 |   1 | start_end |   2 | chromEnd | A   |  137373 |  NULL | NULL |  | BTREE  |   | 
| openChrom_K562 |   1 | chrom_only |   1 | chrom  | A   |   22 |  NULL | NULL |  | BTREE  |   | 
| openChrom_K562 |   1 | chrom_start |   1 | chrom  | A   |   22 |  NULL | NULL |  | BTREE  |   | 
| openChrom_K562 |   1 | chrom_start |   2 | chromStart | A   |  137373 |  NULL | NULL |  | BTREE  |   | 
| openChrom_K562 |   1 | chrom_end |   1 | chrom  | A   |   22 |  NULL | NULL |  | BTREE  |   | 
| openChrom_K562 |   1 | chrom_end |   2 | chromEnd | A   |  137373 |  NULL | NULL |  | BTREE  |   | 
+----------------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+ 

MySQL's execution plan.

mysql> EXPLAIN SELECT `inVar`.chrom, `inVar`.pos, score FROM `inVar` LEFT JOIN `openChrom_K562` ON (inVar.chrom=openChrom_K562.chrom AND `inVar`.pos BETWEEN chromStart AND chromEnd); 
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+ 
| id | select_type | table   | type | possible_keys        | key  | key_len | ref    | rows | Extra  | 
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+ 
| 1 | SIMPLE  | inVar   | index | NULL          | PRIMARY | 37  | NULL   | 57902 | Using index | 
| 1 | SIMPLE  | openChrom_K562 | ref | start_end,chrom_only,chrom_start,chrom_end | chrom_only | 33  | tmp.inVar.chrom | 5973 |    | 
+----+-------------+----------------+-------+--------------------------------------------+------------+---------+-----------------+-------+-------------+ 

It seems that MySQL only uses chrom to match the two tables, and then does a brute-force comparison over the matching rows.
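To put a rough number on that reading of the plan: the `rows` column of EXPLAIN is only an optimizer estimate, but multiplying the two figures gives a feel for how many range comparisons the nested loop may perform. The variable names below are illustrative, and the result is an estimate, not a measurement.

```python
# Row estimates copied from the EXPLAIN output above (optimizer estimates).
outer_rows = 57902            # inVar rows read through the PRIMARY index
inner_rows_per_probe = 5973   # openChrom_K562 rows per chrom_only lookup

# Each outer row is compared against every inner row with the same chrom.
estimated_comparisons = outer_rows * inner_rows_per_probe
print(f"{estimated_comparisons:,}")  # 345,848,646
```

That is roughly 346 million BETWEEN checks, which fits the impression that the range condition is evaluated by brute force within each chrom.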

Is there any way to optimize this further, such as by indexing the position?

(This is my first posted question; sorry for the poor quality of the post.)

+0

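Does the query plan change at all if you put the BETWEEN part in a WHERE clause instead of making it part of the ON clause? (I don't think this will improve things, but it seems worth checking.) – Ilion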

+0

@Ilion, I tried it, and it made no difference... –

Answers


chrom_only is most likely a bad index choice for your join, since chrom has only 22 distinct values.

If I have interpreted this correctly, the query should be faster if it uses start_end:

SELECT `inVar`.chrom, `inVar`.pos, `openChrom_K562`.score 
FROM `inVar` 
LEFT JOIN `openChrom_K562` 
USE INDEX (`start_end`) 
ON (
`inVar`.chrom=`openChrom_K562`.chrom AND 
`inVar`.pos BETWEEN `openChrom_K562`.chromStart AND `openChrom_K562`.chromEnd 
) 
+0

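Thank you for the reply. Since joining tables on a range performs really poorly... I have switched to another approach by building an interval tree :P –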
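The comment does not show how the interval tree was built, so the following is only a minimal sketch of the idea in Python, not the commenter's actual code. It assumes the ranges fit in memory and are grouped per chrom; `IntervalIndex` and `left_join` are hypothetical names, and the range ends are inclusive to match BETWEEN.

```python
from collections import defaultdict


class IntervalIndex:
    """Static augmented interval tree over (start, end, score) tuples.

    The tree is stored implicitly in a list sorted by start: the root of
    items[lo:hi] is the middle element, and max_end[mid] holds the largest
    end found anywhere in that subtree.
    """

    def __init__(self, intervals):
        self.items = sorted(intervals)
        self.max_end = [0] * len(self.items)
        self._build(0, len(self.items))

    def _build(self, lo, hi):
        if lo >= hi:
            return float("-inf")
        mid = (lo + hi) // 2
        best = max(self.items[mid][1],
                   self._build(lo, mid),
                   self._build(mid + 1, hi))
        self.max_end[mid] = best
        return best

    def stab(self, pos):
        """Return the scores of all ranges with start <= pos <= end."""
        out = []
        self._stab(0, len(self.items), pos, out)
        return out

    def _stab(self, lo, hi, pos, out):
        if lo >= hi:
            return
        mid = (lo + hi) // 2
        if self.max_end[mid] < pos:
            return  # no range in this subtree reaches pos
        self._stab(lo, mid, pos, out)
        start, end, score = self.items[mid]
        if start <= pos:
            if pos <= end:
                out.append(score)
            # Ranges to the right start later, so they are only worth
            # visiting when this node's start does not already exceed pos.
            self._stab(mid + 1, hi, pos, out)


def left_join(positions, ranges):
    """Mimic the LEFT JOIN: one row per match, score None when unmatched."""
    by_chrom = defaultdict(list)
    for chrom, start, end, score in ranges:
        by_chrom[chrom].append((start, end, score))
    trees = {chrom: IntervalIndex(iv) for chrom, iv in by_chrom.items()}

    rows = []
    for chrom, pos in positions:
        scores = trees[chrom].stab(pos) if chrom in trees else []
        if scores:
            rows.extend((chrom, pos, s) for s in scores)
        else:
            rows.append((chrom, pos, None))
    return rows
```

Each lookup prunes every subtree whose largest end lies before the position, so a query touches far fewer ranges than a scan of all rows for that chrom when few ranges overlap the position.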

+0

@EricHo, using an interval tree seems fine, but is there a way to use them in (with) SQL? I have a more or less [similar problem here](http://stackoverflow.com/q/27433474/559784). – Arun
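One possible SQL-only direction, sketched here under a strong assumption: if the ranges within a chrom never overlap, the only candidate for a position is the range with the greatest chromStart not exceeding it, which an index on (chrom, chromStart) can locate directly. This is not an interval tree, just a different technique for the non-overlapping case. The sketch uses Python's built-in `sqlite3` purely to be runnable; the sample rows are made up, and whether MySQL's optimizer handles the same query well has not been verified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inVar (
        chrom TEXT NOT NULL, pos INTEGER NOT NULL,
        PRIMARY KEY (chrom, pos));
    CREATE TABLE openChrom_K562 (
        chrom TEXT NOT NULL, chromStart INTEGER NOT NULL,
        chromEnd INTEGER NOT NULL, score INTEGER NOT NULL);
    CREATE INDEX chrom_start ON openChrom_K562 (chrom, chromStart);
""")

# Made-up sample data; ranges within a chrom do not overlap (assumption).
conn.executemany("INSERT INTO inVar VALUES (?, ?)",
                 [("chr1", 50), ("chr1", 160), ("chr1", 250), ("chr2", 20)])
conn.executemany("INSERT INTO openChrom_K562 VALUES (?, ?, ?, ?)",
                 [("chr1", 100, 200, 5), ("chr1", 300, 400, 7),
                  ("chr2", 10, 20, 9)])

# For each position, pick the nearest range start at or before it,
# then keep the row only if the position is also within chromEnd.
query = """
    SELECT v.chrom, v.pos, r.score
    FROM inVar AS v
    LEFT JOIN openChrom_K562 AS r
      ON r.chrom = v.chrom
     AND r.chromStart = (SELECT MAX(r2.chromStart)
                         FROM openChrom_K562 AS r2
                         WHERE r2.chrom = v.chrom
                           AND r2.chromStart <= v.pos)
     AND v.pos <= r.chromEnd
    ORDER BY v.chrom, v.pos
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With overlapping ranges this query would miss matches, because it only ever inspects a single candidate range per position.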