2011-05-24 94 views

I'm currently trying to optimize a MySQL query that runs rather slowly on tables with 10,000+ rows. Can it be optimized?

CREATE TABLE IF NOT EXISTS `person` (
    `_id` int(11) unsigned NOT NULL AUTO_INCREMENT, 
    `_oid` char(8) NOT NULL, 
    `firstname` varchar(255) NOT NULL, 
    `lastname` varchar(255) NOT NULL, 
    PRIMARY KEY (`_id`), 
    KEY `_oid` (`_oid`) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8; 

CREATE TABLE IF NOT EXISTS `person_cars` (
    `_id` int(11) NOT NULL AUTO_INCREMENT, 
    `_oid` char(8) NOT NULL, 
    `idx` varchar(255) NOT NULL, 
    `val` blob NOT NULL, 
    PRIMARY KEY (`_id`), 
    KEY `_oid` (`_oid`), 
    KEY `idx` (`idx`), 
    KEY `val` (`val`(64)) 
) ENGINE=MyISAM DEFAULT CHARSET=utf8; 

# Insert some 10000+ rows… 

INSERT INTO `person` (`_oid`,`firstname`,`lastname`) 
VALUES 
    ('1', 'John', 'Doe'), 
    ('2', 'Jack', 'Black'), 
    ('3', 'Jim', 'Kirk'), 
    ('4', 'Forrest', 'Gump'); 

INSERT INTO `person_cars` (`_oid`,`idx`,`val`) 
VALUES 
    ('1', '0', 'BMW'), 
    ('1', '1', 'PORSCHE'), 
    ('2', '0', 'BMW'), 
    ('3', '1', 'MERCEDES'), 
    ('3', '0', 'TOYOTA'), 
    ('3', '1', 'NISSAN'), 
    ('4', '0', 'OLDMOBILE'); 


SELECT `_person`.`_oid`, 
     `_person`.`firstname`, 
     `_person`.`lastname`, 
     `_person_cars`.`cars[0]`, 
     `_person_cars`.`cars[1]` 

FROM `person` `_person` 

LEFT JOIN (

    SELECT `_person`.`_oid`, 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`, 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]` 
    FROM `person` `_person` 
    JOIN `person_cars` `_person_cars` ON `_person`.`_oid` = `_person_cars`.`_oid` 
    GROUP BY `_person`.`_oid` 

) `_person_cars` ON `_person_cars`.`_oid` = `_person`.`_oid` 

WHERE `cars[0]` = 'BMW' OR `cars[1]` = 'BMW'; 

The SELECT query above takes ~170 ms on a virtual machine running MySQL 5.1.53, with approx. 10,000 rows in each of the two tables.

When I EXPLAIN the query above, the result differs depending on how many rows are in each table:

+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
| id | select_type | table        | type  | possible_keys | key  | key_len | ref  | rows | Extra                                       |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+
|  1 | PRIMARY     | <derived2>   | ALL   | NULL          | NULL | NULL    | NULL |    4 | Using where                                 |
|  1 | PRIMARY     | _person      | ALL   | _oid          | NULL | NULL    | NULL |    4 | Using where; Using join buffer              |
|  2 | DERIVED     | _person_cars | ALL   | _oid          | NULL | NULL    | NULL |    7 | Using temporary; Using filesort             |
|  2 | DERIVED     | _person      | index | _oid          | _oid | 24      | NULL |    4 | Using where; Using index; Using join buffer |
+----+-------------+--------------+-------+---------------+------+---------+------+------+---------------------------------------------+

With some 10,000 rows it gives this result:

+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref                    | rows | Extra                           |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+
|  1 | PRIMARY     | <derived2>   | ALL  | NULL          | NULL | NULL    | NULL                   | 6613 | Using where                     |
|  1 | PRIMARY     | _person      | ref  | _oid          | _oid | 24      | _person_cars._oid      |   10 |                                 |
|  2 | DERIVED     | _person_cars | ALL  | _oid          | NULL | NULL    | NULL                   | 9913 | Using temporary; Using filesort |
|  2 | DERIVED     | _person      | ref  | _oid          | _oid | 24      | test._person_cars._oid |   10 | Using index                     |
+----+-------------+--------------+------+---------------+------+---------+------------------------+------+---------------------------------+

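Things get worse when I omit the WHERE clause or when I join another table similar to person_cars.

Does anyone have an idea how to optimize the SELECT query to make things a bit faster?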


Two quick questions: why are you using the BLOB data type to store the car make? Also, have you considered using InnoDB instead of MyISAM as the storage engine? – GordyD 2011-05-24 10:16:02


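I use a blob because I have to store data of arbitrary length in that column. Cars are just an example. And no, I haven't tried InnoDB yet, because the project doesn't use InnoDB at all. I'll give it a shot, thanks :) – xlttj 2011-05-24 11:10:43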


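Also keep in mind that having BLOB or TEXT fields forces the temporary tables that contain them (created during joins and sorts) to be actual on-disk tables. – Marki555 2012-01-09 20:18:25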
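
One way to check whether this affects a given query is to compare the session's temporary-table counters before and after running it. A minimal sketch: `Created_tmp_tables` and `Created_tmp_disk_tables` are standard MySQL status variables, and the grouped SELECT simply stands in for whatever query is being investigated.

```sql
-- Counters before the query under test.
SHOW SESSION STATUS LIKE 'Created_tmp%tables';

-- Query under test (stand-in: the grouping over the BLOB column from the question).
SELECT `_oid`,
       GROUP_CONCAT(IF(`idx` = 0, `val`, NULL)) AS `cars[0]`
FROM `person_cars`
GROUP BY `_oid`;

-- Counters after the query. A higher Created_tmp_disk_tables value
-- suggests that a temporary table was written to disk.
SHOW SESSION STATUS LIKE 'Created_tmp%tables';
```

SHOW STATUS may itself bump `Created_tmp_tables`, so `Created_tmp_disk_tables` is likely the more telling of the two counters.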

Answers

1

It's slow because this forces three full table scans on person, whose results then get joined together:

LEFT JOIN (
    ... 
    GROUP BY `_person`.`_oid` -- the group by here 
) `_person_cars` ... 

WHERE ... -- and the where clauses on _person_cars. 

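For one, the WHERE clause on `_person_cars` means the LEFT JOIN is effectively an INNER JOIN, so replace it with one. You can also push the conditions down so that they are applied before joining with person. The join to person is also applied twice unnecessarily.

This makes the subquery faster, but if you also have an ORDER BY/LIMIT clause it will still lead to a full table scan on person (i.e. still not good) because of the GROUP BY: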

JOIN (
SELECT `_person_cars`.`_oid`, 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) AS `cars[0]`, 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) AS `cars[1]` 
    FROM `person_cars` `_person_cars` 
    GROUP BY `_person_cars`.`_oid` 
    HAVING IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=0, `_person_cars`.`val`, NULL)), NULL) = 'BMW' OR 
      IFNULL(GROUP_CONCAT(IF(`_person_cars`.`idx`=1, `_person_cars`.`val`, NULL)), NULL) = 'BMW' 
) `_person_cars` ... -- smaller number of rows 
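
For reference, here is one way the pieces above might fit together as a complete statement. This is a sketch rather than the exact query intended above: the aliases `p` and `c` are made up, the IFNULL(..., NULL) wrappers are dropped because they do not change the value, and the HAVING clause relies on MySQL accepting select aliases there.

```sql
SELECT p.`_oid`,
       p.`firstname`,
       p.`lastname`,
       c.`cars[0]`,
       c.`cars[1]`
FROM (
    -- Aggregate and filter person_cars on its own, before touching person.
    SELECT `_oid`,
           GROUP_CONCAT(IF(`idx` = 0, `val`, NULL)) AS `cars[0]`,
           GROUP_CONCAT(IF(`idx` = 1, `val`, NULL)) AS `cars[1]`
    FROM `person_cars`
    GROUP BY `_oid`
    HAVING `cars[0]` = 'BMW' OR `cars[1]` = 'BMW'
) c
-- Inner join: only persons that survived the HAVING filter are looked up.
JOIN `person` p ON p.`_oid` = c.`_oid`;
```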

If an ORDER BY/LIMIT applies, you will get better results with two queries, i.e.:

SELECT `_person`.`_oid`, 
     `_person`.`firstname`, 
     `_person`.`lastname` 
FROM `person` `_person` 
JOIN `person_cars` `_person_cars` 
ON `_person_cars`.`_oid` = `_person`.`_oid` 
AND `_person_cars`.`val` = 'BMW' 
GROUP BY -- pre-sort the result before grouping, so as to not do the work twice 
     `_person`.`lastname`, 
     `_person`.`firstname`, 
     -- eliminate users with multiple BMWs 
     `_person`.`_oid` 
ORDER BY `_person`.`lastname`, 
     `_person`.`firstname`, 
     `_person`.`_oid` 
LIMIT 10 

Then use the resulting ids in an IN () clause to select the cars.
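
The second query could look roughly like this. It is a sketch only: the ids '1' and '2' are placeholders for whatever the first query returned, and the application is assumed to build the IN () list.

```sql
-- Step 2: fetch the cars for the persons selected in step 1.
-- '1' and '2' are placeholder ids taken from the first result set.
SELECT `_oid`, `idx`, `val`
FROM `person_cars`
WHERE `_oid` IN ('1', '2')
ORDER BY `_oid`, `idx`;
```

Because the list holds at most as many ids as the LIMIT of the first query, this lookup can use the existing `_oid` key and stays small no matter how large the tables grow.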

Oh, and your `val` column should probably be a varchar.
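
A sketch of that change, assuming that 255 characters is enough for the stored values (the question mentions data of arbitrary length, so check the real data first, since longer values would be truncated or rejected):

```sql
-- Swap the BLOB for a varchar and index the whole column instead of a 64-byte prefix.
-- The length 255 is an assumption, not something stated in the question.
ALTER TABLE `person_cars`
    DROP KEY `val`,
    MODIFY `val` varchar(255) NOT NULL,
    ADD KEY `val` (`val`);
```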


Thanks for the extensive answer, I'll study it carefully and try to understand... – xlttj 2011-05-24 11:58:24

0

Check this:

SELECT 
    p._oid  AS oid, 
    p.firstname AS firstname, 
    p.lastname AS lastname, 
    pc.val  AS car1, 
    pc2.val  AS car2 
FROM person AS p 
    LEFT JOIN person_cars AS pc 
    ON pc._oid = p._oid 
     AND pc.idx = 0 
    LEFT JOIN person_cars AS pc2 
    ON pc2._oid = p._oid 
     AND pc2.idx = 1 
WHERE pc.val = 'BMW' 
    OR pc2.val = 'BMW'
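
A possible refinement, offered as an assumption rather than something measured here: `idx` is a varchar, and comparing a string column with a numeric literal generally keeps MySQL from using an index on that column. Quoting the literals and adding a composite index (the name `oid_idx` is made up) may let both joins use index lookups. EXPLAIN shows whether the optimizer actually picks the new key.

```sql
-- Hypothetical composite index covering both join conditions.
ALTER TABLE `person_cars` ADD KEY `oid_idx` (`_oid`, `idx`);

EXPLAIN
SELECT p._oid, p.firstname, p.lastname,
       pc.val  AS car1,
       pc2.val AS car2
FROM person AS p
    LEFT JOIN person_cars AS pc
           ON pc._oid = p._oid
          AND pc.idx = '0'   -- string literal, same type as the varchar column
    LEFT JOIN person_cars AS pc2
           ON pc2._oid = p._oid
          AND pc2.idx = '1'
WHERE pc.val = 'BMW'
   OR pc2.val = 'BMW';
```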