2011-08-12

Select preferred rows out of partially duplicate data

I have the following query:

select 
    mb.id as meter_id 
    ,ds.mydate as mydate 
    ,mb.name as metergroup 
    ,sum(ms.stand) as measured_cum_value 
    ,me.name as energy_medium 
    ,e.name as unit_of_measure 
    ,min(ms.source) as source 
    ,count(*) as debugcount 
FROM datumselect ds       -- mem table with dates to query.
INNER JOIN metergroup mb ON (mb.building_id = 1)
INNER JOIN meter m ON (m.metergroup_id = mb.id) -- meters are grouped
INNER JOIN medium me ON (me.id = mb.medium_id) -- lookup tables for normalization
INNER JOIN unit e ON (e.id = mb.unit_id)   -- ditto
INNER JOIN meterstand ms ON (ms.meter_id = m.id AND ms.mydate = ds.mydate)
group by ds.mydate, mb.id, ms.source -- this is prob. broken.
having source = MIN(ms.source) -- this `having` does not work!
ORDER BY mb.id, ds.mydate 

I am selecting from the following table:

CREATE TABLE meterstand(
    id INT(11) UNSIGNED NOT NULL AUTO_INCREMENT, 
    meter_id INT(11) UNSIGNED NOT NULL, 
    mydate DATETIME NOT NULL, 
    stand DECIMAL(16, 5) NOT NULL, 
    source ENUM('calculated', 'read', 'manual') NOT NULL DEFAULT 'read', 
    PRIMARY KEY (id), 
    INDEX FK_meterstand_meter_id (meter_id), 
    UNIQUE INDEX UK_meterstand (mydate, meter_id, source), 
    CONSTRAINT FK_meterstand_meter_id FOREIGN KEY (meter_id) 
    REFERENCES vaanstermeters.meter (id) ON DELETE RESTRICT ON UPDATE CASCADE 
) 
ENGINE = INNODB 
AUTO_INCREMENT = 181 
AVG_ROW_LENGTH = 105 
CHARACTER SET latin1 
COLLATE latin1_swedish_ci; 

Given the data below, a simplified query would be:

SELECT 
    meter_id 
    , mydate 
    , sum(stand) 
    , count(*) as debugcount 
FROM meterstand 
WHERE mydate IN (list_of_dates_im_interested_in) 
GROUP BY meter_id, mydate 
HAVING the_best(source)  -- pseudocode: keep only the most trusted source

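Given the current data, debugcount should always be 1, but if a group contains multiple meters, the debugcount in the query above should be the number of meters in that group.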

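I can choose between values from different sources. I have:
- the manual source, which is golden;
- the read source, which comes from a data source: a meter somewhere in a building;
- calculated data, interpolated to fill in missing data.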

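A single data point with the same meter_id + mydate can have multiple sources.
The query should prefer the manual source over read, and only select calculated data if no other data is available.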

Here is a sample of the data in meterstand:

id   meter_id  mydate     stand      source
------------------------------------------------------
179  6         1-12-2010   94,75886  calculated
180  7         1-12-2010  256,02618  calculated
164  7         1-1-2011   285,41800  manual      <<-- Query should only consider this row.
183  7         1-1-2011     0,00000  read        <<-- and forget about this one

What is the correct query syntax to select the best data point?

Answers


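From the looks of it, MySQL defines the sort order of enums as the order in which they are listed in the definition. Since you have defined them in the reverse of your order of preference, I believe the following will work as expected (I don't have an instance to test on, though):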

SELECT * 
FROM meterstand as a 
JOIN (SELECT meter_id, mydate, MAX(source) as source 
     FROM meterstand 
     GROUP BY meter_id, mydate) as b 
ON b.meter_id = a.meter_id 
AND b.mydate = a.mydate 
AND b.source = a.source 

(Assuming [meter_id, mydate, source] is unique, of course.)

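It looks as though there is a bug that causes enums to be compared by string value here (which, given these particular strings, doesn't help you at all).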
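One possible way around the string comparison, as an untested sketch: adding 0 to an ENUM column puts it in numeric context, where MySQL uses the member's index (1 for 'calculated', 2 for 'read', 3 for 'manual' in this definition). Aggregating on that index instead of on the column itself should give the intended preference. The alias source_idx is made up for illustration.

```sql
-- Per meter and date, keep the row whose source has the highest ENUM index.
-- With ENUM('calculated', 'read', 'manual') the indexes are 1, 2, 3,
-- so 'manual' wins over 'read', which wins over 'calculated'.
SELECT a.*
FROM meterstand AS a
JOIN (SELECT meter_id, mydate,
             MAX(source + 0) AS source_idx   -- numeric index, not the string
      FROM meterstand
      GROUP BY meter_id, mydate) AS b
  ON  b.meter_id   = a.meter_id
  AND b.mydate     = a.mydate
  AND b.source_idx = a.source + 0;
```

On the sample data this should keep row 164 (manual) for meter 7 on 1-1-2011 and drop row 183 (read).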
If that bug still exists (or you want a bit more control over the ordering used), you may want to define a table:

Meter_Reading_Type
============================
Id  Description  Priority
1   Manual       10
2   Calculated   30
3   Read         20

Then reference it as a foreign key and pick the row with the lowest (MIN) priority.
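A minimal sketch of that lookup-table approach, under stated assumptions: the column reading_type_id on meterstand is hypothetical (it would replace the source ENUM), the table and column names below are illustrative, and the priority values are taken to be distinct so that exactly one row per meter and date matches.

```sql
-- Hypothetical lookup table; a lower priority value means a more trusted source.
CREATE TABLE meter_reading_type (
    id          INT UNSIGNED NOT NULL PRIMARY KEY,
    description VARCHAR(20)  NOT NULL,
    priority    INT          NOT NULL
);

INSERT INTO meter_reading_type (id, description, priority) VALUES
    (1, 'Manual',     10),
    (2, 'Calculated', 30),
    (3, 'Read',       20);

-- Assumes meterstand has a hypothetical FK column reading_type_id
-- referencing meter_reading_type(id) in place of the source ENUM.
SELECT ms.*
FROM meterstand AS ms
JOIN meter_reading_type AS t ON t.id = ms.reading_type_id
JOIN (SELECT s.meter_id, s.mydate, MIN(rt.priority) AS best_priority
      FROM meterstand AS s
      JOIN meter_reading_type AS rt ON rt.id = s.reading_type_id
      GROUP BY s.meter_id, s.mydate) AS b
  ON  b.meter_id      = ms.meter_id
  AND b.mydate        = ms.mydate
  AND b.best_priority = t.priority;
```

Changing the preference order later then only means updating the priority column, with no change to the table definition or the query.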

So simple; I think I was stuck in a mindwarp of complexity. Thanks. – Johan