2013-08-02 1448 views
4
dt          indx_nm1  indx_val1  indx_nm2  indx_val2
2009-06-08  ABQI      1001.2     ACNACTR   300.05
2009-06-09  ABQI      1002.12    ACNACTR   341.19
2009-06-10  ABQI      1011.4     ACNACTR   382.93
2009-06-11  ABQI      1015.43    ACNACTR   362.63

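I have a table that looks like the one above (but with hundreds of rows, dating from 2009 to 2013). Is there a way, using SQL, to calculate the covariance, [(indx_val1 - avg(indx_val1)) * (indx_val2 - avg(indx_val2))] divided by the total number of rows, for each pair of indx_val1 and indx_val2 values (looping through the entire table by dt), and return just a single value for COV(ABQI, ACNACTR)?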

+0

@MichaelBerkowski – euge1220

+0

OK, thanks for the feedback! – euge1220

Answers

4

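Since your aggregates operate over two different groups, you will need two different queries. The main one groups by date to get your per-row values. The other has to perform the AVG() and COUNT() aggregates across the whole row set.

To use both at the same time, you need to JOIN them together. But since there is no actual relationship between the two queries, the join is a Cartesian product, so we use a CROSS JOIN. In effect, it pairs every row of the main query with the single row returned by the aggregate query. You can then do the arithmetic in the SELECT list, using values from both.

So, building on the query from your earlier question: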
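As a minimal sketch of that pattern on its own, here is a CROSS JOIN against a one-row aggregate, using made-up inline rows instead of a real table. The names `t`, `s`, `stats`, `id`, `v`, `v_avg` and `dev` are hypothetical and only serve the illustration.

```sql
-- Hypothetical inline data; none of these names come from the question's table.
SELECT
    t.id,
    t.v,
    stats.v_avg,               -- the same overall mean repeated on every row
    t.v - stats.v_avg AS dev   -- deviation of each row from the overall mean
FROM (VALUES (1, 10.0), (2, 20.0), (3, 30.0)) AS t(id, v)
CROSS JOIN (
    -- One-row derived table: aggregates over the whole set
    SELECT AVG(s.v) AS v_avg, COUNT(*) AS n
    FROM (VALUES (1, 10.0), (2, 20.0), (3, 30.0)) AS s(id, v)
) AS stats
ORDER BY t.id;
```

Because the aggregate side returns exactly one row, the Cartesian product has as many rows as the detail side, and `dev` comes out as -10, 0 and 10 for the three rows.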

SELECT 
indxs.*, 
((indx_val2 - indx_val2_avg) * (indx_val1 - indx_val1_avg))/total_rows AS cv 
FROM (
    SELECT 
     dt, 
     MAX(CASE WHEN indx_nm = 'ABQI' THEN indx_nm ELSE NULL END) AS indx_nm1, 
     MAX(CASE WHEN indx_nm = 'ABQI' THEN indx_val ELSE NULL END) AS indx_val1, 
     MAX(CASE WHEN indx_nm = 'ACNACTR' THEN indx_nm ELSE NULL END) AS indx_nm2, 
     MAX(CASE WHEN indx_nm = 'ACNACTR' THEN indx_val ELSE NULL END) AS indx_val2 
    FROM table1 a 
    GROUP BY dt 
) indxs 
    CROSS JOIN (
    /* Join against a query returning the AVG() and COUNT() across all rows */ 
    SELECT 
     'ABQI' AS indx_nm1_aname, 
     AVG(CASE WHEN indx_nm = 'ABQI' THEN indx_val ELSE NULL END) AS indx_val1_avg, 
     'ACNACTR' AS indx_nm2_aname, 
     AVG(CASE WHEN indx_nm = 'ACNACTR' THEN indx_val ELSE NULL END) AS indx_val2_avg, 
     COUNT(DISTINCT dt) AS total_rows /* one per date; COUNT(*) would count every date twice, once per index */
    FROM table1 b 
    WHERE indx_nm IN ('ABQI','ACNACTR') 
    /* And it is a cartesian product */ 
) aggs 
WHERE 
    indx_nm1 IS NOT NULL 
    AND indx_nm2 IS NOT NULL 
ORDER BY dt 

Here is a demo, built on your earlier one: http://sqlfiddle.com/#!6/2ec65/14
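The query above returns one product term per date, whereas the question asks for a single COV(ABQI, ACNACTR) figure. A possible way to collapse it, sketched here under the assumption that `table1` has the columns `dt`, `indx_nm` and `indx_val` used above and that `indx_val` is a decimal or float type, is to sum the products and divide by the number of paired dates. The names `paired`, `p`, `a` and `cov_abqi_acnactr` are made up for this sketch.

```sql
-- Sketch only: assumes table1(dt, indx_nm, indx_val) as in the query above.
WITH paired AS (
    -- One row per date, with both index values side by side
    SELECT
        dt,
        MAX(CASE WHEN indx_nm = 'ABQI'    THEN indx_val END) AS indx_val1,
        MAX(CASE WHEN indx_nm = 'ACNACTR' THEN indx_val END) AS indx_val2
    FROM table1
    WHERE indx_nm IN ('ABQI', 'ACNACTR')
    GROUP BY dt
    HAVING COUNT(DISTINCT indx_nm) = 2   -- keep only dates that have both indexes
)
SELECT
    -- Population covariance: sum of deviation products over the number of pairs
    SUM((p.indx_val1 - a.indx_val1_avg) * (p.indx_val2 - a.indx_val2_avg))
        / COUNT(*) AS cov_abqi_acnactr
FROM paired p
CROSS JOIN (
    -- Means taken over the same paired rows
    SELECT AVG(indx_val1) AS indx_val1_avg,
           AVG(indx_val2) AS indx_val2_avg
    FROM paired
) a;
```

Dividing by `COUNT(*)` gives the population covariance described in the question; dividing by `COUNT(*) - 1` instead would give the sample covariance.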

+0

Very helpful, you're a lifesaver - thank you so much! – euge1220

0

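Here is a scalar-valued function that computes the covariance of any two value series supplied as XML.

To test: compile the function, then run the alpha test from the comment block inside it.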

CREATE Function [dbo].[Covariance](@XmlTwoValueSeries xml) 
    returns float 
    as 
    Begin 
    /* 

    -- ----------- 
    -- ALPHA TEST 
    -- ----------- 
    IF object_id('tempdb..#_201610101706') is not null DROP TABLE #_201610101706 
    select * 
    into #_201610101706 
    from 
    (
     select * 
     from 
     (
      SELECT '2016-01' Period, 1.24 col0, 2.20 col1 
      union 
      SELECT '2016-02' Period, 1.6 col0, 3.20 col1 
      union 
      SELECT '2016-03' Period, 1.0 col0, 2.77 col1 
      union 
      SELECT '2016-04' Period, 1.9 col0, 2.98 col1 
     ) A 
    ) A 


    DECLARE @XmlTwoValueSeries xml 
    SET @XmlTwoValueSeries = (
    SELECT col0,col1 FROM #_201610101706 
    FOR 
    XML PATH('Output') 
    ) 

    SELECT dbo.Covariance(@XmlTwoValueSeries) Covariance 

    */ 
    declare @returnvalue float -- matches the declared return type; numeric(20,10) would round the result to 10 decimal places

    set @returnvalue = 
    (
     SELECT SUM((x - xAvg) *(y - yAvg))/MAX(n) AS [COVAR(x,y)] 
     from 
     (
      SELECT 1E * x x, 
        AVG(1E * x) OVER (PARTITION BY (SELECT NULL)) xAvg, 
        1E * y y, 
        AVG(1E * y) OVER (PARTITION BY (SELECT NULL)) yAvg, 
        COUNT(*) OVER (PARTITION BY (SELECT NULL)) n 
      FROM  
      (
       SELECT 
        e.c.value('(col0/text())[1]', 'float') x, 
        e.c.value('(col1/text())[1]', 'FLOAT') y 
       FROM @XmlTwoValueSeries.nodes('Output') e(c)    
      ) A 
     ) A 
    ) 
    return @returnvalue 
    end 



    GO
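As a cross-check on the function, the same covariance can be computed directly from the four alpha-test pairs without going through XML. This is only a sketch; the names `s`, `m`, `x_avg` and `y_avg` are illustrative.

```sql
-- The four (col0, col1) pairs from the alpha test, written as float literals.
WITH s(x, y) AS (
    SELECT v.x, v.y
    FROM (VALUES (1.24E0, 2.20E0),
                 (1.6E0,  3.20E0),
                 (1.0E0,  2.77E0),
                 (1.9E0,  2.98E0)) AS v(x, y)
)
SELECT SUM((s.x - m.x_avg) * (s.y - m.y_avg)) / COUNT(*) AS covariance
FROM s
CROSS JOIN (
    -- One row holding both means
    SELECT AVG(x) AS x_avg, AVG(y) AS y_avg
    FROM s
) AS m;
-- By hand: the means are 1.435 and 2.7875, the deviation products sum to
-- 0.27975, so the population covariance is 0.27975 / 4 = 0.0699375
-- (up to floating-point rounding).
```

The function should return a matching value for the alpha-test data, since it also divides by n rather than n - 1.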