2010-11-02

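Recursive query problem: retrieving a cluster of connections

To avoid selecting the same connection more than once, I tried using a CHARINDEX() = 0 condition in the following way: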

WITH Cluster(calling_party, called_party, link_strength, Path) 
AS 
(SELECT 
    calling_party, 
    called_party, 
    link_strength, 
    CONVERT(varchar(max), calling_party + '.' + called_party) AS Path 
FROM 
    monthly_connections_test 
WHERE 
    link_strength > 0.1 AND 
    calling_party = 'b' 
UNION ALL 
SELECT 
    mc.calling_party, 
    mc.called_party, 
    mc.link_strength, 
    CONVERT(varchar(max), cl.Path + '.' + mc.calling_party + '.' + mc.called_party) AS Path 
FROM 
    monthly_connections_test mc 
INNER JOIN Cluster cl ON 
    (
     mc.called_party = cl.called_party OR 
     mc.called_party = cl.calling_party 
    ) AND 
    (
     CHARINDEX(cl.called_party + '.' + mc.calling_party, Path) = 0 AND 
     CHARINDEX(cl.called_party + '.' + mc.called_party, Path) = 0 
    ) 
WHERE 
    mc.link_strength > 0.1 
) 
SELECT 
    calling_party, 
    called_party, 
    link_strength, 
    Path 
FROM 
    Cluster OPTION (maxrecursion 30000) 

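However, the condition does not serve its purpose, because the same rows are still selected multiple times.

The actual goal here is to retrieve the whole cluster of connections that the selected user (user b in the example) belongs to.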

EDIT1:

I tried modifying the query as follows:

With combined_users AS 
(SELECT calling_party CALLING, called_party CALLED, link_strength FROM dbo.monthly_connections_test WHERE link_strength > 0.1), 
related_users1 AS 
(
SELECT c.CALLING, c.CALLED, c.link_strength, CONVERT(varchar(max), '.' + c.CALLING + '.' + c.CALLED + '.') path from combined_users c where CALLING = 'a1' 
UNION ALL 
SELECT c.CALLING, c.CALLED, c.link_strength, 
    convert(varchar(max),r.path + c.CALLED + '.') path 
     from combined_users c 
     join related_users1 r 
     ON (c.CALLING = r.CALLED) and CHARINDEX(c.CALLING + '.' + c.CALLED + '.', r.path)= 0 

     ), 
related_users2 AS 
(
SELECT c.CALLING, c.CALLED, c.link_strength, CONVERT(varchar(max), '.' + c.CALLING + '.' + c.CALLED + '.') path from combined_users c where CALLED = 'a1' 
UNION ALL 
SELECT c.CALLING, c.CALLED, c.link_strength, 
    convert(varchar(max),r.path + c.CALLING + '.') path 
     from combined_users c 
     join related_users2 r 
     ON c.CALLED = r.CALLING and CHARINDEX('.' + c.CALLING + '.' + c.CALLED, r.path)= 0 
) 
     SELECT CALLING, CALLED, link_strength, path FROM 
     (SELECT CALLING, CALLED, link_strength, path FROM related_users1 UNION SELECT CALLING, CALLED, link_strength, path FROM related_users2) r OPTION (MAXRECURSION 30000) 

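To test the query, I created the following cluster:

(diagram of the test cluster; image not preserved)

The query above returned the following table: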

a1 a2 1.0000000 .a1.a2. 
a11 a13 1.0000000 .a12.a1.a13.a11. 
a12 a1 1.0000000 .a12.a1. 
a13 a12 1.0000000 .a12.a1.a13. 
a14 a13 1.0000000 .a12.a1.a13.a14. 
a15 a14 1.0000000 .a12.a1.a13.a14.a15. 
a2 a10 1.0000000 .a1.a2.a10. 
a2 a3 1.0000000 .a1.a2.a3. 
a3 a4 1.0000000 .a1.a2.a3.a4. 
a3 a6 1.0000000 .a1.a2.a3.a6. 
a4 a8 1.0000000 .a1.a2.a3.a4.a8. 
a4 a9 1.0000000 .a1.a2.a3.a4.a9. 

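The query evidently returns the connections that lead toward the selected node and the connections that lead away from it. The problem is a change of direction: for example, the connections a7, a4 and a11, a10 are not selected, because the direction changes relative to the starting node.

Does anyone know how to modify the query so that it includes all connections?

Thanks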

Can you give some sample data and say what you expect to see? – 2010-11-02 15:26:16

Answers

OK, there are a few things to discuss here.

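Zerothly, I have PostgreSQL, so all of this was done with that; I have tried to use only standard SQL, so it should carry over to SQL Server as well. (Expect a few spelling changes there: SQL Server writes WITH without the RECURSIVE keyword, only allows UNION ALL between the anchor and the recursive part of a recursive CTE, and has CHARINDEX and LEN rather than position and char_length.)

First, if you are only interested in calls with a link strength greater than 0.1, let's say so: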

-- like calls, but only strong enough to be interesting 
create view strong_calls (calling_party, called_party, link_strength) 
as (
    select calling_party, called_party, link_strength 
    from monthly_connections_test 
    where link_strength > 0.1 
); 

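From now on, we will talk in terms of this view.

Second, you say:

"The actual goal here is to retrieve the whole cluster of connections that the selected user (user b in the example) belongs to."

If that is true, why are you computing the path at all? If you only want to know the set of connections, you can do this: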

with recursive cluster (calling_party, called_party, link_strength) 
as (
    (
    select calling_party, called_party, link_strength 
    from strong_calls 
    where calling_party = 'b' 
) 
    union 
    (
    select c.calling_party, c.called_party, c.link_strength 
    from cluster this, strong_calls c 
    where c.calling_party = this.called_party 
    or c.called_party = this.calling_party 
) 
) 
select * 
from cluster; 
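For readers who want to experiment without a database server, here is a minimal sketch of the same idea run through Python's built-in sqlite3 module. It rests on some assumptions that are not in the query above: the helper name cluster_edges is hypothetical, the table is rebuilt in memory from a list of tuples, the start user may appear as either caller or callee, and the join matches on a shared party in either role, so that a change of direction does not end the walk. UNION (rather than UNION ALL) discards rows that were already found, which is what lets the recursion stop.

```python
import sqlite3

def cluster_edges(edges, start, min_strength=0.1):
    """Return the set of (calling, called, strength) rows in start's cluster.

    Hypothetical helper: links are followed regardless of direction.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table monthly_connections_test "
        "(calling_party text, called_party text, link_strength real)"
    )
    con.executemany(
        "insert into monthly_connections_test values (?, ?, ?)", edges
    )
    rows = con.execute(
        """
        with recursive cluster (calling_party, called_party, link_strength) as (
            -- anchor: every strong link that touches the start user
            select calling_party, called_party, link_strength
            from monthly_connections_test
            where link_strength > :s
              and (calling_party = :u or called_party = :u)
            union
            -- step: every strong link sharing a party with a link found so far
            select c.calling_party, c.called_party, c.link_strength
            from cluster this
            join monthly_connections_test c
              on c.calling_party in (this.calling_party, this.called_party)
              or c.called_party in (this.calling_party, this.called_party)
            where c.link_strength > :s
        )
        select calling_party, called_party, link_strength from cluster
        """,
        {"s": min_strength, "u": start},
    ).fetchall()
    con.close()
    return set(rows)

if __name__ == "__main__":
    sample = [
        ("b", "c", 1.0),   # b calls c
        ("d", "c", 1.0),   # direction changes here: d also calls c
        ("d", "e", 1.0),
        ("x", "y", 1.0),   # a separate cluster
        ("b", "f", 0.05),  # too weak to count
    ]
    for row in sorted(cluster_edges(sample, "b")):
        print(row)
```

The sample data deliberately contains a direction change (both b and d call c), which is the case the question's edit struggles with.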

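Third, perhaps you do not really want to find the connections in the cluster, but rather which people are in the cluster and what the shortest path from the target to each of them is. In that case, you can do this: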

with recursive cluster (party, path) 
as (
    select cast('b' as character varying), cast('b' as character varying) 
    union 
    (
    select (case 
     when this.party = c.calling_party then c.called_party 
     when this.party = c.called_party then c.calling_party 
    end), (this.path || '.' || (case 
     when this.party = c.calling_party then c.called_party 
     when this.party = c.called_party then c.calling_party 
    end)) 
    from cluster this, strong_calls c 
    where (this.party = c.calling_party and position(c.called_party in this.path) = 0) 
    or (this.party = c.called_party and position(c.calling_party in this.path) = 0) 
) 
) 
select party, path 
from cluster 
where not exists (
    select * 
    from cluster c2 
    where cluster.party = c2.party 
    and (
    char_length(cluster.path) > char_length(c2.path) 
    or (char_length(cluster.path) = char_length(c2.path)) and (cluster.path > c2.path) 
) 
) 
order by party, path; 

As you can see, you were very much on the right track.
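The "who is in the cluster, and by what shortest path" question can also be sketched outside SQL as a breadth-first search. This is an illustration under stated assumptions, not a translation of the query: the function name shortest_paths is hypothetical, links are treated as undirected, "shortest" means fewest hops rather than fewest characters, and ties are broken by visiting neighbours in sorted order, which need not match the query's tie-break.

```python
from collections import deque

def shortest_paths(edges, start):
    """Map each party in start's cluster to one shortest path from start.

    edges is an iterable of (calling_party, called_party) pairs;
    each path is returned as a list of party names.
    """
    # Build an undirected adjacency map.
    neighbours = {}
    for calling, called in edges:
        neighbours.setdefault(calling, set()).add(called)
        neighbours.setdefault(called, set()).add(calling)

    paths = {start: [start]}
    queue = deque([start])
    while queue:
        party = queue.popleft()
        for other in sorted(neighbours.get(party, ())):
            if other not in paths:  # first visit is via a shortest route
                paths[other] = paths[party] + [other]
                queue.append(other)
    return paths

if __name__ == "__main__":
    calls = [("b", "c"), ("d", "c"), ("d", "e"), ("b", "e"), ("x", "y")]
    for party, path in sorted(shortest_paths(calls, "b").items()):
        print(party, ".".join(path))
```

Because each party is recorded the first time it is reached, the search never enumerates alternative paths, which is the main practical difference from building every path and filtering afterwards.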

If you really do need a list of all calls in the cluster together with their paths, then, er, I'll get back to you!

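EDIT: Keep in mind that the queries that do not build paths will have very different performance characteristics from the ones that do. Roughly speaking, the non-path queries will do O(n) work (probably in O(log n) iteration steps), because they visit each node in the cluster once, but the path-building ones will do more work, of some higher order, maybe, because they have to visit every path through the graph. If the clusters are about as big as the one in the example, you will be fine, but if they are much bigger, you may find that the running time becomes too long.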
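To get a feel for how quickly "every path through the graph" can grow, here is a small sketch that counts simple paths (paths that never repeat a node) from one node. The helper names count_simple_paths and complete_graph are hypothetical, and a fully connected graph is a worst-case assumption chosen for illustration; real call clusters are usually far sparser.

```python
def count_simple_paths(neighbours, start):
    """Count the simple paths that begin at start, including the trivial one."""
    def walk(node, seen):
        total = 1  # the path that ends at this node
        for other in neighbours[node]:
            if other not in seen:
                total += walk(other, seen | {other})
        return total
    return walk(start, {start})

def complete_graph(n):
    """Adjacency map of a graph in which every node is linked to every other."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}

if __name__ == "__main__":
    for n in range(2, 9):
        print(n, "nodes:", count_simple_paths(complete_graph(n), 0), "paths")
```

A query that only collects nodes or links touches each of them once, whereas a query that materialises paths has to produce one row per path, so the row count can dwarf the size of the cluster when the cluster is densely connected.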

CHARINDEX('b.d', 'b.c.d.b') = 0, because there is a 'c' in between. In a more readable form:

WITH cluster (calling_party, called_party, link_strength, PATH)
AS (
    SELECT calling_party,
           called_party,
           link_strength,
           CONVERT(VARCHAR(MAX), calling_party + '.' + called_party) AS PATH
    FROM monthly_connections_test
    WHERE link_strength > 0.1
      AND calling_party = 'b'
    UNION ALL
    SELECT mc.calling_party,
           mc.called_party,
           mc.link_strength,
           CONVERT(VARCHAR(MAX), cl.PATH + '.' + mc.calling_party + '.' + mc.called_party) AS PATH
    FROM monthly_connections_test mc
    INNER JOIN cluster cl
        ON (mc.called_party = cl.called_party
            OR mc.called_party = cl.calling_party)
       AND CHARINDEX(cl.called_party + '.' + mc.calling_party, PATH) = 0
       AND CHARINDEX(cl.called_party + '.' + mc.called_party, PATH) = 0
    WHERE mc.link_strength > 0.1
)
SELECT calling_party,
       called_party,
       link_strength,
       PATH
FROM cluster
OPTION (MAXRECURSION 30000)
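The CHARINDEX observation is easy to check outside the database. The sketch below mimics CHARINDEX with Python's str.find (1-based position, 0 when absent) and contrasts it with a check that compares whole path entries. Both helper names are hypothetical, and the second one assumes the path is a plain '.'-separated list of party names.

```python
def charindex(needle, haystack):
    """Mimic T-SQL CHARINDEX: 1-based position of needle, or 0 if absent."""
    return haystack.find(needle) + 1

def edge_in_path(a, b, path):
    """True if a is immediately followed by b in a '.'-separated path."""
    parts = path.split(".")
    return any(x == a and y == b for x, y in zip(parts, parts[1:]))

if __name__ == "__main__":
    # b and d are both on the path, but not adjacent, so the substring is absent.
    print(charindex("b.d", "b.c.d.b"))
    # A raw substring test can also match too eagerly: 'a1.a2' is a prefix of 'a1.a21'.
    print(charindex("a1.a2", "a1.a21"))
    print(edge_in_path("a1", "a2", "a1.a21"))
```

The second case is why the path in the question's edit is wrapped in leading and trailing dots: with delimiters on both sides, a name such as a2 can no longer match inside a21.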
This is the same as the query above. – 2010-11-02 15:11:15

It is now, yes, but when I first read it, it was all on 4 lines, so I reformatted it to help everyone – smirkingman 2010-11-02 16:41:52

To address the edit to your question: if you want to ignore the direction of the links, try:

create view symmetric_users (calling_party, called_party, link_strength) 
as (
    select calling_party, called_party, link_strength from monthly_connections_test 
    union 
    select called_party , calling_party, link_strength from monthly_connections_test 
) 

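Then run your queries against this view.

If you have users who call each other, there will be two rows for each ordered pair of users. You should be able to filter these out by keeping only the stronger of the two.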
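As a rough illustration of "mirror every link, then keep the stronger one", here is a sketch in plain Python. The function name symmetric_links is hypothetical, and the input is assumed to be (calling_party, called_party, link_strength) tuples; in SQL the same effect would typically come from grouping the view by the pair of parties and taking MAX(link_strength).

```python
def symmetric_links(rows):
    """Return {(a, b): strength} containing every link in both directions.

    When two users call each other, only the stronger strength is kept
    for each ordered pair.
    """
    best = {}
    for calling, called, strength in rows:
        for pair in ((calling, called), (called, calling)):
            if strength > best.get(pair, float("-inf")):
                best[pair] = strength
    return best

if __name__ == "__main__":
    rows = [("a", "b", 0.3), ("b", "a", 0.8), ("a", "c", 0.5)]
    for pair, strength in sorted(symmetric_links(rows).items()):
        print(pair, strength)
```

With one row per ordered pair, a recursive query over the symmetric data no longer has to special-case the direction of a link.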