2010-11-15 123 views

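How do I calculate the common neighbors of two users and compute their similarity?

The monthly_connections table contains the columns calling_party, called_party, common_neighbors, and neighborhood_overlap.

The table describes connections between users. One measure of user similarity is the neighborhood overlap, defined as follows: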

neighborhood_overlap = (number of nodes that are neighbors of both calling_party and called_party) / (number of nodes that are neighbors of at least one of calling_party or called_party)
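To make the definition concrete, here is a minimal sketch in Python that treats each user's neighbors as a set. The function name and the sample neighbor sets are hypothetical, and returning 0.0 when both sets are empty is an assumption, since the definition above does not cover that case.

```python
def neighborhood_overlap(neighbors_a, neighbors_b):
    """Return |A intersect B| / |A union B| for two neighbor collections."""
    a, b = set(neighbors_a), set(neighbors_b)
    union = a | b
    if not union:
        # Assumption: two users without any neighbors get an overlap of 0.0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical neighbor sets for two users.
neighbors_of_u = {"a", "b", "c"}
neighbors_of_v = {"b", "c", "d", "e"}

# 2 shared neighbors (b, c) out of 5 distinct neighbors (a, b, c, d, e).
overlap = neighborhood_overlap(neighbors_of_u, neighbors_of_v)
print(overlap)
```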

To compute the number of common neighbors of two users, I wrote the following query:

SELECT COUNT(*)
FROM (
    SELECT t1.neighborA
    FROM (
        SELECT called_party AS neighborA FROM monthly_connections
        WHERE calling_party = '9F7334BCF9000CD68D40302DC4801E60C027A7D1'
        UNION
        SELECT calling_party AS neighborA FROM monthly_connections
        WHERE called_party = '9F7334BCF9000CD68D40302DC4801E60C027A7D1'
    ) t1
    INNER JOIN (
        SELECT called_party AS neighborB FROM monthly_connections
        WHERE calling_party = '10D149A4356E1AA3A8AF604BD992BBA141DB53D2'
        UNION
        SELECT calling_party AS neighborB FROM monthly_connections
        WHERE called_party = '10D149A4356E1AA3A8AF604BD992BBA141DB53D2'
    ) t2 ON t1.neighborA = t2.neighborB
) t3

The query above counts the common neighbors of users 10D149A4356E1AA3A8AF604BD992BBA141DB53D2 and 9F7334BCF9000CD68D40302DC4801E60C027A7D1.
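For trying the same join on a small data set, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made for the sake of a runnable example, not the database used in the question, and the single-letter user IDs, the sample rows, and the helper name count_common_neighbors are all made up for illustration.

```python
import sqlite3

# Hypothetical sample data; the real table holds hashed user IDs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE monthly_connections (calling_party TEXT, called_party TEXT)"
)
conn.executemany(
    "INSERT INTO monthly_connections VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("D", "A"), ("B", "C"), ("B", "D"), ("E", "B")],
)

# Same shape as the query above, with the two user IDs as named parameters.
COMMON_NEIGHBORS_SQL = """
SELECT COUNT(*)
FROM (
    SELECT called_party AS neighbor FROM monthly_connections
    WHERE calling_party = :a
    UNION
    SELECT calling_party AS neighbor FROM monthly_connections
    WHERE called_party = :a
) t1
INNER JOIN (
    SELECT called_party AS neighbor FROM monthly_connections
    WHERE calling_party = :b
    UNION
    SELECT calling_party AS neighbor FROM monthly_connections
    WHERE called_party = :b
) t2 ON t1.neighbor = t2.neighbor
"""


def count_common_neighbors(conn, user_a, user_b):
    """Count users connected (in either direction) to both user_a and user_b."""
    row = conn.execute(COMMON_NEIGHBORS_SQL, {"a": user_a, "b": user_b}).fetchone()
    return row[0]


# A's neighbors are B, C, D and B's neighbors are A, C, D, E; C and D are shared.
print(count_common_neighbors(conn, "A", "B"))
```

UNION removes duplicates, so a pair that called each other in both directions still counts each neighbor once.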

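The goal is to write a query that, for every pair of connected users in the table, sets the common_neighbors column to the number of common neighbors and the neighborhood_overlap column to the neighborhood overlap.

Does anyone know how to write a query that updates the columns common_neighbors and neighborhood_overlap?

For the common neighbors I started writing the following query, but it is not correct: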

UPDATE mc SET
    common_neighbors = (
        SELECT COUNT(*) FROM (
            SELECT t1.neighborA
            FROM (
                SELECT called_party AS neighborA FROM monthly_connections
                WHERE calling_party = mc.calling_party
                UNION
                SELECT calling_party AS neighborA FROM monthly_connections
                WHERE called_party = mc.calling_party
            ) t1
            INNER JOIN (
                SELECT called_party AS neighborB FROM monthly_connections
                WHERE calling_party = mc.called_party
                UNION
                SELECT calling_party AS neighborB FROM monthly_connections
                WHERE called_party = mc.called_party
            ) t2 ON t1.neighborA = t2.neighborB
        ) t3
    )
FROM monthly_connections mc
INNER JOIN t3 ON t3.calling_party = mc.calling_party AND t3.called_party = mc.called_party

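Once you learn to accept the 'best' answer rather than only a 'perfect' one, you will find that people are more willing to spend time on your questions – smirkingman 2010-11-17 11:05:37

Answer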


I think this query works (although it may not perform well).

UPDATE mc
SET common_neighbors = (
    SELECT COUNT(*) FROM (
        (
            SELECT called_party FROM monthly_connections
            WHERE calling_party = mc.calling_party
            UNION
            SELECT calling_party FROM monthly_connections
            WHERE called_party = mc.calling_party
        )
        INTERSECT
        (
            SELECT calling_party FROM monthly_connections
            WHERE called_party = mc.called_party
            UNION
            SELECT called_party FROM monthly_connections
            WHERE calling_party = mc.called_party
        )
    ) t1
)
FROM monthly_connections mc
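
The query above only fills common_neighbors. One possible way to fill neighborhood_overlap as well is sketched below with Python's built-in sqlite3 module. This is an illustration under assumptions, not a tested solution for the asker's database: SQLite stands in for the real DBMS, the sample rows are invented, and the helper table edges and the function update_similarity are hypothetical names. It relies on the identity |A ∪ B| = |A| + |B| - |A ∩ B|, so the union size comes from the two neighbor counts and the already stored common_neighbors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE monthly_connections (
           calling_party TEXT, called_party TEXT,
           common_neighbors INTEGER, neighborhood_overlap REAL)"""
)
# Hypothetical sample rows; the real table holds hashed user IDs.
conn.executemany(
    "INSERT INTO monthly_connections (calling_party, called_party) VALUES (?, ?)",
    [("A", "B"), ("A", "C"), ("D", "A"), ("B", "C"), ("B", "D"), ("E", "B")],
)


def update_similarity(conn):
    """Fill common_neighbors and neighborhood_overlap for every row."""
    conn.executescript(
        """
        -- Hypothetical helper table: one row per (user, neighbor), both directions.
        -- UNION removes duplicate pairs.
        CREATE TEMP TABLE edges AS
            SELECT calling_party AS node, called_party AS neighbor
            FROM monthly_connections
            UNION
            SELECT called_party, calling_party FROM monthly_connections;

        -- Users that are neighbors of both parties of the row.
        UPDATE monthly_connections
        SET common_neighbors = (
            SELECT COUNT(*)
            FROM edges e1
            JOIN edges e2 ON e1.neighbor = e2.neighbor
            WHERE e1.node = monthly_connections.calling_party
              AND e2.node = monthly_connections.called_party
        );

        -- Separate statement so that it reads the value stored above.
        UPDATE monthly_connections
        SET neighborhood_overlap = 1.0 * common_neighbors / (
            (SELECT COUNT(*) FROM edges WHERE node = monthly_connections.calling_party)
          + (SELECT COUNT(*) FROM edges WHERE node = monthly_connections.called_party)
          - common_neighbors
        );

        DROP TABLE edges;
        """
    )


update_similarity(conn)
results = {
    (a, b): (common, overlap)
    for a, b, common, overlap in conn.execute(
        "SELECT calling_party, called_party, common_neighbors, neighborhood_overlap "
        "FROM monthly_connections"
    )
}
print(results[("A", "B")])
```

Because every row links two users, each party has at least one neighbor, so the denominator is never zero for this table layout. On a large table an index on edges(node) would probably be needed for acceptable performance.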