2012-02-22

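Transfer data between two databases with a synchronous check for non-repeated values

I want to transfer some data between two tables in different databases, but the source table has some repeated values in the PostalCode column, and I created the target table with a unique key (UK) on the PostalCode column. The script needs a synchronous check so that it does not insert a value that was already inserted. This is my sample script: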

INSERT INTO [Target] 
(
    [FirstName], 
    [LastName], 
    [PostalCode] 
) 
SELECT 
    [Sc].[FirstName], 
    [Sc].[LastName], 
    CASE 
        -- Placeholder condition: should be true when this PostalCode has not been inserted before
        WHEN 'Check for not repeated before' THEN [Sc].[PostalCode] 
        ELSE CAST(1000000000 + ROW_NUMBER() OVER (ORDER BY [Sc].[FirstName]) AS CHAR(10)) 
    END 
FROM [Source] AS [Sc];

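So, what is your suggestion for handling this?

Edit

Is there any way to write a script with a FOR loop or a cursor? I mean an asynchronous check for repeated values?

Answers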

1

I strongly advise against mixing two pieces of information into a single field.

Instead, just add an extra column, perhaps called DuplicationID.

INSERT INTO [Target] 
(
    [FirstName], 
    [LastName], 
    [PostalCode], 
    [DuplicationID] 
) 
SELECT 
    [Sc].[FirstName], 
    [Sc].[LastName], 
    [Sc].[PostalCode], 
    ROW_NUMBER() OVER (PARTITION BY [Sc].[PostalCode] ORDER BY [Sc].[PostalCode]) 
FROM 
    [Source] AS [Sc] 

Any record with a DuplicationID of 1 counts as the first instance of that postal code. Any other value marks a duplicate.
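To make this concrete, here is a minimal sketch of how the target could be constrained and queried once DuplicationID exists. It assumes that [Target] already has the [DuplicationID] column and that the unique key on [PostalCode] alone has been dropped; the constraint name is hypothetical.

```sql
-- Assumption: [Target] has a [DuplicationID] column and no longer has a
-- unique key on [PostalCode] alone.
ALTER TABLE [Target]
    ADD CONSTRAINT [UQ_Target_PostalCode_DuplicationID]  -- hypothetical name
    UNIQUE ([PostalCode], [DuplicationID]);

-- First instance of every postal code.
SELECT [FirstName], [LastName], [PostalCode]
FROM [Target]
WHERE [DuplicationID] = 1;

-- Rows that repeat a postal code already seen.
SELECT [FirstName], [LastName], [PostalCode], [DuplicationID]
FROM [Target]
WHERE [DuplicationID] > 1;
```

Uniqueness is then enforced on the pair of columns, so the postal code itself is stored unchanged.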

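I tried your answer and got an error: "The function 'ROW_NUMBER' must have an OVER clause with ORDER BY." – Saeid 2012-02-22 12:34:16

@Saeid - You can add any ORDER BY you like. I've added `PostalCode`, which essentially does nothing (if you partition by PostalCode then, by definition, every record in the group has the same value). You may decide to prioritise by LastName or any other field. It simply controls which record gets DuplicationID 1, which gets DuplicationID 2, and so on. Play with it and see what works best for you. – MatBailie 2012-02-22 13:00:38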
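As an illustration of the point about ORDER BY, the sketch below prioritises by LastName and then FirstName. That ordering is an arbitrary assumption chosen for the example; any other column of [Source] could be used instead.

```sql
SELECT
    [Sc].[FirstName],
    [Sc].[LastName],
    [Sc].[PostalCode],
    -- Within each postal code, the row that sorts first by LastName
    -- (then FirstName) gets 1; the remaining rows get 2, 3, ...
    ROW_NUMBER() OVER
    (
        PARTITION BY [Sc].[PostalCode]
        ORDER BY [Sc].[LastName], [Sc].[FirstName]
    ) AS [DuplicationID]
FROM
    [Source] AS [Sc];
```

If two rows tie on every ORDER BY column, which of them receives the lower number is not guaranteed.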

0

Maybe something like this:

;WITH CTE AS 
(
    SELECT 
        -- Number of source rows that share this postal code
        COUNT(*) OVER (PARTITION BY [Sc].[PostalCode]) AS NbrOf, 
        -- Position of the row within its postal code group, starting at 1
        ROW_NUMBER() OVER 
        (
            PARTITION BY [Sc].[PostalCode] 
            ORDER BY [Sc].[PostalCode] 
        ) AS RowNbr, 
        [Sc].[FirstName], 
        [Sc].[LastName], 
        [Sc].[PostalCode] 
    FROM 
        [Source] AS [Sc] 
) 
INSERT INTO [Target] 
(
    [FirstName], 
    [LastName], 
    [PostalCode] 
) 
SELECT 
    CTE.[FirstName], 
    CTE.[LastName], 
    CASE 
        -- Repeated postal codes are replaced by a generated value
        WHEN CTE.NbrOf > 1 
        THEN CAST(1000000000 + CTE.RowNbr AS VARCHAR(10)) 
        ELSE CTE.[PostalCode] 
    END 
FROM 
    CTE;
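One caveat with the query above: RowNbr restarts at 1 inside every PostalCode group, so two different repeated postal codes would both generate 1000000001, which a unique key on PostalCode would reject. A minimal sketch of one possible variation follows. It keeps the first row of each group unchanged and gives every further row a value built from a single counter over the whole source. It assumes the same [Source] and [Target] tables, an arbitrary LastName, FirstName ordering, and that values of the form 1000000000 + n never collide with real postal codes.

```sql
;WITH Ranked AS
(
    SELECT
        [Sc].[FirstName],
        [Sc].[LastName],
        [Sc].[PostalCode],
        -- 1 for the first row of each postal code, 2+ for its repeats
        -- (the ORDER BY columns are an assumed tie-breaker)
        ROW_NUMBER() OVER
        (
            PARTITION BY [Sc].[PostalCode]
            ORDER BY [Sc].[LastName], [Sc].[FirstName]
        ) AS RowNbr,
        -- One counter across the whole source, unique for every row
        ROW_NUMBER() OVER
        (
            ORDER BY [Sc].[PostalCode], [Sc].[LastName], [Sc].[FirstName]
        ) AS GlobalNbr
    FROM
        [Source] AS [Sc]
)
INSERT INTO [Target]
(
    [FirstName],
    [LastName],
    [PostalCode]
)
SELECT
    R.[FirstName],
    R.[LastName],
    CASE
        WHEN R.RowNbr = 1 THEN R.[PostalCode]                   -- first instance keeps its code
        ELSE CAST(1000000000 + R.GlobalNbr AS VARCHAR(10))      -- repeats get a unique surrogate
    END
FROM
    Ranked AS R;
```

Because GlobalNbr is distinct for every source row, the generated values cannot repeat among themselves within one run. Rows already present in [Target] before the run are not checked by this sketch.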