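Determining differences between two MySQL database schemas
I want to find any tables or fields that exist in database AA but are missing from database BB. I'm using INFORMATION_SCHEMA.columns to get the information, so I wrote a "missing records" query to find them. For testing I used two databases where I know BB is missing one table and, in another table, one field.
This was my first attempt: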
SELECT AA.table_name,
AA.column_name,
BB.table_name,
BB.column_name
FROM information_schema.columns AS AA
LEFT JOIN information_schema.columns AS BB
ON (AA.table_name = BB.table_name)
AND (AA.column_name = BB.column_name)
WHERE AA.table_schema = 'wireless-2015-05'
AND BB.table_schema = 'wireless-2015-04'
AND BB.column_name IS NULL
This returned 0 records. So then I tried:
SELECT AA.table_name,
AA.column_name
FROM information_schema.columns AS AA
WHERE AA.table_schema = 'wireless-2015-04'
AND NOT EXISTS(SELECT BB.table_name,
BB.column_name
FROM information_schema.columns AS BB
WHERE BB.table_schema = 'wireless-2015-05')
Again I got 0 records. Finally I tried this:
SELECT table_name,
column_name
FROM (SELECT DISTINCT table_name,
column_name
FROM information_schema.columns
WHERE table_schema = 'wireless-2015-04'
UNION ALL
SELECT DISTINCT table_name,
column_name
FROM information_schema.columns
WHERE table_schema = 'wireless-2015-05') AS tbl
GROUP BY table_name,
column_name
HAVING Count(*) = 1
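This produced the expected result.
Although I don't mind using the third query, I can't figure out why the first two don't work. I'd like to know for future reference. Can anyone spot the problem?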
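A likely explanation for both empty results, sketched below with Python's sqlite3 standing in for MySQL. The `schema_columns` table and the schema names 'new' and 'old' are made up for illustration and only mimic the shape of information_schema.columns. In the first query, the `BB.table_schema = ...` test sits in the WHERE clause, so the NULL-filled rows that the LEFT JOIN produces for unmatched columns are filtered out before `IS NULL` can keep them. In the second query, the NOT EXISTS subquery never refers to AA, so it is true or false for the whole schema rather than per column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for information_schema.columns
conn.execute(
    "CREATE TABLE schema_columns (table_schema TEXT, table_name TEXT, column_name TEXT)"
)
conn.executemany(
    "INSERT INTO schema_columns VALUES (?, ?, ?)",
    [
        ("new", "users", "id"),
        ("new", "users", "email"),  # column missing from 'old'
        ("new", "logs", "id"),      # whole table missing from 'old'
        ("old", "users", "id"),
    ],
)

# Shape of the first attempt: the BB schema filter is in WHERE,
# which discards exactly the unmatched (NULL) rows we are looking for.
where_filter = conn.execute("""
    SELECT AA.table_name, AA.column_name
    FROM schema_columns AS AA
    LEFT JOIN schema_columns AS BB
      ON AA.table_name = BB.table_name
     AND AA.column_name = BB.column_name
    WHERE AA.table_schema = 'new'
      AND BB.table_schema = 'old'
      AND BB.column_name IS NULL
""").fetchall()

# Same join with the BB schema filter moved into ON.
on_filter = conn.execute("""
    SELECT AA.table_name, AA.column_name
    FROM schema_columns AS AA
    LEFT JOIN schema_columns AS BB
      ON AA.table_name = BB.table_name
     AND AA.column_name = BB.column_name
     AND BB.table_schema = 'old'
    WHERE AA.table_schema = 'new'
      AND BB.column_name IS NULL
    ORDER BY AA.table_name, AA.column_name
""").fetchall()

# Shape of the second attempt: the subquery is not correlated with AA,
# so it only asks "does the other schema have any columns at all?"
uncorrelated = conn.execute("""
    SELECT AA.table_name, AA.column_name
    FROM schema_columns AS AA
    WHERE AA.table_schema = 'new'
      AND NOT EXISTS (SELECT 1 FROM schema_columns AS BB
                      WHERE BB.table_schema = 'old')
""").fetchall()

# Correlated version: the subquery is evaluated per table/column pair.
correlated = conn.execute("""
    SELECT AA.table_name, AA.column_name
    FROM schema_columns AS AA
    WHERE AA.table_schema = 'new'
      AND NOT EXISTS (SELECT 1 FROM schema_columns AS BB
                      WHERE BB.table_schema = 'old'
                        AND BB.table_name = AA.table_name
                        AND BB.column_name = AA.column_name)
    ORDER BY AA.table_name, AA.column_name
""").fetchall()

print(where_filter, uncorrelated)  # both empty
print(on_filter, correlated)       # both list the missing columns
```

The same two changes should carry over to the MySQL queries: put the BB schema condition in the ON clause (or in a derived table), and tie the NOT EXISTS subquery to AA by table and column name.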
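Update:
For those interested, here are 4 queries that work, along with the time each took to run. They are listed fastest first, with the time shown below each query.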
SELECT AA.table_name,
AA.column_name
FROM information_schema.columns AS AA
LEFT JOIN (SELECT table_name,
column_name
FROM information_schema.columns
WHERE table_schema = 'wireless-2015-04') BB
ON AA.table_name = BB.table_name
AND AA.column_name = BB.column_name
WHERE AA.table_schema = 'wireless-2015-05'
AND BB.table_name IS NULL;
0.047 seconds
SELECT table_name,
column_name
FROM (SELECT DISTINCT table_name,
column_name
FROM information_schema.columns
WHERE table_schema = 'wireless-2015-04'
UNION ALL
SELECT DISTINCT table_name,
column_name
FROM information_schema.columns
WHERE table_schema = 'wireless-2015-05') AS tbl
GROUP BY table_name,
column_name
HAVING Count(*) = 1;
0.078 seconds
SELECT DISTINCT table_name,
column_name,
Concat(table_name, '--', column_name) AS tc
FROM information_schema.columns
WHERE table_schema = 'wireless-2015-05'
HAVING tc NOT IN(SELECT DISTINCT Concat(table_name, '--', column_name)
FROM information_schema.columns
WHERE table_schema = 'wireless-2015-04');
0.125 seconds (a new solution I thought of this morning)
SELECT aa.table_name,
aa.column_name
FROM information_schema.columns aa
WHERE table_schema = 'wireless-2015-05'
AND NOT EXISTS (SELECT 1
FROM information_schema.columns
WHERE table_schema = 'wireless-2015-04'
AND table_name = aa.table_name
AND column_name = aa.column_name);
44.382 seconds. Clearly not a good real-world solution.
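information_schema is relatively expensive to query, because these tables aren't real tables, and queries against them often examine more internal structures than the query actually needs. That helps explain why the first query is faster: "LEFT JOIN (SELECT ...) BB" actually creates a temporary table "BB" *first*, so the second table in the query is fully populated before the outer query runs. Contrast that with the very slow variant shown last, which probably issues a request to information_schema for every column.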