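Simple performance test for Azure SQL Data Warehouse

We are working on porting existing applications to Azure SQL Data Warehouse. To better understand its performance and workload management characteristics, I set up what I believe is a very simple test.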
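I loaded a static table with about 20k rows (very small for a parallel data warehouse): our business calendar. I then generated all possible queries against that single table using a pattern like:

SELECT current_timestamp, COUNT(1) FROM
(SELECT C1, ..., Cn, COUNT(1) AS _A_ROW_COUNT
FROM schema.view_to_table GROUP BY C1, ..., Cn) DER

Givens: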
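- DWU set to 1000.
- 35 concurrent threads.
- All threads run in small_rc (i.e., each query uses 1 slot).
- Using sqlcmd with an initial connection, then executing after each SELECT.
- Running on a non-Azure VM connected via ExpressRoute. The outer SELECT COUNT() structure was chosen to keep network traffic minimal.
- Using a heap table gave better results than the default columnstore (as expected). (Still need to test with a clustered index.)
- The table is distributed on the primary key column.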
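The query pattern above can be sketched as a small generator. This is only an illustration: it assumes "all possible queries" means one query per non-empty subset of the table's columns, and the function name `generate_queries` is hypothetical. The column and view names are taken from the sample query in the appendix; the real column list may be longer.

```python
from itertools import combinations


def generate_queries(columns, table):
    """Yield one GROUP BY query per non-empty subset of columns."""
    for size in range(1, len(columns) + 1):
        for subset in combinations(columns, size):
            cols = ", ".join(subset)
            # The outer COUNT(1) keeps the result set to a single row,
            # so network traffic stays minimal.
            yield (
                "SELECT current_timestamp, COUNT(1) FROM "
                f"(SELECT {cols}, COUNT(1) AS _A_ROW_COUNT "
                f"FROM {table} GROUP BY {cols}) DER"
            )


# Assumed column list, copied from the sample plan's query.
queries = list(generate_queries(
    ["GREGORIAN_DATE", "WM_MONTH", "MON_MULT", "FRI_MULT"],
    "AR_WM_VM.CALENDAR_DAY",
))
```

With n columns this produces 2^n - 1 queries, so a wide table may need a cap on the subset size.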
Background/bias: I have worked with many other MPP databases.
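Results:
- Queries run in 10-20 seconds, which is longer than I expected for such simple work.
- I sleep between submitting each new thread. The initial queries run faster, and the average run time degrades significantly as the number of threads increases to 35.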
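For reference, the staggered ramp-up described above could be reproduced with a harness along these lines. It is a minimal sketch, not the harness actually used: `run_load_test`, `average_seconds`, and the `run_query` callable are hypothetical names, and `run_query` stands in for whatever actually submits the SQL (for example a sqlcmd or ODBC call).

```python
import threading
import time


def run_load_test(run_query, queries, n_threads=35, stagger_s=1.0):
    """Start n_threads workers with a pause between starts.

    Returns a list of (thread_index, query, elapsed_seconds) tuples.
    """
    timings = []
    lock = threading.Lock()

    def worker(idx):
        # Each worker takes every n_threads-th query, so no query runs twice.
        for query in queries[idx::n_threads]:
            start = time.perf_counter()
            run_query(query)  # assumed to block until the query returns
            elapsed = time.perf_counter() - start
            with lock:
                timings.append((idx, query, elapsed))

    threads = []
    for idx in range(n_threads):
        t = threading.Thread(target=worker, args=(idx,))
        t.start()
        threads.append(t)
        time.sleep(stagger_s)  # ramp up gradually instead of all at once
    for t in threads:
        t.join()
    return timings


def average_seconds(timings):
    """Mean elapsed time across all recorded queries."""
    return sum(elapsed for _, _, elapsed in timings) / len(timings)
```

Recording the thread index alongside each timing makes it possible to compare run times from the early, lightly loaded phase with those measured once all threads are active.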
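How can I find out which bottlenecks exist?
Of course, I will rerun the test at other DWU settings to see whether that affects a workload made up entirely of small_rc queries.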
Appendix - Sample query plan
<?xml version="1.0" encoding="utf-8"?>
<dsql_query number_nodes="10" number_distributions="60" number_distributions_per_node="6">
<sql>SELECT current_timestamp,COUNT(1) FROM (SELECT GREGORIAN_DATE, WM_MONTH, MON_MULT, FRI_MULT , COUNT(1) AS _A_ROW_COUNT FROM AR_WM_VM.CALENDAR_DAY GROUP BY GREGORIAN_DATE, WM_MONTH, MON_MULT, FRI_MULT) DER</sql>
<dsql_operations total_cost="0.260568" total_number_operations="8">
<dsql_operation operation_type="RND_ID">
<identifier>TEMP_ID_21523</identifier>
</dsql_operation>
<dsql_operation operation_type="ON">
<location permanent="false" distribution="AllDistributions" />
<sql_operations>
<sql_operation type="statement">CREATE TABLE [tempdb].[dbo].[TEMP_ID_21523] ([col] DATE) WITH(DATA_COMPRESSION=PAGE);</sql_operation>
</sql_operations>
</dsql_operation>
<dsql_operation operation_type="SHUFFLE_MOVE">
<operation_cost cost="0.258648" accumulative_cost="0.258648" average_rowsize="3" output_rows="2155.4" GroupNumber="76" />
<source_statement>SELECT [T1_1].[col] AS [col]
FROM (SELECT dateadd(dd, CAST ((364) AS INT), [T2_1].[calendar_date]) AS [col]
FROM [db_ARdev1].[AR_CORE_DIM_TABLES].[calendar_dim] AS T2_1) AS T1_1</source_statement>
<destination_table>[TEMP_ID_21523]</destination_table>
<shuffle_columns>col;</shuffle_columns>
</dsql_operation>
<dsql_operation operation_type="ON">
<location permanent="false" distribution="Control" />
<sql_operations>
<sql_operation type="statement">CREATE TABLE [tempdb].[QTables].[QTable_3ff26b5253004eec9d9ca50492bab1e2] ([col] BIGINT) WITH(DATA_COMPRESSION=PAGE);</sql_operation>
</sql_operations>
</dsql_operation>
<dsql_operation operation_type="PARTITION_MOVE">
<operation_cost cost="0.00192" accumulative_cost="0.260568" average_rowsize="8" output_rows="1" GroupNumber="93" />
<location distribution="AllDistributions" />
<source_statement>SELECT [T1_1].[col] AS [col]
FROM (SELECT COUNT_BIG(CAST ((1) AS INT)) AS [col]
FROM (SELECT 0 AS [col]
FROM [tempdb].[dbo].[TEMP_ID_21523] AS T3_1
INNER JOIN
(SELECT CASE
WHEN ([T4_1].[wm_week_day_nbr] = CAST ((3) AS SMALLINT)) THEN CAST ((1) AS INT)
ELSE CAST ((0) AS INT)
END AS [col],
CASE
WHEN ([T4_1].[wm_week_day_nbr] = CAST ((7) AS SMALLINT)) THEN CAST ((1) AS INT)
ELSE CAST ((0) AS INT)
END AS [col1],
[T4_1].[calendar_date] AS [calendar_date],
[T4_1].[fiscal_month_nbr] AS [fiscal_month_nbr]
FROM [db_ARdev1].[AR_CORE_DIM_TABLES].[calendar_dim] AS T4_1) AS T3_2
ON ([T3_2].[calendar_date] = [T3_1].[col])
GROUP BY [T3_2].[calendar_date], [T3_2].[fiscal_month_nbr], [T3_2].[col], [T3_2].[col1]) AS T2_1
GROUP BY [T2_1].[col]) AS T1_1</source_statement>
<destination>Control</destination>
<destination_table>[QTable_3ff26b5253004eec9d9ca50492bab1e2]</destination_table>
</dsql_operation>
<dsql_operation operation_type="ON">
<location permanent="false" distribution="AllDistributions" />
<sql_operations>
<sql_operation type="statement">DROP TABLE [tempdb].[dbo].[TEMP_ID_21523]</sql_operation>
</sql_operations>
</dsql_operation>
<dsql_operation operation_type="RETURN">
<location distribution="Control" />
<select>SELECT [T1_1].[col1] AS [col],
[T1_1].[col] AS [col1]
FROM (SELECT CONVERT (INT, [T2_1].[col], 0) AS [col],
isnull(CONVERT (DATETIME, N'2016-10-03 13:04:34.203', 0), CONVERT (DATETIME, N'2016-10-03 13:04:34.203', 0)) AS [col1]
FROM (SELECT ISNULL([T3_1].[col], CONVERT (BIGINT, 0, 0)) AS [col]
FROM (SELECT SUM([T4_1].[col]) AS [col]
FROM [tempdb].[QTables].[QTable_3ff26b5253004eec9d9ca50492bab1e2] AS T4_1) AS T3_1) AS T2_1) AS T1_1</select>
</dsql_operation>
<dsql_operation operation_type="ON">
<location permanent="false" distribution="Control" />
<sql_operations>
<sql_operation type="statement">DROP TABLE [tempdb].[QTables].[QTable_3ff26b5253004eec9d9ca50492bab1e2]</sql_operation>
</sql_operations>
</dsql_operation>
</dsql_operations>
</dsql_query>
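One way to read a plan like this is to list each step with its estimated cost. In the plan above, the SHUFFLE_MOVE step carries 0.258648 of the 0.260568 total cost, so data movement dominates the estimate. The sketch below pulls that summary out with Python's standard `xml.etree.ElementTree`; the function name `summarize_plan` is hypothetical, and `SAMPLE` is a trimmed copy of the plan above.

```python
import xml.etree.ElementTree as ET


def summarize_plan(plan_xml):
    """Return one dict per dsql_operation: type, estimated cost, output rows."""
    root = ET.fromstring(plan_xml)
    steps = []
    for op in root.iter("dsql_operation"):
        cost = op.find("operation_cost")  # absent on DDL-style steps
        steps.append({
            "type": op.get("operation_type"),
            "cost": float(cost.get("cost")) if cost is not None else None,
            "rows": float(cost.get("output_rows")) if cost is not None else None,
        })
    return steps


# Trimmed copy of the plan above, keeping only three of its eight steps.
SAMPLE = """<dsql_query number_nodes="10" number_distributions="60" number_distributions_per_node="6">
<dsql_operations total_cost="0.260568" total_number_operations="8">
<dsql_operation operation_type="RND_ID"><identifier>TEMP_ID_21523</identifier></dsql_operation>
<dsql_operation operation_type="SHUFFLE_MOVE">
<operation_cost cost="0.258648" accumulative_cost="0.258648" average_rowsize="3" output_rows="2155.4" GroupNumber="76" />
</dsql_operation>
<dsql_operation operation_type="PARTITION_MOVE">
<operation_cost cost="0.00192" accumulative_cost="0.260568" average_rowsize="8" output_rows="1" GroupNumber="93" />
</dsql_operation>
</dsql_operations>
</dsql_query>"""

for step in summarize_plan(SAMPLE):
    print(step["type"], step["cost"], step["rows"])
```

These are optimizer estimates, not measured times, so they indicate where to look rather than where the 10-20 seconds are actually spent.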
Can you put 'EXPLAIN' in front of the SQL query, run it, and post the XML query plan to this thread? – GregGalloway