2014-10-31

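XQuery plan complexity

I have a SQLCLR scalar function that returns an XmlReader that I need to shred into an inline result set. These XML objects are generated on demand, so I can't use an XML index. It will be common to have more than 100 columns in the resulting dataset. Consider the following sample code: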

CREATE XML SCHEMA COLLECTION RAB AS ' 
<xsd:schema xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified"> 
<xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd" /> 

    <xsd:element name="r" type="r"/> 

    <xsd:complexType name="r"> 
    <xsd:attribute name="a" type="sqltypes:int" use="required"/> 
    <xsd:attribute name="b" type="sqltypes:int" use="required"/> 
    <xsd:attribute name="c" type="sqltypes:int" use="required"/> 
    </xsd:complexType> 
</xsd:schema>'; 
GO 

DECLARE @D TABLE(x XML(DOCUMENT RAB) NOT NULL); 

INSERT INTO @D 
VALUES 
('<r a="3" b="4" c="34"/>'), 
('<r a="5" b="6" c="56"/>'), 
('<r a="7" b="8" c="78"/>') 

SELECT x.value('/r/@a', 'int') a, x.value('/r/@b', 'int') b, x.value('/r/@c', 'int') c 
FROM @D AS a;

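This populates a table variable that has a typed XML column with a few XML values, then breaks the attributes out into separate columns. The execution plan for this seems excessively messy: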

|--Compute Scalar(DEFINE:([Expr1009]=[Expr1008], [Expr1016]=[Expr1015], [Expr1023]=[Expr1022])) 
    |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x])) 
     |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x])) 
      | |--Nested Loops(Inner Join, OUTER REFERENCES:([a].[x])) 
      | | |--Table Scan(OBJECT:(@d AS [a])) 
      | | |--Stream Aggregate(DEFINE:([Expr1008]=MIN([Expr1024]))) 
      | |   |--Compute Scalar(DEFINE:([Expr1024]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader wi 
      | |    |--Table-valued function 
      | |--Stream Aggregate(DEFINE:([Expr1015]=MIN([Expr1025]))) 
      |   |--Compute Scalar(DEFINE:([Expr1025]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader with XP 
      |    |--Table-valued function 
      |--Stream Aggregate(DEFINE:([Expr1022]=MIN([Expr1026]))) 
       |--Compute Scalar(DEFINE:([Expr1026]=CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(XML Reader with XPath filter.[value],XML Reader with XPath filter.[lvalue],XML Reader with XPath f 
        |--Table-valued function 

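It has a Nested Loops join for every column! If I join several of these tables together, the query plan is going to be far too complex. Also, I don't understand the purpose of these operators. Their contents look like this: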

MIN(
    CASE WHEN @d.[x] as [a].[x] IS NULL 
    THEN NULL ELSE 
    CASE WHEN datalength(CONVERT_IMPLICIT(sql_variant, 
      CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)) 
      ,0),0))>=(128) 
    THEN CONVERT_IMPLICIT(int, 
     CASE WHEN datalength(xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)))<(128) 
     THEN NULL 
     ELSE xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)) 
     END,0) 
     ELSE CONVERT_IMPLICIT(int, 
      CONVERT_IMPLICIT(sql_variant,CONVERT_IMPLICIT(nvarchar(64),xsd_cast_to_maybe_large(xrpf.[value],xrpf.[lvalue],xrpf.[lvaluebin],xrpf.[tid],(15),(7)),0),0),0) 
    END 
END) 

Yuck! I thought that using sqltypes types in the XML schema was supposed to avoid the conversions?

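Either I'm overestimating how efficient this would be, or I'm doing something wrong. My question is: how can I fix this so that I don't get extra query plan operators for every column, and ideally avoid the conversions as well? Or should I give up and find a non-XPath way to do this?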

References:

sqltypes: http://msdn.microsoft.com/en-us/library/ee320775%28v=sql.105%29.aspx

XML data type methods: http://technet.microsoft.com/en-us/library/ms190798%28v=sql.105%29.aspx

The way I usually do this is SELECT r.value('@a', 'int'), r.value('@b', 'int'), r.value('@c', 'int') FROM @D CROSS APPLY x.nodes('/r') AS ca(r), but that plan looks much the same shape (and the TVF nodes give it a higher estimated subtree cost). – 2014-11-01 12:06:13

Answer

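There are a few mysteries in the query plan that need to be sorted out first: what the Compute Scalar does, and why there is a Stream Aggregate.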

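The table-valued function returns a node table of the shredded XML, one row per shredded node. When you use typed XML, the columns are value, lvalue, lvaluebin and tid. Those columns are used in the Compute Scalar to calculate the actual value. The code in there looks a bit strange, and I can't say I understand why it is the way it is, but the gist of it is that the function xsd_cast_to_maybe_large returns the value, and there is code that handles the case where the value is 128 bytes or larger separately from the case where it is smaller.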

CASE WHEN datalength(
        CONVERT_IMPLICIT(sql_variant, 
         CONVERT_IMPLICIT(nvarchar(64), 
             xsd_cast_to_maybe_large(XML Reader with XPath filter.[value], 
                   XML Reader with XPath filter.[lvalue], 
                   XML Reader with XPath filter.[lvaluebin], 
                   XML Reader with XPath filter.[tid],(15),(5),(0)),0),0))>=(128) 
    THEN CONVERT_IMPLICIT(int,CASE WHEN datalength(xsd_cast_to_maybe_large(XML Reader with XPath filter.[value], 
                     XML Reader with XPath filter.[lvalue], 
                     XML Reader with XPath filter.[lvaluebin], 
                     XML Reader with XPath filter.[tid],(15),(5),(0)))<(128) 
           THEN NULL 
           ELSE xsd_cast_to_maybe_large(XML Reader with XPath filter.[value], 
                  XML Reader with XPath filter.[lvalue], 
                  XML Reader with XPath filter.[lvaluebin], 
                  XML Reader with XPath filter.[tid],(15),(5),(0)) 
          END,0) 
    ELSE CONVERT_IMPLICIT(int,CONVERT_IMPLICIT(sql_variant, 
               CONVERT_IMPLICIT(nvarchar(64), 
                   xsd_cast_to_maybe_large(XML Reader with XPath filter.[value], 
                         XML Reader with XPath filter.[lvalue], 
                         XML Reader with XPath filter.[lvaluebin], 
                         XML Reader with XPath filter.[tid],(15),(5),(0)),0),0),0) 
END 

The same Compute Scalar for untyped XML is much simpler and actually understandable.

CASE WHEN datalength(XML Reader with XPath filter.[value])>=(128) 
    THEN CONVERT_IMPLICIT(int,XML Reader with XPath filter.[lvalue],0) 
    ELSE CONVERT_IMPLICIT(int,XML Reader with XPath filter.[value],0) 
END 

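If value holds 128 bytes or more, the result is fetched from lvalue; otherwise it is fetched from value. In the case of untyped XML, the node table returned only outputs the columns id, value and lvalue.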

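When you use typed XML, the storage of a node value is optimized based on the data type specified in the schema. It looks like the value can end up in value, lvalue or lvaluebin in the node table, depending on what type it is, and xsd_cast_to_maybe_large is there to help sort that out.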

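The Stream Aggregate does a min() over the values returned from the Compute Scalar. We know, and SQL Server (at least sometimes) knows, that only one row will ever be returned from the table-valued function when you specify an XPath in the value() function. The parser makes sure that we build the XPath correctly, but when the query optimizer looks at the estimated rows, it sees 200 rows. The base estimate for the table-valued function that parses XML is 10,000 rows, which is then adjusted based on the XPath used. In this case it ends up at 200 rows where there is only one. Pure speculation on my part: the Stream Aggregate is there to take care of this discrepancy. It will never aggregate anything, it only passes on the one row that is returned, but it does affect the cardinality estimate for the entire branch and makes sure the optimizer uses 1 row as the estimate for that branch. That is of course really important when the optimizer picks join strategies and so on.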
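One way to look at the estimated versus actual row counts yourself is SET STATISTICS PROFILE, which returns the plan operators together with Rows and EstimateRows columns. This is only a minimal sketch on a one-row table variable of my own choosing, not the query from the question, and the numbers you see may differ between versions:

```sql
-- Minimal sketch: a single untyped XML value in a table variable.
DECLARE @D TABLE (x xml NOT NULL);

INSERT INTO @D (x)
VALUES ('<r a="3" b="4" c="34"/>');

-- Return the executed plan as an extra result set after each statement.
SET STATISTICS PROFILE ON;

SELECT x.value('(/r/@a)[1]', 'int') AS a
FROM @D;

SET STATISTICS PROFILE OFF;
```

In the extra result set, compare Rows with EstimateRows on the Table-valued function row and on the Stream Aggregate row above it.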

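So what about 100 attributes? Yes, there will be 100 branches if you use the value function 100 times. But there are some optimizations to be had here. I created a test rig to see which shape and form of the query would be fastest with 100 attributes over 10 rows.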

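Of the queries that call value() once per attribute, the winner was to use untyped XML and not shred on r with the nodes() function: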

select X.value('(/r/@a1)[1]', 'int') as a1, 
     X.value('(/r/@a2)[1]', 'int') as a2, 
     X.value('(/r/@a3)[1]', 'int') as a3 
from @T 
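For reference, here is a self-contained version of that shape with three attributes. The table variable and the sample values are made up for illustration. Note that with untyped XML the path has to be statically a singleton, which is why it is written as (/r/@a1)[1] rather than /r/@a1:

```sql
-- Untyped XML: no schema collection is bound to the column.
DECLARE @T TABLE (ID int IDENTITY PRIMARY KEY, X xml NOT NULL);

INSERT INTO @T (X)
VALUES ('<r a1="3" a2="4" a3="34"/>'),
       ('<r a1="5" a2="6" a3="56"/>');

-- One value() call per attribute, no nodes() call.
-- On untyped XML, value('/r/@a1', 'int') is rejected at compile time
-- because the path is not guaranteed to return a single node.
SELECT T.ID,
       T.X.value('(/r/@a1)[1]', 'int') AS a1,
       T.X.value('(/r/@a2)[1]', 'int') AS a2,
       T.X.value('(/r/@a3)[1]', 'int') AS a3
FROM @T AS T;
```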

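There is also a way to avoid the 100 branches by using pivot, but depending on what your actual query looks like, that might not be possible. The data type coming out of the pivot has to be the same for all columns. You could of course extract the values as strings and convert them to the appropriate types in the column list. It also requires that your table has a primary/unique key.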

select a1, a2, a3 
from (
    select T.ID, -- primary key of @T 
      A.X.value('local-name(.)', 'nvarchar(50)') as Name, 
      A.X.value('.', 'int') as Value 
    from @T as T 
     cross apply T.X.nodes('/r/@*') as A(X) 
    ) as T 
pivot(min(T.Value) for Name in (a1, a2, a3)) as P 
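If the attributes have different types, the "extract as strings and convert in the column list" variant could look something like this. It is a sketch only: the attribute names qty, label and created and the sample data are hypothetical, and nvarchar(100) is an arbitrary width that you would size for your own data:

```sql
DECLARE @T TABLE (ID int IDENTITY PRIMARY KEY, X xml NOT NULL);

INSERT INTO @T (X)
VALUES ('<r qty="3" label="alpha" created="2014-10-31"/>'),
       ('<r qty="5" label="beta" created="2014-11-01"/>');

SELECT P.ID,
       CAST(P.qty AS int)      AS qty,      -- convert back to the real type
       P.label                 AS label,    -- already a string
       CAST(P.created AS date) AS created
FROM (
    -- One row per attribute, every value extracted as a string
    -- so that the pivoted columns share a single data type.
    SELECT T.ID,
           A.X.value('local-name(.)', 'nvarchar(50)') AS Name,
           A.X.value('.', 'nvarchar(100)')            AS Value
    FROM @T AS T
        CROSS APPLY T.X.nodes('/r/@*') AS A(X)
) AS S
PIVOT (MIN(S.Value) FOR Name IN ([qty], [label], [created])) AS P;
```

The XML is still shredded only once per row; the casts happen on the pivoted columns.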

[Image: query plan for the pivot query, 10 rows with 100 attributes]

Below are the results and the test rig I used. I tested with 100 attributes and 10 rows, with all attributes of type int.

Results:

Test                                                Duration (ms)
--------------------------------------------------  -------------
untyped XML value('/r[1]/@a')                       195
untyped XML value('(/r/@a)[1]')                     108
untyped XML value('@a') cross apply nodes('/r')     131
untyped XML value('@a') cross apply nodes('/r[1]')  127
typed XML value('/r/@a')                            185
typed XML value('(/r/@a)[1]')                       148
typed XML value('@a') cross apply nodes('/r')       176
untyped XML pivot                                   34
typed XML pivot                                     52

Code:

drop type dbo.TRABType 
drop type dbo.TType; 
drop xml schema collection dbo.RAB; 

go 

declare @NumAtt int = 100; 
declare @Attribs nvarchar(max); 

with xmlnamespaces('http://www.w3.org/2001/XMLSchema' as xsd) 
select @Attribs = (
select top(@NumAtt) 'a'+cast(row_number() over(order by 1/0) as varchar(11)) as '@name', 

        'sqltypes:int' as '@type', 
        'required' as '@use' 
from sys.columns 
for xml path('xsd:attribute') 
) 
--CREATE XML SCHEMA COLLECTION RAB AS 

declare @Schema nvarchar(max) = 
' 
<xsd:schema xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sqltypes="http://schemas.microsoft.com/sqlserver/2004/sqltypes" elementFormDefault="qualified"> 
<xsd:import namespace="http://schemas.microsoft.com/sqlserver/2004/sqltypes" schemaLocation="http://schemas.microsoft.com/sqlserver/2004/sqltypes/sqltypes.xsd" /> 
    <xsd:element name="r" type="r"/> 
    <xsd:complexType name="r">[ATTRIBS]</xsd:complexType> 
</xsd:schema>'; 

set @Schema = replace(@Schema, '[ATTRIBS]', @Attribs) 

create xml schema collection RAB as @Schema 

go 

create type dbo.TType as table 
(
    ID int identity primary key, 
    X xml not null 
); 

go 

create type dbo.TRABType as table 
(
    ID int identity primary key, 
    X xml(document rab) not null 
); 


go 

declare @NumAtt int = 100; 
declare @NumRows int = 10; 

declare @X nvarchar(max); 
declare @C nvarchar(max); 
declare @M nvarchar(max); 

declare @S1 nvarchar(max); 
declare @S2 nvarchar(max); 
declare @S3 nvarchar(max); 
declare @S4 nvarchar(max); 
declare @S5 nvarchar(max); 
declare @S6 nvarchar(max); 
declare @S7 nvarchar(max); 
declare @S8 nvarchar(max); 
declare @S9 nvarchar(max); 

set @X = N'<r '+ 
    (
    select top(@NumAtt) 'a'+cast(row_number() over(order by 1/0) as varchar(11))+'="'+cast(row_number() over(order by 1/0) as varchar(11))+'" ' 
    from sys.columns 
    for xml path('') 
)+ 
'/>'; 

set @C = 
    stuff((
    select top(@NumAtt) ',a'+cast(row_number() over(order by 1/0) as varchar(11)) 
    from sys.columns 
    for xml path('') 
), 1, 1, '') 

set @M = 
    stuff((
    select top(@NumAtt) ',MAX(CASE WHEN name = ''a'+cast(row_number() over(order by 1/0) as varchar(11))+''' THEN val END)' 
    from sys.columns 
    for xml path('') 
), 1, 1, '') 


declare @T dbo.TType; 
insert into @T(X) 
select top(@NumRows) @X 
from sys.columns; 

declare @TRAB dbo.TRABType; 
insert into @TRAB(X) 
select top(@NumRows) @X 
from sys.columns; 


-- value('/r[1]/@a') 
set @S1 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T.X.value(''/r[1]/@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @T as T 
option (maxdop 1)'; 

-- value('(/r/@a)[1]') 
set @S2 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T.X.value(''(/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+')[1]'', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @T as T 
option (maxdop 1)'; 

-- value('@a') cross apply nodes('/r') 
set @S3 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @T as T 
    cross apply T.X.nodes(''/r'') as T2(X) 
option (maxdop 1)'; 


-- value('@a') cross apply nodes('/r[1]') 
set @S4 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @T as T 
    cross apply T.X.nodes(''/r[1]'') as T2(X) 
option (maxdop 1)'; 

-- value('/r/@a') typed XML 
set @S5 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T.X.value(''/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @TRAB as T 
option (maxdop 1)'; 

-- value('(/r/@a)[1]') 
set @S6 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T.X.value(''(/r/@a'+cast(row_number() over(order by 1/0) as varchar(11))+')[1]'', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @TRAB as T 
option (maxdop 1)'; 

-- value('@a') cross apply nodes('/r') typed XML 
set @S7 = N' 
select T.ID'+ 
(
select top(@NumAtt) ', T2.X.value(''@a'+cast(row_number() over(order by 1/0) as varchar(11))+''', ''int'')' 
from sys.columns 
for xml path('') 
)+ 
' from @TRAB as T 
    cross apply T.X.nodes(''/r'') as T2(X) 
option (maxdop 1)'; 

-- pivot 
set @S8 = N' 
select ID, '+@C+'
from (
    select T.ID, 
      A.X.value(''local-name(.)'', ''nvarchar(50)'') as Name, 
      A.X.value(''.'', ''int'') as Value 
    from @T as T 
     cross apply T.X.nodes(''/r/@*'') as A(X) 
    ) as T 
pivot(min(T.Value) for Name in ('+@C+')) as P
option (maxdop 1)'; 

-- typed pivot 
set @S9 = N' 
select ID, '+@C+'
from (
    select T.ID, 
      A.X.value(''local-name(.)'', ''nvarchar(50)'') as Name, 
      cast(cast(A.X.query(''string(.)'') as varchar(11)) as int) as Value 
    from @TRAB as T 
     cross apply T.X.nodes(''/r/@*'') as A(X) 
    ) as T 
pivot(min(T.Value) for Name in ('+@C+')) as P
option (maxdop 1)'; 


exec sp_executesql @S1, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S2, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S3, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S4, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S5, N'@TRAB dbo.TRABType readonly', @TRAB; 
exec sp_executesql @S6, N'@TRAB dbo.TRABType readonly', @TRAB; 
exec sp_executesql @S7, N'@TRAB dbo.TRABType readonly', @TRAB; 
exec sp_executesql @S8, N'@T dbo.TType readonly', @T; 
exec sp_executesql @S9, N'@TRAB dbo.TRABType readonly', @TRAB; 
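
The rig itself doesn't capture durations. One simple way to time a single query, and this is an assumption rather than necessarily how the numbers above were obtained, is to take sysdatetime() before and after it. A self-contained sketch with three attributes and made-up data:

```sql
declare @T table (ID int identity primary key, X xml not null);

insert into @T (X)
values ('<r a1="1" a2="2" a3="3"/>'),
       ('<r a1="4" a2="5" a3="6"/>');

-- Timestamp taken just before the query under test.
declare @Start datetime2(7) = sysdatetime();

select T.ID,
       T.X.value('(/r/@a1)[1]', 'int') as a1,
       T.X.value('(/r/@a2)[1]', 'int') as a2,
       T.X.value('(/r/@a3)[1]', 'int') as a3
from @T as T
option (maxdop 1);

-- Elapsed wall-clock time for the query above, in milliseconds.
select datediff(millisecond, @Start, sysdatetime()) as DurationMs;
```

For a fair comparison you would run each variant several times and ignore the first run, since that one includes compilation.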

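+1 for the detailed and informative profiling of the alternatives! I had already considered the PIVOT option, possibly generating the key/name/value tuples directly in the CLR. I was hoping that using XML would be more efficient. Great answer! – 2014-11-03 14:30:41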