2016-10-03 42 views
0

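SQL Server: parsing (malformed) XML that is missing attribute quotes

Is there any way in SQL Server / Transact-SQL to parse (malformed) XML that is missing the quotes around attribute values, such as: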

SELECT CAST('<test A=B />' AS XML) 

The above fails with:

XML parsing: line 1, character 9, A string literal was expected

whereas parsing the following succeeds:

SELECT CAST('<test A="B" />' AS XML) 
+1

MSSQL expects valid XML. You need to make the XML valid first. I haven't seen a way to do this in pure SQL, but if you can use Java or C#... http://stackoverflow.com/questions/20125891/ – montewhizdoh

+0

https://github.com/MindTouch/SGMLReader – montewhizdoh

Answers

1

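The right way to solve this is to fix the XML at the source so that attribute values are quoted. If, for whatever reason, that is not possible, you can repair the XML with a string-split function and basic string manipulation. This approach assumes relatively simple XML and may not work well for large or complex XML strings.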

First, you will need to create a string-split function. There are plenty of examples around, but I have included one below for completeness:

CREATE FUNCTION [dbo].[SplitString]
(
    @string NVARCHAR(MAX),
    @delimiter CHAR(1)
)
RETURNS @output TABLE (splitdata NVARCHAR(MAX))
AS
BEGIN
    DECLARE @start INT, @end INT
    -- @start is the first character of the current piece; @end is the position of the next delimiter.
    SELECT @start = 1, @end = CHARINDEX(@delimiter, @string)
    WHILE @start < LEN(@string) + 1 BEGIN
        -- No further delimiter: the last piece runs to the end of the string.
        IF @end = 0
            SET @end = LEN(@string) + 1

        INSERT INTO @output (splitdata)
        VALUES (SUBSTRING(@string, @start, @end - @start))
        -- Move past the delimiter and look for the next one.
        SET @start = @end + 1
        SET @end = CHARINDEX(@delimiter, @string, @start)
    END
    RETURN
END

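Next, split the XML string into rows at each occurrence of the assignment operator "=". Then use a pattern-matching function to find the value assigned to each attribute and replace it with a quoted version of the value. The final step in the query concatenates the split XML string back into a single XML string.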

DECLARE @malformedXmlString NVARCHAR(MAX) = '<test A=B width = 1000 height= 800 priority =high name="fred" />' --'<test A=BCD>DATA</test>'

DECLARE @xmlSplit TABLE 
(
    ID INT IDENTITY 
    ,splitdata NVARCHAR(MAX) 
) 

INSERT INTO @xmlSplit 
(
    splitdata 
) 
SELECT LTRIM(RTRIM(splitdata)) AS splitdata 
FROM [dbo].[SplitString](@malformedXmlString, '=') 



UPDATE @xmlSplit 
SET  splitdata = UpdatedXml.splitdata 
FROM @xmlSplit OrginalXml 
INNER JOIN (
       SELECT ID 
         -- Use the PATINDEX function to determine the position in the string where the attribute values end. Replace value with quoted version of value. 
         ,REPLACE(splitdata 
           ,LTRIM(RTRIM(LEFT(splitdata, PATINDEX('%[ />]%', splitdata) -1))) 
           ,'"' + LTRIM(RTRIM(LEFT(splitdata, PATINDEX('%[ />]%', splitdata) -1))) + '"') AS splitdata 
       FROM @xmlSplit 
       WHERE splitdata LIKE '[a-zA-Z0-9]%[ />]%' -- Only return occurrences of string which start with an alpha numeric character and ends with a space, ‘/’ or ‘>’ character. This should be your value of the attribute we split the string on. 
      ) UpdatedXml ON OrginalXml.ID = UpdatedXml.ID 


DECLARE @xmlString NVARCHAR(MAX); 

SELECT @xmlString = COALESCE(@xmlString + '=', '') + CONVERT(NVARCHAR(MAX), splitdata) 
FROM @xmlSplit 

SELECT CAST(@xmlString AS XML) 
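
Two details of the query above are worth hedging against. REPLACE rewrites every occurrence of the value inside a fragment, so a fragment such as `a name` would also have the `a` in `name` quoted, and the variable concatenation does not specify a row order. The following is a minimal sketch of one alternative, not the answer's original method: it swaps in STUFF to quote only the leading value and STRING_AGG (assumes SQL Server 2017 or later) to rejoin the rows in ID order. The sample fragments are hypothetical and stand in for the trimmed output of the split step.

```sql
-- Sketch only: fragments as they would look after splitting on '=' and trimming.
DECLARE @xmlSplit TABLE (ID INT PRIMARY KEY, splitdata NVARCHAR(MAX));

INSERT INTO @xmlSplit (ID, splitdata)
VALUES (1, N'<test A'), (2, N'a name'), (3, N'"fred" />');

-- Quote only the leading value of each fragment. STUFF replaces the first
-- (PATINDEX - 1) characters and leaves the rest of the fragment untouched.
UPDATE @xmlSplit
SET splitdata = STUFF(splitdata, 1, PATINDEX('%[ />]%', splitdata) - 1,
                      N'"' + LEFT(splitdata, PATINDEX('%[ />]%', splitdata) - 1) + N'"')
WHERE splitdata LIKE '[a-zA-Z0-9]%[ />]%';

-- Rejoin the fragments with '=' in an explicit order (SQL Server 2017+).
DECLARE @xmlString NVARCHAR(MAX);

SELECT @xmlString = STRING_AGG(splitdata, N'=') WITHIN GROUP (ORDER BY ID)
FROM @xmlSplit;

-- TRY_CAST yields NULL rather than an error if the repair was not enough.
SELECT @xmlString AS repairedString, TRY_CAST(@xmlString AS XML) AS repairedXml;
```

With these sample rows the repaired string is `<test A="a" name="fred" />`. The same limits as the original approach still apply: values containing spaces, "=" or ">" are not handled.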
1

Your assumption is incorrect. You cannot parse it as XML because it is not XML. If you read the XML specification, section 2.3 Common Syntactic Constructs, you will see:

AttValue  ::= '"' ([^<&"] | Reference)* '"' 
       | "'" ([^<&'] | Reference)* "'" 

Attribute values must be enclosed in either " or '.
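
As a quick check of that grammar, the following sketch casts the same element with double-quoted, single-quoted, and unquoted attribute values. It assumes SQL Server 2012 or later for TRY_CAST, which should return NULL for the unquoted form instead of raising the parse error; the column aliases are illustrative only.

```sql
SELECT CAST('<test A="B" />' AS XML)   AS doubleQuoted,  -- matches the first AttValue alternative
       CAST('<test A=''B'' />' AS XML) AS singleQuoted,  -- matches the second ('' escapes ' inside a T-SQL string)
       TRY_CAST('<test A=B />' AS XML) AS unquoted;      -- matches neither, so the cast is expected to yield NULL
```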