2009-05-18

A possible (working) solution:

Private Sub ReadXMLAttributes(ByVal oXML As String) 
    ReadXMLAttributes(oXML, "mso-infoPathSolution") 
End Sub 
Private Sub ReadXMLAttributes(ByVal oXML As String, ByVal oTagName As String) 
    Try 
     Dim XmlDoc As New Xml.XmlDocument 
     XmlDoc.LoadXml(oXML) 
     oFileInfo = New InfoPathDocument 
     Dim XmlNodes As Xml.XmlNodeList = XmlDoc.GetElementsByTagName(oTagName) 
     For Each xNode As Xml.XmlNode In XmlNodes 
      With xNode 
       oFileInfo.SolutionVersion = .Attributes(InfoPathSolution.solutionVersion).Value 
       oFileInfo.ProductVersion = .Attributes(InfoPathSolution.productVersion).Value 
       oFileInfo.PIVersion = .Attributes(InfoPathSolution.PIVersion).Value 
       oFileInfo.href = .Attributes(InfoPathSolution.href).Value 
       oFileInfo.name = .Attributes(InfoPathSolution.name).Value 
      End With 
     Next 
    Catch ex As Exception 
     MsgBox(ex.Message, MsgBoxStyle.OkOnly, "ReadXMLAttributes") 
    End Try 
End Sub 

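This works, but it still suffers from the problem described below if the attributes are reordered. The only way I can think of to avoid that is to hard-code the attribute names into my program and have it handle the entry by looping through the parsed tag and searching for the named attributes.

Reading XML tag information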
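The loop described above might look like the following minimal sketch. It assumes that `InfoPathSolution` in the code above is an enum of positional indexes (it is not shown in the post), which would explain why reordering breaks it. The module name and the sample input are made up for illustration.

```vbnet
Imports System
Imports System.Xml

Module AttributeLoopSketch
    Sub Main()
        ' Hypothetical input: the attributes are deliberately out of their usual order.
        Dim doc As New XmlDocument()
        doc.LoadXml("<mso-infoPathSolution productVersion=""12.0.0"" solutionVersion=""1.0.0.2"" />")

        Dim solutionVersion As String = String.Empty
        Dim productVersion As String = String.Empty

        ' Walk every attribute and dispatch on its name rather than on its index.
        For Each attr As XmlAttribute In doc.DocumentElement.Attributes
            Select Case attr.Name
                Case "solutionVersion"
                    solutionVersion = attr.Value
                Case "productVersion"
                    productVersion = attr.Value
            End Select
        Next

        Console.WriteLine(solutionVersion)
        Console.WriteLine(productVersion)
    End Sub
End Module
```

The attribute collection can also be indexed by name directly, for example `.Attributes("solutionVersion")`, which returns Nothing when the attribute is missing, so the explicit loop is only one way to do it.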

Note: InfoPathDocument is a custom class I made; it is nothing complicated:

Public Class InfoPathDocument 
    Private _sVersion As String 
    Private _pVersion As String 
    Private _piVersion As String 
    Private _href As String 
    Private _name As String 
    Public Property SolutionVersion() As String 
     Get 
      Return _sVersion 
     End Get 
     Set(ByVal value As String) 
      _sVersion = value 
     End Set 
    End Property 
    Public Property ProductVersion() As String 
     Get 
      Return _pVersion 
     End Get 
     Set(ByVal value As String) 
      _pVersion = value 
     End Set 
    End Property 
    Public Property PIVersion() As String 
     Get 
      Return _piVersion 
     End Get 
     Set(ByVal value As String) 
      _piVersion = value 
     End Set 
    End Property 
    Public Property href() As String 
     Get 
      Return _href 
     End Get 
     Set(ByVal value As String) 
      If value.ToLower.StartsWith("file:///") Then 
       value = value.Substring(8) 
      End If 
      _href = Form1.PathToUNC(URLDecode(value)) 
     End Set 
    End Property 
    Public Property name() As String 
     Get 
      Return _name 
     End Get 
     Set(ByVal value As String) 
      _name = value 
     End Set 
    End Property 
    Sub New() 

    End Sub 
    Sub New(ByVal oSolutionVersion As String, ByVal oProductVersion As String, ByVal oPIVersion As String, ByVal oHref As String, ByVal oName As String) 
     SolutionVersion = oSolutionVersion 
     ProductVersion = oProductVersion 
     PIVersion = oPIVersion 
     href = oHref 
     name = oName 
    End Sub 
    Public Function URLDecode(ByVal StringToDecode As String) As String 
     Dim TempAns As String = String.Empty 
     Dim CurChr As Integer = 1 
     Dim oRet As String = String.Empty 
     Try 
      Do Until CurChr - 1 = Len(StringToDecode) 
       Select Case Mid(StringToDecode, CurChr, 1) 
        Case "+" 
         oRet &= " " 
        Case "%" 
         oRet &= Chr(Val("&h" & Mid(StringToDecode, CurChr + 1, 2))) 
         CurChr = CurChr + 2 
        Case Else 
         oRet &= Mid(StringToDecode, CurChr, 1) 
       End Select 
       CurChr += 1 
      Loop 
     Catch ex As Exception 
      MsgBox(ex.Message, MsgBoxStyle.OkOnly, "URLDecode") 
     End Try 
     Return oRet 
    End Function 
End Class 

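Original question

I am working on a project that requires reading an XML document, specifically a form saved from Microsoft InfoPath.

Here is a simple example of what I will be working with, along with some background information that may help: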

<?xml version="1.0" encoding="UTF-8"?> 
<?mso-infoPathSolution solutionVersion="1.0.0.2" productVersion="12.0.0" PIVersion="1.0.0.0" href="file:///C:\Users\darren\Desktop\simple_form.xsn" name="urn:schemas-microsoft-com:office:infopath:simple-form:-myXSD-2009-05-15T14-16-37" ?> 
<?mso-application progid="InfoPath.Document" versionProgid="InfoPath.Document.2"?> 
<my:myFields xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2009-05-15T14:16:37" xml:lang="en-us"> 
    <my:first_name>John</my:first_name> 
    <my:last_name>Doe</my:last_name> 
</my:myFields> 

My goal right now is to extract the version ID and the location of the form. That is easy enough with a regular expression:

Dim _doc As New XmlDocument 
_doc.Load(_thefile) 
Dim oRegex As String = "^solutionVersion=""(?<sVersion>[0-9.]*)"" productVersion=""(?<pVersion>[0-9.]*)"" PIVersion=""(?<piVersion>[0-9.]*)"" href=""(?<href>.*)"" name=""(?<name>.*)""$" 
Dim rx As New Regex(oRegex), m As Match = Nothing 
For Each section As XmlNode In _doc.ChildNodes 
    m = rx.Match(section.InnerText.Trim) 
    If m.Success Then 
     Dim temp As String = m.Groups("name").Value.Substring(m.Groups("name").Value.ToLower.IndexOf("infopath") + ("infopath").Length + 1) 
     fileName = temp.Substring(0, temp.LastIndexOf(":")) 
     fileVersion = m.Groups("sVersion").Value 
    End If 
Next 

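The only problem with this working solution arises if the schema in the InfoPath file header changes, for example if the solutionVersion and productVersion attributes swap positions (Microsoft seems to like doing things like this).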
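For comparison, a regex can be made insensitive to attribute order by matching one name="value" pair at a time instead of the whole header. This is only a sketch under assumptions: the sample string and module name are hypothetical, and it assumes values never contain a double quote.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module PairRegexSketch
    Sub Main()
        ' Hypothetical header text with two attributes swapped.
        Dim data As String = "productVersion=""12.0.0"" solutionVersion=""1.0.0.2"" PIVersion=""1.0.0.0"""

        ' Match a single name="value" pair; the order of the pairs no longer matters.
        Dim rx As New Regex("(?<key>\w+)=""(?<value>[^""]*)""")
        Dim values As New Dictionary(Of String, String)
        For Each m As Match In rx.Matches(data)
            values(m.Groups("key").Value) = m.Groups("value").Value
        Next

        Console.WriteLine(values("solutionVersion"))
    End Sub
End Module
```

This keeps the regex approach but does not validate the header as XML, so an XML-based approach may still be the safer choice.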

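So I decided to try VB.NET's XML parsing capabilities to achieve the same result sans regex.

The child node of the _doc object that contains the information I need does not have any child nodes of its own: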

_doc.ChildNodes(1).HasChildNodes = False 

Can anyone help me with this?

Answers

1

Processing instructions are part of the XML document, but their attributes are not parsed. Try this code:

// Load the original xml... 
var xml = new XmlDocument(); 
xml.Load(_thefile); 

// Select out the processing instruction... 
var infopathProcessingInstruction = xml.SelectSingleNode("/processing-instruction()[local-name(.) = \"mso-infoPathSolution\"]"); 

// Since the processing instruction does not expose its attributes, create a new XML document... 
var xmlInfoPath = new XmlDocument(); 
xmlInfoPath.LoadXml("<data " + infopathProcessingInstruction.InnerText + " />"); 

// Get the data... 
var solutionVersion = xmlInfoPath.DocumentElement.GetAttribute("solutionVersion"); 
var productVersion = xmlInfoPath.DocumentElement.GetAttribute("productVersion"); 

Awesome, thank you! – Anders 2009-05-19 14:28:48

0

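The problem is that the tags you want to parse are not actually part of the XML document. They belong to the XML prolog, which contains processing instructions, so they are not available as elements in the XmlDocument.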
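To see how such a node shows up in an XmlDocument, the sketch below lists the top-level nodes of a shortened, hypothetical stand-in for the form and prints what a processing instruction exposes. The module name and input string are assumptions for illustration.

```vbnet
Imports System
Imports System.Xml

Module InspectPrologSketch
    Sub Main()
        ' Shortened, hypothetical stand-in for the saved InfoPath form.
        Dim doc As New XmlDocument()
        doc.LoadXml("<?xml version=""1.0""?><?mso-infoPathSolution solutionVersion=""1.0.0.2"" productVersion=""12.0.0""?><root/>")

        For Each node As XmlNode In doc.ChildNodes
            If node.NodeType = XmlNodeType.ProcessingInstruction Then
                Dim instruction As XmlProcessingInstruction = DirectCast(node, XmlProcessingInstruction)
                ' Target is the instruction name; Data is the raw, unparsed text after it.
                Console.WriteLine(instruction.Target)
                Console.WriteLine(instruction.Data)
                ' A processing instruction has no child nodes and no attribute collection.
                Console.WriteLine(instruction.HasChildNodes)
            End If
        Next
    End Sub
End Module
```

The pseudo-attributes are available only as one string in `Data`, which is why they have to be parsed separately.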

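My only idea (besides checking the documentation for a way to access these elements) is to move just the mso-infoPathSolution element into an XmlDocument of its own, after stripping the <? ?> and replacing them with < />. You could then access the attributes regardless of their order.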
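A minimal VB.NET sketch of that idea follows. Rather than editing OuterXml, it builds a new element from the instruction's `Data` string. The wrapper element name "data", the helper `ParseInstruction`, and the sample input are all hypothetical, and the sketch assumes the instruction's text is well-formed as XML attributes.

```vbnet
Imports System
Imports System.Xml

Module WrapInstructionSketch
    ' Rebuilds the instruction's data as the attributes of a throwaway element.
    ' The element name "data" is arbitrary.
    Function ParseInstruction(ByVal instruction As XmlProcessingInstruction) As XmlElement
        Dim wrapper As New XmlDocument()
        wrapper.LoadXml("<data " & instruction.Data & " />")
        Return wrapper.DocumentElement
    End Function

    Sub Main()
        ' Hypothetical input with the attributes in a non-default order.
        Dim doc As New XmlDocument()
        doc.LoadXml("<?mso-infoPathSolution productVersion=""12.0.0"" solutionVersion=""1.0.0.2""?><root/>")

        ' Select the top-level processing instruction by its target name.
        Dim node As XmlNode = doc.SelectSingleNode("/processing-instruction('mso-infoPathSolution')")
        If node IsNot Nothing Then
            Dim attrs As XmlElement = ParseInstruction(DirectCast(node, XmlProcessingInstruction))
            Console.WriteLine(attrs.GetAttribute("solutionVersion"))
            Console.WriteLine(attrs.GetAttribute("productVersion"))
        End If
    End Sub
End Module
```

`GetAttribute` looks values up by name and returns an empty string for a missing attribute, so the order in the header no longer matters. If the instruction text is not valid attribute syntax, `LoadXml` throws an XmlException, which a caller may want to catch.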


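Any idea how to get this particular node into a new XmlDocument? I am relatively new to XML parsing and manipulation. Currently I am trying to modify the OuterXml of newNode, but it is ReadOnly, so the quest continues! – Anders 2009-05-18 17:07:07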