2012-07-20 96 views

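I have two huge 1 GB XML files with the same structure, and I am trying to merge them. The script uses XmlTextReader and XmlTextWriter. It works fine, except that it copies the namespace declaration onto multiple nodes. I have read many blogs and documents but have not found a suitable solution. Any ideas or help would be really appreciated. How can I remove the duplicate namespace attributes from nodes in the XML?

For testing purposes, I am just reading the XML below and writing it to a new XML file. In the output file, the title node has an extra namespace declaration that I do not want.

Below is my sample XML file.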

<?xml version="1.0" encoding="utf-8"?> 
<records xmnls:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="sample.xsd"> 
<record category="xyz" editor="" entered="sdsd" sub-category="sds" uid="ds" updated="sd-07-15"> 
    <person ssn="" e-i="M"> 
     <title xsi:nil="true"/> 
     <position>abcd</position> 
     <names> 
     <first_name>xyz</first_name> 
     <last_name>xyz</last_name> 
     </names> 
    </person> 
</record> 
<record category="xyz" editor="" entered="sdsd" sub-category="sds" uid="ds" updated="sd-07-15"> 
    <person ssn="" e-i="M"> 
     <title xsi:nil="true"/> 
     <position>abcd</position> 
     <names> 
     <first_name>xyz</first_name> 
     <last_name>xyz</last_name> 
     </names> 
    </person> 
</record> 
</records> 

My code is as follows:

    Public Sub Main()
        Dim DownloadPEPLocation As String = Dts.Variables("xyz").Value
        Dim ACTIMIZESource As String = Dts.Variables("ACTIMIZESource").Value
        Dim PEPTextReader As Xml.XmlTextReader
        Dim Destination As Xml.XmlTextWriter

        ' Writer for the merged output file
        Destination = New Xml.XmlTextWriter(ACTIMIZESource, System.Text.Encoding.UTF8)
        Destination.Formatting = Formatting.Indented
        Destination.Namespaces = True

        ' Reader for the source file
        PEPTextReader = New XmlTextReader(DownloadPEPLocation)
        PEPTextReader.WhitespaceHandling = WhitespaceHandling.None

        ' Write the root element and its attributes by hand
        Destination.WriteStartDocument()
        Destination.WriteStartElement("records")

        Destination.WriteAttributeString("xmnls:xsi", "http://www.w3.org/2001/XMLSchema-instance")
        Destination.WriteAttributeString("xsi:noNamespaceSchemaLocation", "world-check.xsd")

        ' Copy every <record> element; WriteNode advances the reader past the element it copies
        Dim PEPreading As Boolean = PEPTextReader.Read()
        Do While (PEPreading)
            If (PEPTextReader.NodeType = XmlNodeType.Element AndAlso PEPTextReader.LocalName = "record") Then
                Destination.WriteNode(PEPTextReader, True)
                Destination.Flush()
            Else
                PEPreading = PEPTextReader.Read()
            End If
        Loop

        Destination.WriteEndElement()
        Destination.WriteEndDocument()
        Destination.Close()
        PEPTextReader.Close()
    End Sub


The output looks like this:

<?xml version="1.0" encoding="utf-8"?> 
<records xmnls:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="sample.xsd"> 
<record category="xyz" editor="" entered="sdsd" sub-category="sds" uid="ds" updated="sd-07-15"> 
    <person ssn="" e-i="M"> 
     <title xsi:nil="true" **xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"** /> 
     <position>abcd</position> 
     <names> 
     <first_name>xyz</first_name> 
     <last_name>xyz</last_name> 
     </names> 
    </person> 
</record> 
<record category="xyz" editor="" entered="sdsd" sub-category="sds" uid="ds" updated="sd-07-15"> 
    <person ssn="" e-i="M"> 
     <title xsi:nil="true" **xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"** /> 
     <position>abcd</position> 
     <names> 
     <first_name>xyz</first_name> 
     <last_name>xyz</last_name> 
     </names> 
    </person> 
</record> 
</records> 


Answer


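@Tapan: Based on your input and output examples, it looks like two letters in the xmlns attribute on the <records> root element have been inadvertently transposed: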

<records xmnls:xsi="http://www.w3.org/2001/XMLSchema-instance" 
     ^^^^^ 

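The attribute reads xmnls instead of xmlns. As a result, the xsi namespace prefix is not being defined the way you think it is.

Try making this change in the input file to see whether the apparently redundant xsi attribute in the output file goes away.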
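
If correcting the spelling alone does not help, one possible cause (an assumption on my part, not something confirmed in this thread) is that the writer never learns that the xsi prefix is in scope. Passing "xmlns:xsi" as a plain attribute name writes the text but, as far as I know, does not register a namespace with the writer, so WriteNode may re-declare the prefix on each element that uses it. A minimal sketch using the prefix-aware WriteAttributeString overloads follows. The module name, the in-memory input string, and the XsiNs constant are made up for illustration and stand in for the real files:

```vb
Imports System.IO
Imports System.Xml

Module MergeSketch
    ' Hypothetical constant holding the XMLSchema-instance namespace URI.
    Const XsiNs As String = "http://www.w3.org/2001/XMLSchema-instance"

    Sub Main()
        ' Small in-memory input standing in for the large source file.
        Dim input As String =
            "<records xmlns:xsi=""" & XsiNs & """>" &
            "<record><title xsi:nil=""true""/></record>" &
            "</records>"

        Dim output As New StringWriter()
        Using reader As New XmlTextReader(New StringReader(input))
            Dim writer As New XmlTextWriter(output)
            writer.WriteStartElement("records")

            ' Declare the prefix with the (prefix, localName, ns, value) overload
            ' so the writer tracks xsi as an in-scope namespace.
            writer.WriteAttributeString("xmlns", "xsi", Nothing, XsiNs)
            ' Write the xsi-qualified attribute with its namespace URI as well.
            writer.WriteAttributeString("xsi", "noNamespaceSchemaLocation", XsiNs, "sample.xsd")

            ' Same copy loop as in the question: WriteNode advances the reader itself.
            Dim more As Boolean = reader.Read()
            While more
                If reader.NodeType = XmlNodeType.Element AndAlso reader.LocalName = "record" Then
                    writer.WriteNode(reader, True)
                Else
                    more = reader.Read()
                End If
            End While

            writer.WriteEndElement()
            writer.Close()
        End Using

        Console.WriteLine(output.ToString())
    End Sub
End Module
```

With the prefix declared this way on the root, the copied title elements should no longer need their own xmlns:xsi declaration. The same two WriteAttributeString calls could replace the string-name versions in the original Main.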


Thanks for the update. I made the change, but it still does not work. Any idea how to remove the duplicate namespace? – 2012-07-22 03:46:23
