2015-05-14

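How to filter a flat file source using a Script Component

I have the following scenario: I have thousands of text files in the format shown below. The column names are each written on a separate line, and the row values are separated by a pipe (|).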

START-OF-FILE 
PROGRAMNAME=getdata 
DATEFORMAT=yyyymmdd 

#Some Text 
#Some Text 
#Some Text 
#Some Text 
#Some Text 
START-OF-FIELDS 
Field1 
Field2 
Field3 
------ 
FieldN 
END-OF-FIELDS 
TIMESTARTED=Tue May 12 16:04:42 JST 2015 
START-OF-DATA 
Field1Value|Field2value|Field3Value|...|Field N Value 
Field1Value|Field2value|Field3Value|...|Field N Value 
------|...........|----|------- 
END-OF-DATA 
DATARECORDS=30747 
TIMEFINISHED=Tue May 12 16:11:53 JST 2015 
END-OF-FILE 

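I already have a corresponding SQL Server table that I can easily load the data into as the destination. Because I am new to SSIS, I have not been able to write a Script Component that filters the source text file so that it can be loaded into the SQL Server table.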

Thanks in advance!

Answers


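There are several ways to do this. If the file format never changes, the Flat File Connection Manager Editor has some useful properties. For example, you can add a new flat file connection to the connection managers and, for the file above, set the "Header rows to skip" property to 18. Reading will then start at the first pipe-delimited row. The value 18 matches the sample exactly as posted; the real number depends on how many field names are listed between START-OF-FIELDS and END-OF-FIELDS.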

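Another Flat File Connection Manager property that may be useful: open the Flat File Connection Manager, click Columns in the side menu, and set the column delimiter to the pipe character "|".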
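As a rough illustration of what those two settings amount to (skip a fixed number of lines, then split on "|"), here is a minimal Python sketch. It is not SSIS code; the function name `read_rows_after_header` and the shortened sample text are made up for this example.

```python
import csv

def read_rows_after_header(text, header_rows_to_skip, delimiter="|"):
    """Drop a fixed number of leading lines, then split the rest on the delimiter."""
    remaining = text.splitlines()[header_rows_to_skip:]
    return list(csv.reader(remaining, delimiter=delimiter))

# Shortened, hypothetical sample: two header lines, two data rows, two footer lines.
sample = "\n".join([
    "START-OF-FILE",
    "START-OF-DATA",
    "a|b|c",
    "d|e|f",
    "END-OF-DATA",
    "END-OF-FILE",
])

rows = read_rows_after_header(sample, header_rows_to_skip=2)
for row in rows:
    print(row)
```

In this sketch the two footer lines still come through as one-column rows, which shows that a fixed skip count deals with the header only; the trailing lines would need to be filtered out separately.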

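However, if the file format can change, for example if the number of header rows varies, you can use a Script Task to remove every line that is not pipe-delimited, such as the header and footer lines.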

For example, you can read the file with a method such as File.ReadAllLines, edit or remove lines as needed, and then save the file.

Information about the method is here: https://msdn.microsoft.com/en-us/library/s2tte0y1%28v=vs.110%29.aspx

For example, a Script Task that removes the last line:

// Requires: using System.IO; using System.Text;
string[] lines = File.ReadAllLines("input.txt"); // read every line into memory
StringBuilder sb = new StringBuilder();
int count = lines.Length - 1; // all except last line
for (int i = 0; i < count; i++)
{
    sb.AppendLine(lines[i]);
}
File.WriteAllText("output.txt", sb.ToString()); // write the trimmed copy
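For the file layout in the question, the same read, filter, and write idea could key on the START-OF-DATA and END-OF-DATA marker lines instead of on line counts, so a varying number of header rows would not matter. The following is a minimal Python sketch of that filtering logic, not a Script Task; the function name `extract_data_lines` and the shortened sample are assumptions made for illustration.

```python
def extract_data_lines(lines, start_marker="START-OF-DATA", end_marker="END-OF-DATA"):
    """Return only the lines that sit between the start and end marker lines."""
    inside = False
    kept = []
    for raw in lines:
        stripped = raw.strip()  # the sample lines carry trailing spaces
        if stripped == start_marker:
            inside = True
        elif stripped == end_marker:
            inside = False
        elif inside and stripped:
            kept.append(raw.rstrip("\r\n"))
    return kept

# Shortened, hypothetical sample in the layout from the question.
sample_lines = [
    "START-OF-FILE ",
    "#Some Text ",
    "START-OF-FIELDS ",
    "Field1 ",
    "END-OF-FIELDS ",
    "START-OF-DATA ",
    "a|b",
    "c|d",
    "END-OF-DATA ",
    "DATARECORDS=2 ",
    "END-OF-FILE ",
]

data_lines = extract_data_lines(sample_lines)
print(data_lines)
```

Writing `data_lines` back out would give a file that contains only pipe-delimited rows, which a plain flat file source with "|" as the column delimiter could then read.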

Use the following VB script in an SSIS Script Component configured as a source:


Imports System 
Imports System.Data 
Imports System.Math 
Imports System.IO 
Imports Microsoft.SqlServer.Dts.Runtime 
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper 
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper 



<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute()> _ 
<CLSCompliant(False)> _ 
Public Class ScriptMain 
    Inherits UserComponent 
    'Private strSourceDirectory As String 
    'Private strSourceFileName As String 
    Private strSourceSystem As String 
    Private strSourceSubSystem As String 
    Private dtBusinessDate As Date 


    Public Overrides Sub PreExecute() 
     MyBase.PreExecute() 
     ' 
     ' Add your code here for preprocessing or remove if not needed 
     '' 

    End Sub 

    Public Overrides Sub PostExecute() 
     MyBase.PostExecute() 
     ' 
     ' Add your code here for postprocessing or remove if not needed 
     ' You can set read/write variables here, for example: 
     Dim strSourceDirectory As String = Me.Variables.GLOBALSourceDirectory.ToString() 
     Dim strSourceFileName As String = Me.Variables.GLOBALSourceFileName.ToString() 
     'Dim strSourceSystem As String = Me.Variables.GLOBALSourceSystem.ToString() 
     'Dim strSourceSubSystem As String = Me.Variables.GLOBALSourceSubSystem.ToString() 
     'Dim dtBusinessDate As Date = Me.Variables.GLOBALBusinessDate.Date 


    End Sub 

    Public Overrides Sub CreateNewOutputRows() 
     ' 
     ' Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer". 
     ' For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput". 
     ' 
     Dim sr As System.IO.StreamReader 
     Dim strSourceDirectory As String = Me.Variables.GLOBALSourceDirectory.ToString() 
     Dim strSourceFileName As String = Me.Variables.GLOBALSourceFileName.ToString() 
     'Dim strSourceSystem As String = Me.Variables.GLOBALSourceSystem.ToString() 
     'Dim strSourceSubSystem As String = Me.Variables.GLOBALSourceSubSystem.ToString() 
     'Dim dtBusinessDate As Date = Me.Variables.GLOBALBusinessDate.Date 

     'sr = New System.IO.StreamReader("C:\QRM_SourceFiles\BBG_BONDS_OUTPUT_YYYYMMDD.txt") 
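     'Path.Combine inserts the directory separator if the directory variable lacks one
     sr = New System.IO.StreamReader(System.IO.Path.Combine(strSourceDirectory, strSourceFileName))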
     Dim lineIndex As Integer = 0 
     While (Not sr.EndOfStream) 
      Dim line As String = sr.ReadLine() 
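      If (lineIndex <> 0) Then 'skip the first line of the file
       Dim columnArray As String() = line.Split("|"c)
       'Only lines with at least 25 pipe-separated values are treated as data rows;
       'this also avoids an IndexOutOfRangeException when columnArray(24) is read below.
       If (columnArray.Length >= 25) Then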
        Output0Buffer.AddRow() 
        Output0Buffer.Col0 = columnArray(0).ToString() 
        Output0Buffer.Col3 = columnArray(3).ToString() 
        Output0Buffer.Col4 = columnArray(4).ToString() 
        Output0Buffer.Col5 = columnArray(5).ToString() 
        Output0Buffer.Col6 = columnArray(6).ToString() 
        Output0Buffer.Col7 = columnArray(7).ToString() 
        Output0Buffer.Col8 = columnArray(8).ToString() 
        Output0Buffer.Col9 = columnArray(9).ToString() 
        Output0Buffer.Col10 = columnArray(10).ToString() 
        Output0Buffer.Col11 = columnArray(11).ToString() 
        Output0Buffer.Col12 = columnArray(12).ToString() 
        Output0Buffer.Col13 = columnArray(13).ToString() 
        Output0Buffer.Col14 = columnArray(14).ToString() 
        Output0Buffer.Col15 = columnArray(15).ToString() 
        Output0Buffer.Col16 = columnArray(16).ToString() 
        Output0Buffer.Col17 = columnArray(17).ToString() 
        Output0Buffer.Col18 = columnArray(18).ToString() 
        Output0Buffer.Col19 = columnArray(19).ToString() 
        Output0Buffer.Col20 = columnArray(20).ToString() 
        Output0Buffer.Col21 = columnArray(21).ToString() 
        Output0Buffer.Col22 = columnArray(22).ToString() 
        Output0Buffer.Col23 = columnArray(23).ToString() 
        Output0Buffer.Col24 = columnArray(24).ToString() 

       End If 
      End If 
      lineIndex = lineIndex + 1 
     End While 
     sr.Close() 

    End Sub 

End Class 

End of code
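The script above hard-codes its output columns by position. Because the files in the question list their column names between START-OF-FIELDS and END-OF-FIELDS, one possible variation is to read those names first and pair each data row with them. The Python sketch below only illustrates that parsing logic and is not a Script Component; `parse_export` and the field names ID, NAME, and PRICE are hypothetical.

```python
def parse_export(lines):
    """Collect field names and data rows from the marker-delimited sections."""
    field_names = []
    records = []
    section = None
    for raw in lines:
        line = raw.strip()
        if line in ("START-OF-FIELDS", "START-OF-DATA"):
            section = line
        elif line in ("END-OF-FIELDS", "END-OF-DATA"):
            section = None
        elif section == "START-OF-FIELDS" and line and not line.startswith("#"):
            field_names.append(line)
        elif section == "START-OF-DATA" and line:
            values = line.split("|")
            # Skip rows that have fewer values than there are field names.
            if len(values) >= len(field_names):
                records.append(dict(zip(field_names, values)))
    return field_names, records

# Hypothetical sample; the second data row is deliberately one value short.
export_lines = [
    "START-OF-FIELDS",
    "ID",
    "NAME",
    "PRICE",
    "END-OF-FIELDS",
    "START-OF-DATA",
    "1|Bond A|101.5",
    "2|Bond B",
    "END-OF-DATA",
]

names, records = parse_export(export_lines)
print(names)
print(records)
```

Rows with too few values are dropped here rather than raising an error; whether to drop, log, or redirect such rows is a design choice that depends on how the destination table should treat incomplete data.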