2016-11-11

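Import multiple delimited text files into a SQL Server database and automatically create the output tables

I have multiple delimited text files (e.g. .csv files), each containing columns, rows and a header.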

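I want to import all of these input files into SQL Server with as little effort as possible. Specifically, I want the output tables into which these files are imported to be created on the fly.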

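Some of these input files need to be imported into one and the same output table, while others need to be imported into different tables. You can assume that all files which will be imported into the same table have the same header.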

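SQL Server Management Studio has an Import Wizard which allows you to import delimited text files (and other formats) and automatically creates the output table. However, it does not allow you to import multiple files at the same time. It also requires a lot of manual work and is not reproducible.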

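Many scripts that import multiple text files into a table can be found online. However, most of them require the output table to be created first, which again means extra work for every table.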

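Is there a way to list all relevant input files together with their corresponding output tables, such that the tables are created automatically and the data is then imported?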

Answers

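This script allows you to import multiple delimited text files into a SQL database. The tables into which the data is imported, including all required columns, are created automatically. The script is documented in its comments.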

/* 
** This file was created by Laurens Bogaardt, Advisor Data Analytics at EY Amsterdam on 2016-11-03. 
** This script allows you to import multiple delimited text files into a SQL database. The tables 
** into which the data is imported, including all required columns, are created automatically. This 
** script uses tab-delimited (tsv) files and SQL Server Management Studio. The script may need some 
** minor adjustments for other formats and tools. The script makes several assumptions which need 
** to be valid before it can run properly. First of all, it assumes none of the output tables exist 
** in the SQL tool before starting. Therefore, it may be necessary to clean the database and delete 
** all the existing tables. Secondly, the script assumes that, if multiple text files are imported 
** into the same output table, the number and order of the columns of these files is identical. If 
** this is not the case, some manual work may need to be done to the text files before importing. 
** Finally, please note that this script only imports data as strings (to be precise, as NVARCHARs 
** of length 255). It does not allow you to specify the datatype per column. This would need to be 
** done using another script after importing the data as strings. 
*/ 

-- 1. Import Multiple Delimited Text Files into a SQL Database 

-- 1.1 Define the path to the input and define the terminators 

/* 
** In this section, some initial parameters are set. Obviously, the 'DatabaseName' refers to the 
** database in which you want to create new tables. The '@Path' parameter sets the folder in 
** which the text files are located which you want to import. Delimited files are defined by 
** two characters: one which separates columns and one which separates rows. Usually, the 
** row-terminator is the newline character CHAR(10), also given by '\n'. Files created in 
** Windows often end each row with a carriage return followed by a newline, CHAR(13) + CHAR(10), 
** also given by '\r\n'. Often, a tab is used to separate each column. This is given by CHAR(9) 
** or by the character '\t'. Other useful characters include the comma CHAR(44), the semi-colon 
** CHAR(59) and the pipe CHAR(124). 
*/ 

USE [DatabaseName] 
DECLARE @Path NVARCHAR(255) = 'C:\PathToFiles\' 
DECLARE @RowTerminator NVARCHAR(5) = CHAR(13) + CHAR(10) 
DECLARE @ColumnTerminator NVARCHAR(5) = CHAR(9) 

-- 1.2 Define the list of input and output in a temporary table 

/* 
** In this section, a temporary table is created which lists all the filenames of the delimited 
** files which need to be imported, as well as the names of the tables which are created and into 
** which the data is imported. Multiple files may be imported into the same output table. Each row 
** is prepended with an integer which increments up starting from 1. It is essential that this 
** number follows this logic. The temporary table is deleted at the end of this script. 
*/ 

IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL 
DROP TABLE [dbo].[Files_Temporary]; 
CREATE TABLE [dbo].[Files_Temporary] 
(
    [ID] INT 
    , [FileName] NVARCHAR(255) 
    , [TableName] NVARCHAR(255) 
); 

INSERT INTO [dbo].[Files_Temporary] SELECT 1, 'MyFileA.txt', 'NewTable1' 
INSERT INTO [dbo].[Files_Temporary] SELECT 2, 'MyFileB.txt', 'NewTable2' 
INSERT INTO [dbo].[Files_Temporary] SELECT 3, 'MyFileC.tsv', 'NewTable2' 
INSERT INTO [dbo].[Files_Temporary] SELECT 4, 'MyFileD.csv', 'NewTable2' 
INSERT INTO [dbo].[Files_Temporary] SELECT 5, 'MyFileE.dat', 'NewTable2' 
INSERT INTO [dbo].[Files_Temporary] SELECT 6, 'MyFileF',  'NewTable3' 
INSERT INTO [dbo].[Files_Temporary] SELECT 7, 'MyFileG.text', 'NewTable4' 
INSERT INTO [dbo].[Files_Temporary] SELECT 8, 'MyFileH.txt', 'NewTable5' 
INSERT INTO [dbo].[Files_Temporary] SELECT 9, 'MyFileI.txt', 'NewTable5' 
INSERT INTO [dbo].[Files_Temporary] SELECT 10, 'MyFileJ.txt', 'NewTable5' 
INSERT INTO [dbo].[Files_Temporary] SELECT 11, 'MyFileK.txt', 'NewTable6' 

-- 1.3 Loop over the list of input and output and import each file to the correct table 

/* 
** In this section, the 'WHILE' statement is used to loop over all input files. A counter is defined 
** which starts at '1' and increments with each iteration. The filename and tablename are retrieved 
** from the previously defined temporary table. The next step of the script is to check whether the 
** output table already exists or not. 
*/ 

DECLARE @Counter INT = 1 

WHILE @Counter <= (SELECT COUNT(*) FROM [dbo].[Files_Temporary]) 
BEGIN 
    PRINT 'Counter is ''' + CONVERT(NVARCHAR(5), @Counter) + '''.' 

    DECLARE @FileName NVARCHAR(255) 
    DECLARE @TableName NVARCHAR(255) 
    DECLARE @Header NVARCHAR(MAX) 
    DECLARE @SQL_Header NVARCHAR(MAX) 
    DECLARE @CreateHeader NVARCHAR(MAX) = '' 
    DECLARE @SQL_CreateHeader NVARCHAR(MAX) 

    SELECT @FileName = [FileName], @TableName = [TableName] FROM [dbo].[Files_Temporary] WHERE [ID] = @Counter 

    IF OBJECT_ID('[dbo].[' + @TableName + ']', 'U') IS NULL 
    BEGIN 
/* 
** If the output table does not yet exist, it needs to be created. This requires the list of all 
** columnnames for that table to be retrieved from the first line of the text file, which includes 
** the header. A piece of SQL code is generated and executed which imports the header of the text 
** file. A second temporary table is created which stores this header as a single string. 
*/ 
     PRINT 'Creating new table with name ''' + @TableName + '''.' 

     IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL 
     DROP TABLE [dbo].[Header_Temporary]; 
     CREATE TABLE [dbo].[Header_Temporary] 
     (
      [Header] NVARCHAR(MAX) 
     ); 

     SET @SQL_Header = ' 
      BULK INSERT [dbo].[Header_Temporary] 
      FROM ''' + @Path + @FileName + ''' 
      WITH 
      (
       FIRSTROW = 1, 
       LASTROW = 1, 
       MAXERRORS = 0, 
       FIELDTERMINATOR = ''' + @RowTerminator + ''', 
       ROWTERMINATOR = ''' + @RowTerminator + ''' 
      )' 
     EXEC(@SQL_Header) 

     SET @Header = (SELECT TOP 1 [Header] FROM [dbo].[Header_Temporary]) 
     PRINT 'Extracted header ''' + @Header + ''' for table ''' + @TableName + '''.' 
/* 
** The columnnames in the header are separated using the column-terminator. This can be used to loop 
** over each columnname. A new piece of SQL code is generated which will create the output table 
** with the correctly named columns. 
*/ 
     WHILE CHARINDEX(@ColumnTerminator, @Header) > 0 
     BEGIN   
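      SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(SUBSTRING(@Header, 1, CHARINDEX(@ColumnTerminator, @Header) - 1))) + '] NVARCHAR(255), ' 
      -- Skip the full terminator; DATALENGTH / 2 gives its length in characters, also for multi-character or space terminators 
      SET @Header = SUBSTRING(@Header, CHARINDEX(@ColumnTerminator, @Header) + DATALENGTH(@ColumnTerminator) / 2, LEN(@Header)) 
     END 
     SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(@Header)) + '] NVARCHAR(255)' 

     -- Create the table in [dbo], the schema used by the existence check and by BULK INSERT 
     SET @SQL_CreateHeader = 'CREATE TABLE [dbo].[' + @TableName + '] (' + @CreateHeader + ')' 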
     EXEC(@SQL_CreateHeader) 
    END 

/* 
** Finally, the data from the text file is imported into the newly created table. The first line, 
** including the header information, is skipped. If multiple text files are imported into the same 
** output table, it is essential that the number and the order of the columns is identical, as the 
** table will only be created once, using the header information of the first text file. 
*/ 
    PRINT 'Inserting data from ''' + @FileName + ''' to ''' + @TableName + '''.' 
    DECLARE @SQL NVARCHAR(MAX) 
    SET @SQL = ' 
     BULK INSERT [dbo].[' + @TableName + '] 
     FROM ''' + @Path + @FileName + ''' 
     WITH 
     (
      FIRSTROW = 2, 
      MAXERRORS = 0, 
      FIELDTERMINATOR = ''' + @ColumnTerminator + ''', 
      ROWTERMINATOR = ''' + @RowTerminator + ''' 
     )' 
    EXEC(@SQL) 

    SET @Counter = @Counter + 1 
END; 

-- 1.4 Cleanup temporary tables 

/* 
** In this section, the temporary tables which were created and used by this script are deleted. 
** Alternatively, the script could have used 'real' temporary tables (identified by the '#' character 
** in front of the name) or a table variable. These would have deleted themselves once they were no 
** longer in use. However, the end result is the same. 
*/ 

IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL 
DROP TABLE [dbo].[Files_Temporary]; 

IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL 
DROP TABLE [dbo].[Header_Temporary]; 
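
The header comment of the script notes that every column arrives as NVARCHAR(255) and that data types would have to be set by a separate script afterwards. A minimal sketch of such a follow-up step is given here, assuming SQL Server 2012 or later (for TRY_CONVERT). The table and column names ([dbo].[NewTable1], [Amount], [PostingDate]) are hypothetical and do not come from the answer.

```sql
-- Hypothetical names: [dbo].[NewTable1], [Amount] and [PostingDate] are assumptions.

-- Step 1: list the values that cannot be converted to the target types.
-- TRY_CONVERT returns NULL instead of raising an error when a conversion fails.
SELECT [Amount], [PostingDate]
FROM [dbo].[NewTable1]
WHERE ([Amount] IS NOT NULL AND TRY_CONVERT(DECIMAL(18, 2), [Amount]) IS NULL)
   OR ([PostingDate] IS NOT NULL AND TRY_CONVERT(DATE, [PostingDate]) IS NULL);

-- Step 2: once the query above returns no rows, change the column types in place.
ALTER TABLE [dbo].[NewTable1] ALTER COLUMN [Amount] DECIMAL(18, 2) NULL;
ALTER TABLE [dbo].[NewTable1] ALTER COLUMN [PostingDate] DATE NULL;
```

Running the check first matters because ALTER COLUMN fails for the whole table as soon as a single value, such as an empty string in a numeric column, cannot be converted.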

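This is a well-written answer. One small tweak I would suggest: instead of inserting the file and table names manually, the following statements can be used to populate the table automatically. Insert the file names: insert into Files_Temporary(filename) exec master..xp_cmdshell 'dir <<path of the folder where the files are located>> /b /a-d'. Then update the table names, stripping the file extension: update Files_Temporary set [TableName] = SUBSTRING(filename, 0, CHARINDEX('.', filename)) – Chendur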
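
The suggestion in this comment can be sketched as a replacement for the manual INSERT statements in section 1.2 of the script. This is only a sketch under several assumptions: xp_cmdshell is enabled on the server, the SQL Server service account can read the folder, the folder is 'C:\PathToFiles\' as in the script, only .txt files are wanted, and [dbo].[Files_Temporary] has already been created. The table variable @Listing is a hypothetical helper name. The [ID] column is filled with ROW_NUMBER() because the loop in section 1.3 relies on IDs counting up from 1.

```sql
-- Assumption: xp_cmdshell is enabled and [dbo].[Files_Temporary] already exists.
DECLARE @Listing TABLE ([FileName] NVARCHAR(255));

-- 'dir /b /a-d' prints bare file names only and leaves out directories.
INSERT INTO @Listing ([FileName])
EXEC master..xp_cmdshell 'dir "C:\PathToFiles\*.txt" /b /a-d';

-- xp_cmdshell also returns a trailing NULL row, which is filtered out here.
INSERT INTO [dbo].[Files_Temporary] ([ID], [FileName], [TableName])
SELECT ROW_NUMBER() OVER (ORDER BY [FileName])
     , [FileName]
     -- Table name = file name without its last extension
     , LEFT([FileName], LEN([FileName]) - CHARINDEX('.', REVERSE([FileName])))
FROM @Listing
WHERE [FileName] IS NOT NULL;
```

With this approach every file gets its own table. Files that should share one output table would still need their [TableName] adjusted by hand, and it is worth inspecting the result before running the loop, since 'dir' reports a "File Not Found" line when the folder contains no matching files.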

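If I were you, I would write a small VBA script that converts all TXT files in a folder to XLS files, and then load each one into a SQL Server table as you described.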

select * 
into SQLServerTable FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0', 
    'Excel 8.0;Database=C:\your_path_here\test.xls;HDR=YES', 
    'SELECT * FROM [Sheet1$]') 
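
A note on prerequisites for the query above: ad hoc OPENROWSET calls are disabled by default in SQL Server, so the corresponding server option may have to be switched on first. A minimal sketch, assuming you have sufficient rights on the server and that enabling this option is acceptable in your environment:

```sql
-- Allow ad hoc OPENROWSET / OPENDATASOURCE queries (off by default).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;
```

The Jet 4.0 provider exists only as a 32-bit component. On a 64-bit SQL Server instance the Microsoft.ACE.OLEDB.12.0 provider is the usual substitute, assuming it is installed.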

See the following link for details.

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=49926

As for converting the TXT files to XLS files, try this.

Private Declare Function SetCurrentDirectoryA Lib _ 
     "kernel32" (ByVal lpPathName As String) As Long 

Public Function ChDirNet(szPath As String) As Boolean 
'based on Rob Bovey's code 
    Dim lReturn As Long 
    lReturn = SetCurrentDirectoryA(szPath) 
    ChDirNet = CBool(lReturn <> 0) 
End Function 

Sub Get_TXT_Files() 
'For Excel 2000 and higher 
    Dim Fnum As Long 
    Dim mysheet As Worksheet 
    Dim basebook As Workbook 
    Dim TxtFileNames As Variant 
    Dim QTable As QueryTable 
    Dim SaveDriveDir As String 
    Dim ExistFolder As Boolean 

    'Save the current dir 
    SaveDriveDir = CurDir 

    'You can change the start folder if you want for 
    'GetOpenFilename,you can use a network or local folder. 
    'For example ChDirNet("C:\Users\Ron\test") 
    'It now use Excel's Default File Path 

    ExistFolder = ChDirNet("C:\your_path_here\Text\") 
    If ExistFolder = False Then 
     MsgBox "Error changing folder" 
     Exit Sub 
    End If 

    TxtFileNames = Application.GetOpenFilename _ 
    (filefilter:="TXT Files (*.txt), *.txt", MultiSelect:=True) 

    If IsArray(TxtFileNames) Then 

     On Error GoTo CleanUp 

     With Application 
      .ScreenUpdating = False 
      .EnableEvents = False 
     End With 

     'Add workbook with one sheet 
     Set basebook = Workbooks.Add(xlWBATWorksheet) 

     'Loop through the array with txt files 
     For Fnum = LBound(TxtFileNames) To UBound(TxtFileNames) 

      'Add a new worksheet for the name of the txt file 
      Set mysheet = Worksheets.Add(After:=basebook. _ 
           Sheets(basebook.Sheets.Count)) 
      On Error Resume Next 
      mysheet.Name = Right(TxtFileNames(Fnum), Len(TxtFileNames(Fnum)) - _ 
            InStrRev(TxtFileNames(Fnum), "\", , 1)) 
      On Error GoTo 0 

      With ActiveSheet.QueryTables.Add(Connection:= _ 
         "TEXT;" & TxtFileNames(Fnum), Destination:=Range("A1")) 
       .TextFilePlatform = xlWindows 
       .TextFileStartRow = 1 

       'This example uses xlDelimited 
       '(xlFixedWidth is the alternative parse type) 
       .TextFileParseType = xlDelimited 

       'Set your Delimiter to true 
       .TextFileTabDelimiter = True 
       .TextFileSemicolonDelimiter = False 
       .TextFileCommaDelimiter = False 
       .TextFileSpaceDelimiter = False 

       'Set the format for each column if you want (Default = General) 
       'For example Array(1, 9, 1) skips the second column; 
       'it is left commented out here so that every column is imported 
       '.TextFileColumnDataTypes = Array(1, 9, 1) 

       'xlGeneralFormat General   1 
       'xlTextFormat  Text    2 
       'xlMDYFormat  Month-Day-Year 3 
       'xlDMYFormat  Day-Month-Year 4 
       'xlYMDFormat  Year-Month-Day 5 
       'xlMYDFormat  Month-Year-Day 6 
       'xlDYMFormat  Day-Year-Month 7 
       'xlYDMFormat  Year-Day-Month 8 
       'xlSkipColumn  Skip    9 

       ' Get the data from the txt file 
       .Refresh BackgroundQuery:=False 
      End With 
     ActiveSheet.QueryTables(1).Delete 
     Next Fnum 

     'Delete the first sheet of basebook 
     On Error Resume Next 
     Application.DisplayAlerts = False 
     basebook.Worksheets(1).Delete 
     Application.DisplayAlerts = True 
     On Error GoTo 0 

CleanUp: 

     ChDirNet SaveDriveDir 

     With Application 
      .ScreenUpdating = True 
      .EnableEvents = True 
     End With 
    End If 
End Sub 

You can set up Windows Task Scheduler to run the process automatically as needed.
