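This script allows you to import multiple delimited text files into a SQL database. The tables into which the data is imported, including all required columns, are created automatically. The script includes some documentation.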
/*
** This file was created by Laurens Bogaardt, Advisor Data Analytics at EY Amsterdam on 2016-11-03.
** This script allows you to import multiple delimited text files into a SQL database. The tables
** into which the data is imported, including all required columns, are created automatically. This
** script uses tab-delimited (tsv) files and SQL Server Management Studio. The script may need some
** minor adjustments for other formats and tools. The script makes several assumptions which need
** to be valid before it can run properly. First of all, it assumes none of the output tables exist
** in the SQL tool before starting. Therefore, it may be necessary to clean the database and delete
** all the existing tables. Secondly, the script assumes that, if multiple text files are imported
** into the same output table, the number and order of the columns of these files is identical. If
** this is not the case, some manual work may need to be done to the text files before importing.
** Finally, please note that this script only imports data as strings (to be precise, as NVARCHARs
** of length 255). It does not allow you to specify the datatype per column. This would need to be
** done using another script after importing the data as strings.
*/
-- 1. Import Multiple Delimited Text Files into a SQL Database
-- 1.1 Define the path to the input and define the terminators
/*
** In this section, some initial parameters are set. Obviously, the 'DatabaseName' refers to the
** database in which you want to create new tables. The '@Path' parameter sets the folder in
** which the text files are located which you want to import. Delimited files are defined by
** two characters: one which separates columns and one which separates rows. Usually, the
** row-terminator is the newline character CHAR(10), also given by '\n'. When files are created
** in Windows, the row-terminator is often a carriage return CHAR(13) followed by a newline, given
** together by '\r\n'. Often, a tab is used to separate each column. This is given by CHAR(9) or by
** the character '\t'. Other useful characters include the comma CHAR(44), the semi-colon CHAR(59)
** and the pipe CHAR(124).
*/
USE [DatabaseName]
DECLARE @Path NVARCHAR(255) = 'C:\PathToFiles\'
DECLARE @RowTerminator NVARCHAR(5) = CHAR(13) + CHAR(10)
DECLARE @ColumnTerminator NVARCHAR(5) = CHAR(9)
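/*
** The lines below are not part of the original script; they are a small, hedged addition. Because
** the script builds each file location as '@Path' followed by the filename, the path must end in
** a backslash, so one is appended if it is missing. The commented-out line sketches how the
** column-terminator could be changed for a comma-separated file with the same line endings.
*/
IF RIGHT(@Path, 1) <> '\' SET @Path = @Path + '\'
-- SET @ColumnTerminator = CHAR(44)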
-- 1.2 Define the list of input and output in a temporary table
/*
** In this section, a temporary table is created which lists all the filenames of the delimited
** files which need to be imported, as well as the names of the tables which are created and into
** which the data is imported. Multiple files may be imported into the same output table. Each row
** is given an integer ID which starts at 1 and increases by 1 per row. It is essential that the
** numbering follows this logic, because the loop in section 1.3 retrieves each row by its ID. The
** temporary table is deleted at the end of this script.
*/
IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
CREATE TABLE [dbo].[Files_Temporary]
(
[ID] INT
, [FileName] NVARCHAR(255)
, [TableName] NVARCHAR(255)
);
INSERT INTO [dbo].[Files_Temporary] SELECT 1, 'MyFileA.txt', 'NewTable1'
INSERT INTO [dbo].[Files_Temporary] SELECT 2, 'MyFileB.txt', 'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 3, 'MyFileC.tsv', 'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 4, 'MyFileD.csv', 'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 5, 'MyFileE.dat', 'NewTable2'
INSERT INTO [dbo].[Files_Temporary] SELECT 6, 'MyFileF', 'NewTable3'
INSERT INTO [dbo].[Files_Temporary] SELECT 7, 'MyFileG.text', 'NewTable4'
INSERT INTO [dbo].[Files_Temporary] SELECT 8, 'MyFileH.txt', 'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 9, 'MyFileI.txt', 'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 10, 'MyFileJ.txt', 'NewTable5'
INSERT INTO [dbo].[Files_Temporary] SELECT 11, 'MyFileK.txt', 'NewTable6'
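/*
** The check below is not part of the original script; it is a minimal sketch of how the numbering
** rule described above could be verified before the loop starts. It stops the batch when the IDs
** do not run from 1 up to the number of rows without gaps or duplicates. Note that the temporary
** table is then left in place until the script is run again.
*/
IF EXISTS
(
SELECT 1 FROM [dbo].[Files_Temporary]
HAVING MIN([ID]) <> 1 OR MAX([ID]) <> COUNT(*) OR COUNT(DISTINCT [ID]) <> COUNT(*)
)
BEGIN
RAISERROR('The IDs in [dbo].[Files_Temporary] must run from 1 to N without gaps or duplicates.', 16, 1)
RETURN
END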
-- 1.3 Loop over the list of input and output and import each file to the correct table
/*
** In this section, the 'WHILE' statement is used to loop over all input files. A counter is defined
** which starts at '1' and increments with each iteration. The filename and tablename are retrieved
** from the previously defined temporary table. The next step of the script is to check whether the
** output table already exists or not.
*/
DECLARE @Counter INT = 1
WHILE @Counter <= (SELECT COUNT(*) FROM [dbo].[Files_Temporary])
BEGIN
PRINT 'Counter is ''' + CONVERT(NVARCHAR(5), @Counter) + '''.'
DECLARE @FileName NVARCHAR(255)
DECLARE @TableName NVARCHAR(255)
DECLARE @Header NVARCHAR(MAX)
DECLARE @SQL_Header NVARCHAR(MAX)
DECLARE @CreateHeader NVARCHAR(MAX) = ''
DECLARE @SQL_CreateHeader NVARCHAR(MAX)
SELECT @FileName = [FileName], @TableName = [TableName] FROM [dbo].[Files_Temporary] WHERE [ID] = @Counter
IF OBJECT_ID('[dbo].[' + @TableName + ']', 'U') IS NULL
BEGIN
/*
** If the output table does not yet exist, it needs to be created. This requires the list of all
** column names for that table, which is retrieved from the header on the first line of the text
** file. A second temporary table is created to store this header as a single string. A piece of
** SQL code is then generated and executed which imports only the first line of the text file. The
** field-terminator is set equal to the row-terminator, so that the column-terminators inside the
** header are not treated as field separators.
*/
PRINT 'Creating new table with name ''' + @TableName + '''.'
IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];
CREATE TABLE [dbo].[Header_Temporary]
(
[Header] NVARCHAR(MAX)
);
SET @SQL_Header = '
BULK INSERT [dbo].[Header_Temporary]
FROM ''' + @Path + @FileName + '''
WITH
(
FIRSTROW = 1,
LASTROW = 1,
MAXERRORS = 0,
FIELDTERMINATOR = ''' + @RowTerminator + ''',
ROWTERMINATOR = ''' + @RowTerminator + '''
)'
EXEC(@SQL_Header)
SET @Header = (SELECT TOP 1 [Header] FROM [dbo].[Header_Temporary])
PRINT 'Extracted header ''' + @Header + ''' for table ''' + @TableName + '''.'
/*
** The columnnames in the header are separated using the column-terminator. This can be used to loop
** over each columnname. A new piece of SQL code is generated which will create the output table
** with the correctly named columns.
*/
WHILE CHARINDEX(@ColumnTerminator, @Header) > 0
BEGIN
SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(SUBSTRING(@Header, 1, CHARINDEX(@ColumnTerminator, @Header) - 1))) + '] NVARCHAR(255), '
SET @Header = SUBSTRING(@Header, CHARINDEX(@ColumnTerminator, @Header) + 1, LEN(@Header))
END
SET @CreateHeader = @CreateHeader + '[' + LTRIM(RTRIM(@Header)) + '] NVARCHAR(255)'
SET @SQL_CreateHeader = 'CREATE TABLE [dbo].[' + @TableName + '] (' + @CreateHeader + ')'
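/*
** Not part of the original script: printing the generated statement before it is executed is a
** simple aid for checking the column names which were derived from the header.
*/
PRINT 'Generated statement ''' + @SQL_CreateHeader + '''.'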
EXEC(@SQL_CreateHeader)
END
/*
** Finally, the data from the text file is imported into the newly created table. The first line,
** including the header information, is skipped. If multiple text files are imported into the same
** output table, it is essential that the number and the order of the columns is identical, as the
** table will only be created once, using the header information of the first text file.
*/
PRINT 'Inserting data from ''' + @FileName + ''' to ''' + @TableName + '''.'
DECLARE @SQL NVARCHAR(MAX)
SET @SQL = '
BULK INSERT [dbo].[' + @TableName + ']
FROM ''' + @Path + @FileName + '''
WITH
(
FIRSTROW = 2,
MAXERRORS = 0,
FIELDTERMINATOR = ''' + @ColumnTerminator + ''',
ROWTERMINATOR = ''' + @RowTerminator + '''
)'
EXEC(@SQL)
SET @Counter = @Counter + 1
END;
-- 1.4 Cleanup temporary tables
/*
** In this section, the temporary tables which were created and used by this script are deleted.
** Alternatively, the script could have used 'real' temporary tables (identified by the '#' character
** in front of the name) or a table variable. These would have deleted themselves once they were no
** longer in use. However, the end result is the same.
*/
IF OBJECT_ID('[dbo].[Files_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Files_Temporary];
IF OBJECT_ID('[dbo].[Header_Temporary]', 'U') IS NOT NULL
DROP TABLE [dbo].[Header_Temporary];
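The alternative mentioned in section 1.4 could look roughly like the sketch below. It is not part of the original script. The IDENTITY column is my own assumption, added so that the IDs are numbered automatically instead of being typed by hand, and only two of the example files are listed.

```sql
-- Sketch only: a '#' table lives in tempdb and is dropped automatically when the session ends.
CREATE TABLE #Files_Temporary
(
      [ID] INT IDENTITY(1, 1)   -- assumption: let SQL Server number the rows, starting from 1
    , [FileName] NVARCHAR(255)
    , [TableName] NVARCHAR(255)
);

INSERT INTO #Files_Temporary ([FileName], [TableName])
VALUES ('MyFileA.txt', 'NewTable1')
     , ('MyFileB.txt', 'NewTable2');

-- A '#' table created in the outer batch is also visible inside dynamic SQL run with EXEC,
-- so the header could be bulk-inserted into a '#' table in the same way.
CREATE TABLE #Header_Temporary ([Header] NVARCHAR(MAX));

-- Inspect the generated IDs before using them in the loop.
SELECT [ID], [FileName], [TableName] FROM #Files_Temporary ORDER BY [ID];
```

A table variable would work for the list of files, but not for the header, because a table variable declared in the outer batch is not visible inside the dynamic SQL executed by EXEC.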
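This is a nicely written answer. One minor tweak I would suggest: instead of manually inserting the file and table names, the statements below can be used to populate the table automatically. ------Insert file names------ insert into Files_Temporary(filename) exec master..xp_cmdshell 'dir <<path to the folder where the files are located>> /b /a-d' ------Update the table names, removing the file extension------ update Files_Temporary set [TableName] = SUBSTRING(filename, 0, CHARINDEX('.', filename)) – Chendur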
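The suggestion in the comment above can be sketched as follows. This is a hedged illustration, not the commenter's exact code. It assumes that xp_cmdshell is enabled on the server, that '[dbo].[Files_Temporary]' already exists as defined in section 1.2, and that every file in the folder should get its own table. The '@Command' and '@Listing' names are hypothetical. The ID column is filled with ROW_NUMBER, because the loop in section 1.3 retrieves rows by ID, and the '/a-d' switch restricts the listing to files rather than subfolders.

```sql
-- Hypothetical folder, matching the placeholder used in section 1.1.
DECLARE @Path NVARCHAR(255) = 'C:\PathToFiles\';
DECLARE @Command NVARCHAR(4000) = 'dir "' + @Path + '" /b /a-d';

-- xp_cmdshell returns one row per line of output, followed by a NULL row.
DECLARE @Listing TABLE ([FileName] NVARCHAR(255));
INSERT INTO @Listing ([FileName]) EXEC master..xp_cmdshell @Command;

-- Number the files from 1 and derive each table name by cutting off the extension.
INSERT INTO [dbo].[Files_Temporary] ([ID], [FileName], [TableName])
SELECT ROW_NUMBER() OVER (ORDER BY [FileName])
     , [FileName]
     , CASE WHEN CHARINDEX('.', [FileName]) > 0
            THEN LEFT([FileName], CHARINDEX('.', [FileName]) - 1)
            ELSE [FileName]
       END
FROM @Listing
WHERE [FileName] IS NOT NULL;
```

If the folder is empty or does not exist, the listing contains an error message instead of filenames, so it is worth inspecting '[dbo].[Files_Temporary]' before starting the loop.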