Bulk Insert into SQL Server using SqlBulkCopy
I was recently tasked with a project at a company to update a SQL Server 2008 database with large amounts of data each day. The task at first seemed daunting: the files exceeded 400,000 records, and several needed processing daily. I first tried LINQ to SQL, but with that amount of data the inserts were slow, to say the least. Then I remembered the SqlBulkCopy class. SqlBulkCopy lets you efficiently bulk load a SQL Server table with data from another source. The SqlBulkCopy class can write data only to SQL Server tables, but the data source is not limited to SQL Server; any data source can be used, as long as the data can be loaded into a DataTable instance or read with an IDataReader instance. For this example the file will contain roughly 1,000 records, but this code can handle much larger amounts of data.
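When the source is itself a database table, WriteToServer can consume an IDataReader directly, streaming rows without building a DataTable first. Below is a minimal sketch of that variation; the connection strings and the source query are assumptions for illustration only.

```csharp
using System.Data.SqlClient;

// Sketch: bulk copy between two servers using an IDataReader as the source.
// sourceConnectionString and destinationConnectionString are hypothetical.
using (SqlConnection source = new SqlConnection(sourceConnectionString))
using (SqlConnection destination = new SqlConnection(destinationConnectionString))
{
    source.Open();
    destination.Open();

    SqlCommand cmd = new SqlCommand("SELECT * FROM [dbo].[Censis]", source);
    using (SqlDataReader reader = cmd.ExecuteReader())
    using (SqlBulkCopy copy = new SqlBulkCopy(destination))
    {
        copy.DestinationTableName = "Censis";
        // WriteToServer also accepts an IDataReader, so rows stream
        // straight from the reader to the destination table.
        copy.WriteToServer(reader);
    }
}
```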
To begin with, let's create a table in SQL Server that will hold the data. Copy the following T-SQL into SQL Server to create your table:
CREATE TABLE [dbo].[Censis](
    [Suburb] [varchar](200) NULL,
    [NotStated] [int] NULL,
    [NotApplicable] [int] NULL,
    [Fishing] [int] NULL,
    [Mining] [int] NULL,
    [Manufacturing] [int] NULL,
    [Electricity] [int] NULL,
    [Construction] [int] NULL
) ON [PRIMARY]
GO
The table above will hold Census data that is freely available to download in Australia.
The next step is to create a console application that will bulk load the data. Open Visual Studio 2008 and choose File > New > Project > Windows > Console Application.
Before moving on, to explain the code I have to work backwards and explain the final method that bulk loads the data. SqlBulkCopy has a method called WriteToServer. One of the overloads of this method takes a DataTable as its parameter. Because a DataTable contains rows and columns, this seemed like a logical choice for the task I was facing.
Jumping back to the example, we now know we need to create a DataTable that contains the information from the text file. The code below demonstrates how to do this:
C#

using System;
using System.Data;
using System.IO;

DataTable dt = new DataTable();
string line = null;
int i = 0;

using (StreamReader sr = File.OpenText(@"c:\temp\table1.csv"))
{
    while ((line = sr.ReadLine()) != null)
    {
        string[] data = line.Split(',');
        if (data.Length > 0)
        {
            if (i == 0)
            {
                // Create one column per field, based on the first line.
                foreach (var item in data)
                {
                    dt.Columns.Add(new DataColumn());
                }
                i++;
            }
            DataRow row = dt.NewRow();
            row.ItemArray = data;
            dt.Rows.Add(row);
        }
    }
}
VB.NET

Imports System.Data
Imports System.IO

Dim dt As New DataTable()
Dim line As String = Nothing
Dim i As Integer = 0

Using sr As StreamReader = File.OpenText("c:\temp\table1.csv")
    line = sr.ReadLine()
    Do While line IsNot Nothing
        Dim data() As String = line.Split(","c)
        If data.Length > 0 Then
            If i = 0 Then
                ' Create one column per field, based on the first line.
                For Each item In data
                    dt.Columns.Add(New DataColumn())
                Next item
                i += 1
            End If
            Dim row As DataRow = dt.NewRow()
            row.ItemArray = data
            dt.Rows.Add(row)
        End If
        line = sr.ReadLine()
    Loop
End Using
In the code above, I created a DataTable that will store all the information from the CSV file, which resides in the C:\Temp directory. I use a StreamReader object to open the file and read each line. Each line is then split into a string array, and that array is assigned to each DataRow as its ItemArray value, which sets the values for the row from the array.
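One caveat: splitting on a bare comma assumes no field contains an embedded comma. If the CSV may have quoted fields, a dedicated parser such as TextFieldParser (in the Microsoft.VisualBasic.FileIO namespace, usable from C# by referencing Microsoft.VisualBasic) is a safer sketch:

```csharp
using Microsoft.VisualBasic.FileIO; // reference the Microsoft.VisualBasic assembly

// Sketch: reading the same file with a quote-aware CSV parser.
using (TextFieldParser parser = new TextFieldParser(@"c:\temp\table1.csv"))
{
    parser.TextFieldType = FieldType.Delimited;
    parser.SetDelimiters(",");
    parser.HasFieldsEnclosedInQuotes = true;

    while (!parser.EndOfData)
    {
        string[] data = parser.ReadFields(); // honours commas inside quotes
        // build the DataTable rows as shown above
    }
}
```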
When the file has been read, the next thing to do is use the SqlBulkCopy class to insert the data into SQL Server. The following code demonstrates how to do this:
C#

using System.Configuration;
using System.Data.SqlClient;

using (SqlConnection cn = new SqlConnection(ConfigurationManager.ConnectionStrings["ConsoleApplication3.Properties.Settings.daasConnectionString"].ConnectionString))
{
    cn.Open();
    using (SqlBulkCopy copy = new SqlBulkCopy(cn))
    {
        copy.ColumnMappings.Add(0, 0);
        copy.ColumnMappings.Add(1, 1);
        copy.ColumnMappings.Add(2, 2);
        copy.ColumnMappings.Add(3, 3);
        copy.ColumnMappings.Add(4, 4);
        copy.DestinationTableName = "Censis";
        copy.WriteToServer(dt);
    }
}
VB.NET

Imports System.Configuration
Imports System.Data.SqlClient

Using cn As New SqlConnection(ConfigurationManager.ConnectionStrings("ConsoleApplication3.Properties.Settings.daasConnectionString").ConnectionString)
    cn.Open()
    Using copy As New SqlBulkCopy(cn)
        copy.ColumnMappings.Add(0, 0)
        copy.ColumnMappings.Add(1, 1)
        copy.ColumnMappings.Add(2, 2)
        copy.ColumnMappings.Add(3, 3)
        copy.ColumnMappings.Add(4, 4)
        copy.DestinationTableName = "Censis"
        copy.WriteToServer(dt)
    End Using
End Using
SqlBulkCopy uses ADO.NET to connect to a database and bulk load the data. I created a SqlConnection object, and that object reference is used to create the SqlBulkCopy object. The DestinationTableName property references the table in the database where the data is to be loaded. A handy feature of SqlBulkCopy is the SqlBulkCopyColumnMappingCollection. Column mappings define the relationships between columns in the data source and columns in the destination, which is handy if the source file has columns that don't need to be inserted into the database. Column mappings can be set by index, as in the example above, or by column name. Using the index is handy when you're working with files that contain no column names. Finally, the data is sent to the database by calling the WriteToServer method.
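To illustrate the name-based option, here is a short sketch assuming the DataTable columns were given names when they were created (the source column names shown are hypothetical; only the destination names come from the Censis table above):

```csharp
using System.Data.SqlClient;

// Sketch: mapping source columns to destination columns by name
// instead of by index. "SuburbName" and "FishingCount" are assumed
// column names in the source DataTable.
using (SqlBulkCopy copy = new SqlBulkCopy(cn))
{
    copy.DestinationTableName = "Censis";
    copy.ColumnMappings.Add("SuburbName", "Suburb");
    copy.ColumnMappings.Add("FishingCount", "Fishing");
    copy.WriteToServer(dt);
}
```

Any source column without a mapping is simply skipped, which is what makes this useful for files carrying extra columns.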