The Problem

You want to insert 5000 records into the database. You normally do this by creating a separate insert statement for each record, and this is fine with your boss, but you’re a glutton for punishment and want to optimize this scenario because it occurs often enough in your application. On the other hand, you’re also lazy, have ten thousand interesting things to do, and would love to not have to spend a lot of time tackling the problem. Oh, and you’ll be using stored procedures because, as everyone knows, they’re better than dynamic SQL. So? What do you do?

Enter OPENXML

OPENXML is a nice little function that will allow us to solve the above-mentioned problem without feeling like we’re hacking our way through it. The need for a hack wouldn’t arise if we can just send any number of parameters that we want to the stored procedure; i.e., in .NET, we have the params[] array, and in Javascript, we have the args array, but in stored procedures, we don’t have any such luxury.

The code, then, of the stored procedure that bulk-inserts data into a table using OPENXML:

  1. CREATE procedure [dbo].[openxml_test]
  2.  
  3. @xmlDocument text
  4.  
  5. as
  6.  
  7. declare @handle int
  8.  
  9. exec sp_xml_preparedocument @handle output, @xmlDocument
  10.  
  11. begin
  12.  
  13. set nocount on;
  14.  
  15. insert into openxml_test_db(ID, random_data)
  16.  
  17.      select tid, tdata from openxml(@handle, N‘ROOT/Thread’, 2)
  18.  
  19.      with (tid uniqueidentifier, tdata varchar(100))
  20.  
  21. exec sp_xml_removedocument @handle
  22.  
  23. end
  24.  
  25.  
  26.  

The XML structure is going to be of the form:

  1. <ROOT>
  2.  
  3. <Thread>
  4.  
  5. <TGuid>cf69e831-4b82-43bc-bc0c-00013af696d3</TGuid>
  6.  
  7. <TData>An awful lot of data</TData>
  8.  
  9. </Thread>
  10.  
  11. </ROOT>

What We’re Doing

Line 3: Sending the variable as text is less effecient than sending it as varchar(8000). If you’re reasonably sure that your xml (including the tags, data, and all) that you send to the stored procedure isn’t going to exceed 8000 characters in length, I suggest you use that instead.

Line 9: sp_xml_preparedocument is a system stored procedure that, as its name implies, prepares the xml document. This stored procedure needs to be called before a call to openxml.

Line 17: We’re getting the data from the xml document and inserting it into the table. The @handle is the handle returned by sp_xml_preparedocument, the‘ROOT/Thread’ is the XPATH that tells OPENXML which rows to process, and the 2 describes the mapping between OPENXML’s rowset and the XML Document. This last parameter is optional. The default mapping is attribute-centric mapping, which we would use if our XML structure was like the following:

  1. …
  2. <TGuid value=”The guid goes here” />
  3. <TData value=”An awful lot of data” />
  4. …

Line 19: We’re basically telling how to format the data we get from OPENXML.

Line 21: The XML Document that SQL Server created is stored in memory. To avoid memory leaks, we close the handle.

So, What Kind of Savings Can We Expect?

Comparison of a normal insert vs. a bulk-insert

Two things to notice:
  1. The savings become more and more significant as the number of records we’re inserting becomes larger.
  2. Even when we’re inserting only 50 records, the difference is noticeable (about .34 seconds).