Improving performance of inserting multiple parent/child tables in a single SQL Procedure

Tom Babiec wrote a great blog a few months back on inserting multiple parent child tables in a single stored procedure. We use this technique a lot in our data integration work, and it’s proven to be very robust in many contexts. The SQL procedure outlined in that blog is useful not just for BizTalk, but generally speaking for ADO.NET and other applications trying to load multiple rows into SQL Server for multiple parent child tables. However, when dealing with larger datasets (and as the table grows), we’ve noticed some degradation in performance. In some cases, we were seeing variances of 30 seconds to 10+ minutes for loading the same exact data set on the same database. We tried a few different options, including forcing a recompile of the stored procedure between each load

WITH RECOMPILE

, but this did not solve the problem. Finally, taking a look at the query plan gave some additional insight for a fairly simple fix.

By default, the query optimizer is using the Nested Loops operator to do the final MERGE between the input table and the tracker table:

Note that the table scan can be avoided by ensuring there’s a proper primary key on the source table – but this table scan was of little consequence in our testing, as it’s scanning an input table variable either way.

This works well when at least one of the datasets is small (e.g. 10 rows or less), and is the most performant join operator in that case. However, when either the source tables get larger or the destination tables get larger, we’ve noticed the query optimizer fails to change to using Merge Join for that final merge operation, which is the preferable operator when dealing with two larger datasets. We were occasionally able to get this to happen by changing around indexes on the table variables, but we’d consistently find the query plan going back to Nested Loops. There are a few possible reasons for this:

The query optimizer assumes that the table variables coming in to the procedure will normally be small. This is probably usually a fair assumption.
Merge Join is more expensive when one of the result sets is in fact small, so Nested Loops will be preferable for most cases.
It may become particularly unpredictable when joining the result set within the USING portion of the merge statement – the optimizer may incorrectly assume this is going to result in fewer rows.
Merge Join will require two additional sort operations – however, this sort will be cheap since the sort is actually happening in relation to existing clustered indexes (so the data is already sorted)

Whatever the final reasons, testing showed that Merge Joins were vastly better here. There were two attempts we had; first, we tried just using the MERGE hint for the source table, but this would result in Nested Loops on the destination table; finally, we ended up an OPTION (MERGE JOIN); for the whole merge statement, forcing the engine to use Merge Join over Nested Loops to the final table and the intermediate source table:

MERGE TargetTableName target
USING (SELECT * FROM @sourceTable src INNER JOIN @trackerTable tracker ON src.Pk = tracker.Pk)  as remapTable
ON 1=0 -- force insert
WHEN NOT MATCHED
THEN
INSERT (col1, col2, col3, etc...)
VALUES (col1, col2, col3, etc...)
OUTPUT INSERTED.pk_id, remapTable.old_id into @nextTrackerTable
OPTION (MERGE JOIN);

Which results in a query plan like this:

Using the OPTION (MERGE JOIN); does the trick here, and renders the more favorable query plan for these scenarios. Using this technique showed no noticeable change in smaller loads (where most input tables had fewer than 10 rows); but for larger loads (even up to hundreds of thousands of rows in children), we saw data loads go from 10+ minutes down to >1 minute.

Improving performance of inserting multiple parent/child tables in a single SQL Procedure

Trending Articles

Rajasthan Board 10th Result 2016 Roll No wise & Name Wise

Ifield Avenue closed following crash in Langley Green

Practice Sheet of Right form of verbs for HSC Students

Skint TV teen to be sentenced

Moondru Mudichu 09-08-2017 – Polimer tv Serial

Parris out on $9,000 bail

TASK ERROR: storage migration failed: block job (mirror) error:...

Electronic Bank Statement field Assignment (ZUONR) missing alphabets from...

गर्मी पर स्टेटस – Funny Summer Status in Hindi for Whatsapp

PSM I question: Product Backlog item considered complete

Karimnagar District Police Office Mobile Numbers List in Telangana State

Griffith faces three more offences

NCERT Solutions for Class 9th Sanskrit Chapter 2 अविवेकः परमापदां पदम्

Shatta Wale – You Shock Me (Prod. by Willis Beatz)

09g927750** 6 speed transmission TCM VAG original firmware files

The 10 Wyoming Cities With The Largest Black Population For 2021

More things we have to put up with: when NOT to raise hell with Disclosure

Karnataka SSLC 10th Exam Time Table 2016 (www.kseeb.kar.nic.in)

Scripting Tracker - Development Tool for SAP GUI Scripting

newbie need guide - help - read flash xc2287-96F with dap miniwiggler