Aggregation is a common pattern used in Enterprise Integration: System A (or several source systems) sends many messages that System B expects to receive as a single message, or as several messages grouped on a particular attribute (or set of attributes).
The most common way to approach this in BizTalk is with a Sequential Convoy orchestration that aggregates the messages; Microsoft provides a sample of how to do this in the SDK. This is a powerful pattern, but it has a few downsides:
- Sequential Convoys can become difficult to manage if they’re expected to run for a long time
- Complex subgrouping can multiply the headaches here – for example, if you have to aggregate messages for hundreds of destinations concurrently
- The destination message may become very large, to the point where BizTalk cannot optimally process it anymore – particularly if it is a large flat file message.
- Modifying the aggregate message may be challenging in a standard BizTalk map, especially if one message might be expected to modify a previously aggregated message.
In this post, we’ll look at two approaches to overcoming these challenges through a database-assisted aggregation pattern.
The advantages of this pattern are:
- Avoid the complexities of sequential convoys, especially convoys that may have to run for long periods of time
- Allow for complex subgrouping
- More flexible data modification options throughout aggregation
There are two versions of this pattern that I’ve been using. One involves storing XML blobs in a document-database model, and one involves writing data down to a table throughout the day. The document model works well if there’s no need to modify data, data is not going to a flat file destination, and you need to capture only limited metadata about the document. The table model works well for flat file based output, especially for larger flat files that would be prohibitive to work with in BizTalk in a single shot.
Document Model
In this scenario, whenever a message comes in that needs to be aggregated, it’s mapped to a stored procedure that takes an XML parameter along with any other metadata fields about that message that will be captured at the row level. This might be as simple as a Transaction ID, OriginID, and a date that will be used for grouping/sorting. The table and stored procedure would look like this:
CREATE TABLE tAggregatorSample
(
    TransactionID int,
    OriginId      int,
    DT            datetime,
    doc           XML
);
GO

CREATE PROCEDURE pInsertDoc
(
    @TransactionID int,
    @OriginID      int,
    @dt            DATETIME,
    @doc           XML
)
AS
BEGIN
    INSERT tAggregatorSample (TransactionID, OriginId, DT, doc)
    VALUES (@TransactionID, @OriginID, @dt, @doc);
END
A map to this procedure would look fairly standard, but the “doc” node would take its input from a scripting functoid that uses an xsl:copy-of instruction to copy the desired portion of the XML document. The following snippet would go in a scripting functoid set to Inline XSLT, with a single output link to the doc node:
<doc xmlns='http://schemas.microsoft.com/Sql/2008/05/TypedProcedures/pInsertDoc'>
  <!-- put it in a CDATA node to prevent BizTalk from confusing content in the mapper -->
  <xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
  <xsl:copy-of select="/path/to/node/to/copy" />
  <xsl:text disable-output-escaping="yes">]]</xsl:text>
  <xsl:text disable-output-escaping="yes">&gt;</xsl:text>
</doc>
The select attribute on the copy-of could point to any node in the document; the root node works for the entire document, but since we’re aggregating we probably want a child node that will later be re-wrapped in the root node. So, using the following XML:
<ns0:Root xmlns:ns0="http://ns-for-schema">
  <ns0:TransactionId>1</ns0:TransactionId>
  <ns0:OriginId>123</ns0:OriginId>
  <ns0:TransactionDate>2015-10-10</ns0:TransactionDate>
  <ns0:Transaction><ns0:Child /><ns0:Child />...</ns0:Transaction>
</ns0:Root>
We’d probably want the path to the “Transaction” node (for example, /*[local-name()='Root']/*[local-name()='Transaction'], which avoids depending on the prefix bindings in the generated XSLT).
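To make the end result concrete, here’s roughly what the equivalent hand-written call would look like for the sample message above (a sketch for illustration only; the CDATA wrapping added by the functoid is omitted):

EXEC pInsertDoc
     @TransactionID = 1,
     @OriginID      = 123,
     @dt            = '2015-10-10',
     @doc           = '<ns0:Transaction xmlns:ns0="http://ns-for-schema"><ns0:Child /><ns0:Child /></ns0:Transaction>';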
On the outbound side of things, when the aggregate is ready (or when a particular trigger message is received), a stored procedure like the following will extract the aggregated XML from the database:
CREATE PROCEDURE pExtractDoc
(
    @OriginId int,
    @dt       DATETIME
)
AS
BEGIN
    WITH XMLNAMESPACES('http://ns-for-schema' AS ns0)
    SELECT TOP 1
         TransactionID AS 'ns0:TransactionId'
        ,OriginId      AS 'ns0:OriginId'
        ,DT            AS 'ns0:TransactionDate'
        ,(SELECT doc AS '*'          -- just get the raw XML as is
          FROM tAggregatorSample t2
          WHERE t2.OriginId = @OriginId
          ORDER BY t2.DT
          FOR XML PATH(''), TYPE)    -- stuff the XML back into the parent
    FROM tAggregatorSample t1
    WHERE t1.OriginId = @OriginId
    FOR XML PATH('ns0:Root')
END
Invoking this as an XmlProcedure would give you back your aggregated document, ready for consumption by BizTalk (for example, by a send port filtering on messages received from the polling port).
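If you want to sanity-check the extraction outside of BizTalk, the procedure can be exercised directly (using the sample values from the message above):

EXEC pExtractDoc @OriginId = 123, @dt = '2015-10-10';
-- Returns a single ns0:Root document containing every aggregated ns0:Transaction
-- fragment stored for that OriginId, ordered by DT.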
Table Model
This model works well when a large aggregation is being built for a simple flat file. In this case, the scenario is a system that’s expecting a very large flat file, or perhaps one that needs previous records updated. Here, the inbound map and stored procedure are far more standard – map the relevant fields to the columns as desired, insert them, and you’re good. If the rows need to be modified, they can now be modified easily using standard SQL UPDATE statements.
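As a rough sketch of what this looks like (the table and column names here are hypothetical, chosen only for illustration), the staging table, the per-message insert, and a later correction might be:

CREATE TABLE tFlatFileStaging -- hypothetical staging table for the aggregate
(
    RecordId int IDENTITY PRIMARY KEY,
    OriginId int,
    DT       datetime,
    Field1   varchar(50),
    Field2   decimal(18,2)
);
GO

-- Each inbound message maps to a plain insert...
INSERT tFlatFileStaging (OriginId, DT, Field1, Field2)
VALUES (123, '2015-10-10', 'ABC', 10.00);

-- ...and a later message can adjust a previously aggregated record with a standard UPDATE.
UPDATE tFlatFileStaging
SET Field2 = 15.00
WHERE OriginId = 123 AND Field1 = 'ABC';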
On the outbound end, you have a couple of options. If the output is a large CSV file, you can use the BCP utility to export it very quickly; I’ve tested this with up to 500MB worth of raw flat file data, and it finishes in about a minute (forget even trying to process a message like that in BizTalk – the XML version would be several gigabytes). On the other hand, if you’re dealing with smaller quantities of data, you can export the data back to BizTalk via a stored procedure (either StronglyTyped or XmlProcedure as desired), map to your destination schema, and be on your way. Even in this case, the outbound mapping becomes much simpler and avoids the need to reconstruct the message for every update (which gets very expensive as the message grows in size).
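If you go the BCP route, the export can be a one-line command along these lines (the server, database, output path, and the hypothetical staging table from the sketch above are all placeholders; -c exports character data, -t sets the field terminator, and -T uses a trusted connection):

bcp "SELECT Field1, Field2 FROM MyDb.dbo.tFlatFileStaging ORDER BY DT" queryout "C:\exports\aggregate.csv" -c -t, -S MYSQLSERVER -T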
Some data retention policy should be put in place here – issuing a TRUNCATE after the last aggregated message is sent, or deleting rows as they get polled, for example. If there’s any desire to resubmit aggregated messages, the data could be kept for longer and resubmitted easily using the same process as the initial export.
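For example, assuming the tAggregatorSample table from earlier, retention could be as blunt or as targeted as needed:

-- Blunt: clear everything once the last aggregated message has been confirmed sent.
TRUNCATE TABLE tAggregatorSample;

-- Targeted: remove only rows that have already been exported for a given origin and date.
DELETE FROM tAggregatorSample
WHERE OriginId = 123 AND DT <= '2015-10-10';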