Try OpenEdge Now
skip to main content
SQL Development
Change Data Capture : Use cases : Propagating change details to the Data Warehouse, and using CDC_get_changed_columns()
 

Propagating change details to the Data Warehouse, and using CDC_get_changed_columns()

This Use Case is applicable when a Data Warehouse (DW) table mirrors, to a significant degree, the source table S, and contains some subset of the columns in S. In this case, the change values to send to DW are driven by detailed examination of each Change table record in a transaction. This use case also shows how the SQL CDC function, CDC_get_changed_columns() can be used to understand and process change Data.
The change data values are accessed in 2 steps
1. To get the source table number
2. To get the changes from the Change table.
Step 1 is the same as the example in Getting change data values for a source table.
For example, if the Change table contains 12 fields being tracked from the source table, with names, in addition to some key Identifying fields:
*fld1, fld2, …, fldn
*ID_f1, ID_f2, ID_f3 – identifying fields.
Step 2 gets Change data value from the Change table, in the change sequence order. Changes must be send to the Data Warehouse in the change sequence order so that the Data Warehouse data reflects the corresponding rows in the source table with accuracy.
The following query uses the SQL CDC function, CDC_get_changed_columns(), to get a list of the columns changed on an UPDATE operation. INSERT and DELETE operations always have Change data for every change-tracked field, as well as for Identifying Fields.
select fld1, fld2, fld3, fld4, fld5, fld6, fld7, fld9, fld10, fld11, fld12,
ID_f1, ID_f2, ID_f3,
"_Operation",
CDC_get_changed_columns (pub.CDC_Customer, _Change-FieldMap) as gcc
from pub."_Cdc-Change-Tracking" ct
inner join
pub.CDC_Customer c -- assumed Change table name.
on ct."_Change-Sequence" = c."_Change-Sequence"
where ct."_ Source-Table-Number" = ? --from 1st query.
and c."_Operation" != 3 -- optional - exclude Before Update images.

order by ct."_ Source-Table-Number", ct."_Change-Sequence";
-- Order By keys map to index _Sequence-Id
*Sometimes the data sent to DW contains a small amount of inconsistency due to incomplete transactions. No matter how Change data is selected, time-bounded or not, it is always possible that the selected changes contain incomplete transactions.
*Transaction completion is not written to the Change data.
*As all Change table data is committed, data values are always valid, even if inconsistent with respect to transactions.
*This query retrieves all the tracked fields in the Change table. The fields that do not contain changes, and do not contain Identifying fields, will have the Null value in the Result Set.
*The application learns the names of the columns with changes from an UPDATE from the value of the function CDC_get_changed_columns().
*Insert and Delete operations write all change-tracked column values to the Change table.
*Some ETL applications might want to get only change for certain operations, such as INSERT or DELETE. This can be easily accomplished by adding a Where clause predicate which chooses the desired values of “_Operation” in the Change table.
This query could be embedded within application logic as shown in the following example:
do
query #2 -- this is query above, in Step #2.
exitloop if no data
Let RS = result set from query #1
Let stmt =
switch _Operation
case Create – Build Insert stmt for fields 1..12,
plus Identifying fields,
with values from parameters
Set parameter values for Insert from RS data
case Delete – Build Delete stmt with
Identifying fields = parameters
Set parameter values from RS ID_f* fields
case Update – Build partial Update for DW table
for each field f in gcc: -- changed columns
Build set f = parameter, remember parameter
Build Where clause for ID_F* fields = parameters
Set parameter values for “set” from RS data
Set parameter values from RS ID_f* fields
end switch


exec sql Prepare of generate stmt
-- set all parameters for generated stmt
switch _Operation
case Create – Set parameter values for Insert from RS data
case Delete – Set parameter values from RS ID_f* fields
case Update – Set parameter values for “set” from RS data
Set parameter values from RS ID_f* fields
end switch


-- Now execute prepared stmt to send changes to DW
execute generated stmt

enddo
The UPDATE statement that is generated for a particular result set row might look like the following example:
UPDATE pub.my_target_tbl
SET fld1 = ?,
fld2 = ?
WHERE ID_f1 = ? AND ID_f2 = ? and ID_f3 = ?