Tracking Data Changes: With Temporal Tables and More
Possible Whys
• Audit trail - who changed what when
• Analysis - maintaining a data warehouse
• Time travel - reconstructing state of the data at a point in time in the past
• Data sync - one-way and two-way synchronization
• Repairing row-level data - recovering from accidental data changes and
application errors
TRACKING DATA CHANGES
OVERVIEW
• Part of the ANSI SQL 2011 standard. First introduced with SQL Server 2016
• Designed to keep a full history of data changes at the row level using a
‘current’ table and a history table
• System-versioned because the start and end dates for each row are managed by
the system
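To make the current/history pairing concrete, here is a minimal T-SQL sketch. The table name dbo.Employee, its columns, the history table name, and the timestamp are all hypothetical and chosen only for illustration.

```sql
-- Hypothetical example table; names and columns are assumptions.
CREATE TABLE dbo.Employee
(
    EmployeeId INT            NOT NULL PRIMARY KEY CLUSTERED,
    Name       NVARCHAR(100)  NOT NULL,
    Salary     DECIMAL(10, 2) NOT NULL,
    -- Period columns are populated by the system, not by the application.
    ValidFrom  DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmployeeHistory));

-- "Time travel": the row as it looked at a past point in time (UTC).
SELECT EmployeeId, Name, Salary, ValidFrom, ValidTo
FROM dbo.Employee FOR SYSTEM_TIME AS OF '2024-01-01T00:00:00'
WHERE EmployeeId = 1;
```

Updates and deletes against dbo.Employee move the previous row version into dbo.EmployeeHistory automatically; the application only ever writes to the current table.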
Good for
• Audit trail
• Point in time analysis
• Maintaining a data warehouse
• Repairing row-level data
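For the row-level repair case, one possible pattern is to read the row as it existed before the bad change and copy the old values back. This is a sketch only, assuming a hypothetical system-versioned table dbo.Employee with EmployeeId and Salary columns; the timestamp is an illustrative value.

```sql
-- Restore one column of one row to its value at a known-good point in time.
-- dbo.Employee, its columns, and the timestamp are assumptions.
UPDATE e
SET    e.Salary = h.Salary
FROM   dbo.Employee AS e
JOIN   dbo.Employee FOR SYSTEM_TIME AS OF '2024-01-01T00:00:00' AS h
       ON h.EmployeeId = e.EmployeeId
WHERE  e.EmployeeId = 1;
```

The repair is itself an ordinary update, so it is recorded in the history table like any other change.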
Not so good for
• Data synchronization
• Short term data history requirements
• Column level change tracking
• Tables that have frequent schema changes
TEMPORAL TABLES - DEMO
• First introduced in SQL Server 2008, Change Data Capture (CDC) captures DML
changes on a tracked table. Enterprise edition only before SQL Server 2016 SP1
• Uses SQL Server Agent jobs to asynchronously read the transaction log to track
and record DML transactions
• When enabling CDC on a table, an additional table is created containing the
same columns as tracked table along with metadata needed to understand the
changes
• Table-valued functions are also created to allow access to the change history
• CDC was written by the SQL Server Replication team so it works with
transactional replication
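Enabling CDC is a two-step process: first the database, then each table. The sketch below assumes a hypothetical database MyDatabase and table dbo.Employee that has a primary key.

```sql
USE MyDatabase;  -- hypothetical database name

-- Step 1: enable CDC for the database (creates the cdc schema and metadata).
EXEC sys.sp_cdc_enable_db;

-- Step 2: enable CDC for one table.
EXEC sys.sp_cdc_enable_table
    @source_schema        = N'dbo',
    @source_name          = N'Employee',  -- hypothetical table
    @role_name            = NULL,         -- no gating role in this sketch
    @supports_net_changes = 1;            -- needs a primary key or unique index
```

With the default capture instance name, this should produce a change table named cdc.dbo_Employee_CT and the functions cdc.fn_cdc_get_all_changes_dbo_Employee and cdc.fn_cdc_get_net_changes_dbo_Employee.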
CHANGE DATA CAPTURE
Good for
• Maintaining a data warehouse
• Short term audit trail
• Data synchronization (using net changes)
How it works
SQL Server Agent jobs
• One used to populate the change tables
• One responsible for change table cleanup
• SQL Server Agent must be running for data capture and clean up
to work
Transaction logs
• The capture job needs to be running for the transaction log to get
truncated; log records it has not yet processed cannot be truncated
• Change data is not available until after the change is committed to the
source table and the capture job has processed the related log entries
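A quick way to check on the two jobs and on whether CDC is holding up log truncation might look like the following. MyDatabase is a hypothetical database name.

```sql
USE MyDatabase;  -- hypothetical database name

-- List the capture and cleanup jobs and their settings for this database.
EXEC sys.sp_cdc_help_jobs;

-- If the capture job is stopped or behind, log_reuse_wait_desc typically
-- reports REPLICATION, because CDC shares the replication log reader.
SELECT name, log_reuse_wait_desc
FROM   sys.databases
WHERE  name = N'MyDatabase';
```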
CHANGE DATA CAPTURE - DEMO
Validity interval
• Changes captured by CDC have a finite lifespan called the validity interval.
• Important because the extraction interval for a request must be fully covered by this time period.
• Default retention is 3 days, specified in minutes (4320)
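The validity interval can be respected by asking CDC for its own low and high LSN bounds before querying. This sketch assumes a capture instance named dbo_Employee (hypothetical); the new retention value is illustrative only.

```sql
-- Bounds of the validity interval for this capture instance.
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn(N'dbo_Employee');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

-- The requested range must sit inside [@from_lsn, @to_lsn].
SELECT *
FROM   cdc.fn_cdc_get_all_changes_dbo_Employee(@from_lsn, @to_lsn, N'all');

-- Extend retention from the default 4320 minutes (3 days) to 7 days.
-- The cleanup job has to be restarted before the new value is used.
EXEC sys.sp_cdc_change_job
    @job_type  = N'cleanup',
    @retention = 10080;
```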
Metadata
• __$start_lsn identifies the commit log sequence number (LSN) that was assigned to the change.
• __$end_lsn not supported; always NULL in SQL Server 2012. Future compatibility not guaranteed.
• __$seqval used to order changes that occur in the same transaction
• __$operation records the operation associated with the change: 1 = delete, 2 = insert, 3 = update
(before), and 4 = update (after)
• __$update_mask variable bit mask with one defined bit for each captured column.
• Insert and delete - all bits set. Update - bits set correspond to changed columns
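The metadata columns can be decoded in a query. The sketch below assumes a capture instance named dbo_Employee with a captured Salary column; both names are hypothetical.

```sql
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn(N'dbo_Employee');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

-- Ordinal of the Salary column inside the update mask.
DECLARE @salary_ordinal INT =
    sys.fn_cdc_get_column_ordinal(N'dbo_Employee', N'Salary');

SELECT
    __$start_lsn,
    __$seqval,
    CASE __$operation
        WHEN 1 THEN 'delete'
        WHEN 2 THEN 'insert'
        WHEN 3 THEN 'update (before)'
        WHEN 4 THEN 'update (after)'
    END AS operation,
    -- 1 when the Salary bit is set in the mask for this change row.
    sys.fn_cdc_is_bit_set(@salary_ordinal, __$update_mask) AS salary_bit_set
FROM cdc.fn_cdc_get_all_changes_dbo_Employee(@from_lsn, @to_lsn, N'all update old')
ORDER BY __$start_lsn, __$seqval;
```

The 'all update old' row filter is what returns the before-image rows (operation 3); with plain 'all', only the after-image of each update is returned.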
CHANGE DATA CAPTURE - DEMO
Requirements
• Does not play nice with memory-optimized tables
• SQL Server Agent must be running
• Net changes: source table has a primary key or a unique index referenced by
index_name when setting up CDC on the table.
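As an illustration of the net changes requirement, the sketch below enables CDC on a table that has no primary key by pointing @index_name at a unique index. The table dbo.Product and the index UX_Product_Sku are hypothetical, and CDC is assumed to be enabled on the database already.

```sql
-- Hypothetical table without a primary key, but with a unique index.
CREATE TABLE dbo.Product
(
    Sku  NVARCHAR(20)  NOT NULL,
    Name NVARCHAR(100) NOT NULL
);
CREATE UNIQUE INDEX UX_Product_Sku ON dbo.Product (Sku);

EXEC sys.sp_cdc_enable_table
    @source_schema        = N'dbo',
    @source_name          = N'Product',
    @role_name            = NULL,
    @index_name           = N'UX_Product_Sku',  -- identifies rows for net changes
    @supports_net_changes = 1;

-- Run later, once the capture job has processed some changes:
-- one row per changed source row, reflecting its final state in the range.
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn(N'dbo_Product');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

SELECT *
FROM   cdc.fn_cdc_get_net_changes_dbo_Product(@from_lsn, @to_lsn, N'all');
```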
CHANGE TRACKING
How it works
• Once enabled, any DML statement will cause change tracking information for
each modified row to be recorded.
• Each table enabled for Change Tracking has an internal on-disk table created
on the same filegroup. This table is used to record table change versions and
the rows that have changed since a particular version.
• When a change occurs the row’s primary key, list of impacted columns
(optional) and DML command are recorded in the internal tracking table.
• An auto cleanup thread purges expired content of the internal tracking tables
based on the retention period setting of the database.
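Change Tracking is switched on at the database level (where retention and auto cleanup are set) and then per table. A minimal sketch, assuming a hypothetical database MyDatabase and table dbo.Employee with a primary key; the retention value is illustrative.

```sql
-- Database level: retention period and the auto cleanup thread.
ALTER DATABASE MyDatabase
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

-- Table level: optionally record which columns were updated.
ALTER TABLE dbo.Employee
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);
```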
CHANGE TRACKING
Good for
• Data synchronization
Requirements
• The database compatibility level must be set to 90 (SQL 2005) or greater.
• Snapshot isolation is strongly recommended to help ensure that all change
tracking information is consistent.
• Tables being tracked must have a primary key.
• DDL operations are not tracked
• Deletes caused by a TRUNCATE statement are not tracked. However, this will
cause the minimum valid version to be updated. Client applications would
then need to re-initialize their data.
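A typical synchronization pass could follow the shape below. This is a sketch under assumptions: dbo.Employee (with EmployeeId as primary key) is hypothetical, @last_sync_version stands in for a value the client stored after its previous sync, and ALLOW_SNAPSHOT_ISOLATION is assumed to be ON for the database already.

```sql
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;

DECLARE @last_sync_version BIGINT = 0;  -- assumed: persisted by the client
DECLARE @current_version   BIGINT = CHANGE_TRACKING_CURRENT_VERSION();

IF @last_sync_version < CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID(N'dbo.Employee'))
BEGIN
    -- Changes were cleaned up (or the table was truncated): incremental
    -- sync is no longer safe and the client must re-initialize.
    RAISERROR(N'Change tracking data expired; re-initialize the client.', 16, 1);
END
ELSE
BEGIN
    -- Rows changed since the last sync, joined to their current values.
    -- Deleted rows come back with NULL data columns.
    SELECT ct.EmployeeId,
           ct.SYS_CHANGE_OPERATION,   -- I, U, or D
           ct.SYS_CHANGE_VERSION,
           e.Name,
           e.Salary
    FROM   CHANGETABLE(CHANGES dbo.Employee, @last_sync_version) AS ct
    LEFT JOIN dbo.Employee AS e
           ON e.EmployeeId = ct.EmployeeId;

    -- The client would store @current_version as its next @last_sync_version.
    SELECT @current_version AS next_sync_version;
END

COMMIT TRANSACTION;
```

Running the version check and the change query in one snapshot transaction keeps them consistent with each other, which is the reason snapshot isolation is recommended.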
TRIGGERS
Good for
• Auditing
• One audit table per user table or one audit table for all user tables
• Anything if you want to put the work into building the solution
• Any environment that is SQL Server 2005 or earlier
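As an illustration of the one-audit-table-per-user-table approach, here is a minimal trigger sketch. The table dbo.Employee, its columns, and the audit table layout are hypothetical, and the sketch assumes the primary key value is never updated. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical audit table for dbo.Employee.
CREATE TABLE dbo.EmployeeAudit
(
    AuditId    BIGINT IDENTITY(1, 1) PRIMARY KEY,
    EmployeeId INT            NOT NULL,
    Operation  CHAR(1)        NOT NULL,  -- I, U, or D
    OldSalary  DECIMAL(10, 2) NULL,
    NewSalary  DECIMAL(10, 2) NULL,
    ChangedBy  SYSNAME        NOT NULL DEFAULT SUSER_SNAME(),  -- who
    ChangedAt  DATETIME       NOT NULL DEFAULT GETUTCDATE()    -- when
);
GO

CREATE TRIGGER dbo.trg_Employee_Audit
ON dbo.Employee
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- inserted holds new row images, deleted holds old ones; a row present
    -- in both was updated (assuming the key itself did not change).
    INSERT INTO dbo.EmployeeAudit (EmployeeId, Operation, OldSalary, NewSalary)
    SELECT COALESCE(i.EmployeeId, d.EmployeeId),
           CASE
               WHEN i.EmployeeId IS NOT NULL AND d.EmployeeId IS NOT NULL THEN 'U'
               WHEN i.EmployeeId IS NOT NULL THEN 'I'
               ELSE 'D'
           END,
           d.Salary,
           i.Salary
    FROM   inserted AS i
    FULL OUTER JOIN deleted AS d
           ON d.EmployeeId = i.EmployeeId;
END;
```

The trigger is set-based, so a single multi-row statement produces one audit row per affected row.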
The tools available natively in Microsoft SQL Server to track data changes have
gotten better and better over time. With SQL Server 2016 SP1, most if not all of
these tools are available to everyone with a SQL Server Standard license or
better. There is really no reason why you should be suffering with a system that
does a poor job of tracking changes in your DB.
In addition to the tools that are offered natively in SQL Server, there are also
third-party vendors that provide off-the-shelf solutions for tracking data
changes. For example, one of the sponsors of today’s event, ApexSQL, has a
product called ApexSQL Audit that tracks and reports events on SQL Server by
auditing access and changes to the SQL Server instance and its objects.