Over the years, I created several custom solutions to help identify Data Manipulation Language (DML) changes and track historical changes. While effective, these solutions were sometimes cumbersome to use and maintain. SQL Server 2012 eliminates the need for such custom solutions, making life easier for everyone. In SQL Server 2012, there are two features that can help you track changes the painless way: change data capture and change tracking.

For the most part, the change data capture and change tracking features are interchangeable, with one exception. Both features utilize a synchronous tracking mechanism, which means SQL Server is minimally impacted. Table 1 shows what can be tracked with the two features. As you can see, the main difference between them is historical data tracking. Only the change data capture feature lets you track the actual data that was changed.

Item Tracked Description Change Data Capture Feature Change Tracking Feature
Table 1: What the Features Can Track
DML changes Inserts, updates, or deletes Yes Yes
Historical data Data changes over time Yes No
Column changes Add, alter, or drop columns Yes Yes
DML type Inserts, updates, or deletes Yes Yes

The Logistics

Before you can use the change data capture and change tracking features, you need to do the following.

Execute sys.sp_cdc_enable_db. Although the tracking will be taking place at the table level, you need to make the database aware of this capability. You can do so by executing the sys.sp_cdc_enable_db system stored procedure, as shown below:

EXEC sys.sp_cdc_enable_db;

Execute sys.sp_cdc_enable_table. You need to make the table you want to track aware of the tracking by executing the sys.sp_cdc_enable_table system stored procedure. Here's its syntax:

EXEC sys.sp_cdc_enable_table
  [ @source_schema = ] 'source_schema',
  [ @source_name = ] 'source_name',
  [ @role_name = ] 'role_name'
    -- Specify the gating role if gating is desired
    -- or set to NULL if you don't want gating.
  [,[ @capture_instance = ] 'capture_instance'
    -- Name instance-specific changes.
  [,[ @supports_net_changes = ] supports_net_changes ]
  [,[ @index_name = ] 'index_name' | NULL ]
  [,[ @captured_column_list = ] 'captured_column_list' |
    NULL ]
    -- Use NULL if you want all columns captured.
  [,[ @filegroup_name = ] 'filegroup_name' ]
  [,[ @allow_partition_switch = ] 'allow_partition_switch' ]

As you can see, you need to provide information such as the source table's schema and name, who has permission to access the change data, and the gating role member if gating is desired. For more information about the syntax, see "sys.sp_cdc_enable_table (Transact-SQL)" in SQL Server 2012 Books Online (BOL).

After tracking is enabled on the table and database, information about any changes made to that table is written to system tables in that database. The tables will begin with the schema name cdc. In addition, the dbo.systranschemas table is automatically created to track replication schema changes.

Let's Get Started

To introduce you to the types of information that you can capture, I wrote a script that:

  • Creates a database named Learn_CDC.
  • Creates and populates a table named MyCDCPlay.
  • Enables the change data capture feature on the Learn_CDC database.
  • Enables the change data capture feature on the MyCDCPlay table.
  • Adds data to the MyCDCPlay table, which will be tracked by the change data capture feature.

Listing 1 contains this script, which you can download by clicking the Download the Code button near the top of the page.

Listing 1: Script to Create the Sample Database and Table
CREATE DATABASE Learn_CDC
GO

USE Learn_CDC
GO

CREATE TABLE MyCDCPlay (ID INT IDENTITY(1,1) PRIMARY KEY,
  cdccol1 VARCHAR(10) NOT NULL,
  cdccol2 CHAR(3) SPARSE NULL,
  cdccol3  VARCHAR(20) SPARSE NULL);
GO

INSERT INTO MyCDCPlay (cdccol1,cdccol2, cdccol3)
VALUES ('CDC1Play', 'YES', 'Some cool stuff')
GO

INSERT INTO MyCDCPlay (cdccol1)
VALUES ('CDC2Play')
GO

-- Make sure you have data.
SELECT * FROM MyCDCPlay;

-- Enable CDC on database.
USE Learn_CDC
GO
EXECUTE sys.sp_cdc_enable_db;
GO

-- Enable CDC on table. Make sure that
-- SQL Server Agent is running.

USE Learn_CDC;
GO
EXECUTE sys.sp_cdc_enable_table
  @source_schema = N'dbo'
  , @source_name = N'MyCDCPlay'
  , @role_name = NULL
GO

-- Insert more data, which will be tracked.

INSERT INTO MyCDCPlay (cdccol1,cdccol2, cdccol3)
VALUES ('CDC4Play', 'No', 'Some cool stuff')
GO

INSERT INTO MyCDCPlay (cdccol1)
VALUES ('CDC5Play')
GO
-- Make sure you have data.
SELECT * FROM MyCDCPlay;

After you run the code in Listing 1, you'll see a folder structure similar to that shown in Figure 1.

Figure 1: Reviewing the System Tables Created by the Change Data Capture Feature

How to Access the Information About the Tracked Changes

As I mentioned previously, the script added data to the MyCDCPlay table after the change data capture feature was enabled. To see the information that was captured by the change data capture feature, you can query the system tables created by the feature. Note that SQL Server 2012 BOL recommends that you don't query system tables directly. Table 2 provides the stored procedures and functions that SQL Server 2012 BOL recommends you use to query these system tables.

System Table Query With
Table 2: Recommended Ways to Query the cdc System Tables
cdc.captured_columns sys.sp_cdc_get_source_columns stored procedure
cdc.change_tables sys.sp_cdc_help_change_data_capture stored procedure
cdc.<capture instance>_CT cdc.fn_cdc_get_all_changes_<capture_instance> and cdc.fn_cdc_get_net_changes_<capture_instance> functions
cdc.ddl_history sys.sp_cdc_get_ddl_history stored procedure
cdc.index_columns sys.sp_cdc_help_change_data_capture stored procedure
cdc.lsn_time_mapping sys.fn_cdc_map_lsn_to_time and sys.fn_cdc_map_time_to_lsn system functions

To start, you can query the cdc.captured_columns table to see the information about the columns that were tracked. Figure 2 shows the results of this query.

Figure 2: Learning About the Columns That Were Tracked

Querying the cdc.change_tables table gives you information about the table. Figure 3 shows the results of this query. Note that the capture_instance parameter wasn't specified when the table was enabled with sys.sp_cdc_enable_table, so the table uses the default schema and the default table name.

Figure 3: Learning About the Table That Was Tracked

Querying the cdc.dbo_MyCDCPlay_CT table gives you information about the data changes. As Figure 4 shows, this is the information that was specified during the two INSERT operations.

Figure 4: Learning About the Data That Was Changed

At this point, let's update the data in the MyCDCPlay table by running the command:

UPDATE MyCDCPlay SET cdccol2 = 'be'
  WHERE cdccol1 = 'CDC4Play'

Figure 5 shows the results. Notice that the results include both the historical data and new data. You can collate them to know exactly what changed.

Figure 5: Updating the Data in the Table

Querying the cdc.ddl_history table gives you information about any schema changes. However, in this case, the script didn't make any schema changes, so you wouldn't get any results. Let's make some schema changes to see what happens:

ALTER TABLE MyCDCPlay ADD cdccol5 varchar(30) Sparse NULL
ALTER TABLE MyCDCPlay ALTER COLUMN cdccol1 VARCHAR(50)
  NOT NULL

If you now query the cdc.ddl_history table, you'll get results like that in Figure 6. As you can see, the schema changes are captured, including when the changes occurred.

Figure 6: Learning About the Schema Changes

The last three tables you can query are:

  • cdc.index_columns. Querying the cdc.index_columns table provides information about the indexed columns associated with the table that was changed.
  • cdc.lsn_time_mapping. Querying the cdc.lsn_time_mapping table provides information about the time mapping for the changes.
  • dbo.systranschemas. Querying the dbo.systranschemas table shows you the schema changes and the type of schema that was changed.

The Downside

There are a few shortcomings with the change data capture and change tracking features. Here are some shortcomings that I've come across:

  • The features provide a table-by-table solution. If you have an environment with hundreds tables, it would be cumbersome to implement. This usually pushes people to use third-party tools.
  • Information about the user who made the change isn't captured. Microsoft recommends the use of SQL Server's audit features for this capability.
  • Synchronous operations contribute to slow data commits.
  • Changes to computed columns aren't tracked. In addition, there are limitations when tracking changes in columns containing XML, Sparse, Timestamp, and BLOB data types. For more information about these limitations, see "Track Data Changes (SQL Server)" in SQL Server 2012 BOL.

Document Changes and More

The change data capture and change tracking features let you easily track changes to your tables. You can use these features to not only document changes but also create back-end systems for applications. For example, you could use the features as part of a parcel tracking system that keeps track of where parcels are located at any given point during the shipping process. You could even include a mechanism to evaluate the system and determine where efficiency can be improved.

*************************************************************************************

Leonard MwangiLeonard Mwangi is a design architect with Alexander Open Systems, concentrating mostly on information worker solutions and applications integration. Leonard has more than 12 years of experience in the IT industry, primarily focusing on SQL Server and SharePoint.
Twitter: @lmundia
LinkedIn: http://www.linkedin.com/in/lmundia
Blog: http://www.geekswithblogs.net/leonardm