Change data capture (CDC) is a feature available only in Enterprise edition of SQL Server 2008 R2. It can be used for DML auditing purpose or to move data changes that occurred in one database to a different database. Often this is the case when you want to keep away the reporting database (OLAP) from your transactional database (OLTP) to reduce performance issues.
By default CDC is not enabled on the database hence it needs to be enabled on the database level and then on the tables that you want the changes to be captured and pushed to the reports database. Please note that the command to enable CDC on database EXEC sys.sp_cdc_enable_db would return the following error
• If the database is backed up from an instance of SQL Server 2008 R2 and restored on another computer.
• If the database is attached from a different instance with a detached copy from another computer.
• If the statement or the module is executed as the dbo user.
• If the owner of the database is a domain user or a SQL Server authorization login.
Could not update the metadata that indicates database AdventureWorks2008R2 is enabled for
Change Data Capture. The failure occurred when executing the command ‘SetCDCTracked(Value = 1)’.
The error returned was 15517: ‘Cannot execute as the database principal because the principal “dbo”
does not exist, this type of principal cannot be impersonated, or you do not have permission.’.
Use the action and error to determine the cause of the failure and resubmit the request.
To fix this we will need to change the database owner to sa.
EXEC sp_changedbowner 'sa'
Below I have attached the screenshots that specify the commands that are used to enable change data capture on database and the source tables(tables that we want to implement CDC). Usually DBAs use SSIS packages or stored procedures to pull data from CDC tables(tables into which the changes are pushed by CDC).
One interesting thing with CDC tables is that they are not directly visible and we have to use CDC functions to query the CDC tables. There are 2 types of these functions
1. cdc.fn_cdc_get_net_changes_<schemaname>_<tablename> 2. cdc.fn_cdc_get_all_changes_<schemaname>_<tablename>
one would retrieve only the net changes from the CDC tables and the other would retrieve all changes from the CDC tables. Inserts, deletes are recorded only once in the CDC tables unlike updates which are inserted twice representing the record before the update and after the update. The functions take the the initial and final log sequence numbers as parameters along with the ‘all update old’ or ‘all’ parameter. The ‘all update old’ retrieves both before and after records for an update where as ‘all’ retrieves the records that are updated only.
CDC uses 2 sql agent jobs.
1.cdc._capture: This job runs always to ensure that all changes in the source table are captured in the CDC tables. It keeps calling the sys.sp_MScdc_capture_job.
2.cdc._cleanup: This can be put on a schedule but by default runs everyday once. This job calls the sys.sp_MScdc_cleanup_job whenever it is run.
The commands to enable the CDC on database and tables and commands to call the CDC functions to view the data changes are as shown in the below screenshots. The last screenshot has the CDC capture and cleanup sql agent jobs.
Hope this helps!