What Is Change Data Capture (CDC)?

Change data capture (CDC) is a software process or technology that identifies and tracks changes to data stored in a database, such as inserts, updates, and deletes. While a database is useful for storing the latest state of data, CDC preserves the various states of data over time by providing an audit trail, and it can provide incremental changes to other repositories or applications.

As a basic example, CDC lets you look up in December what your home address was in January, even if you moved in between and the database now holds only your current address.

Figure: The change data capture (CDC) process using the publish/subscribe method. Multiple databases and applications can subscribe to the change data.

How Does Change Data Capture Work?

CDC records the database operations that change rows, such as inserts, updates, and deletes, and makes a record of each change available either within the database itself or to other applications that rely on the data. Change data capture tools typically rely on the database’s transaction log, which the database maintains internally to track record changes for system recovery. The tools read that log to deliver database changes to an external system.

What Are Common CDC Methods?

There are several approaches a system can use to capture changes in data. Timestamps are one of the most popular CDC methods: many systems already record when a row was created and most recently modified, so a CDC process can query for rows whose timestamp is newer than its previous run.
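The timestamp approach can be sketched as a polling query that remembers a high-water mark. This is a minimal illustration using Python's built-in `sqlite3` module; the `customers` table, its `updated_at` column, and the `fetch_changes_since` helper are assumptions made for the example, not part of any particular CDC product.

```python
import sqlite3

def fetch_changes_since(conn, last_seen):
    """Return rows modified strictly after last_seen, oldest first."""
    cur = conn.execute(
        "SELECT id, address, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, address TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "12 Oak St", 100), (2, "9 Elm Ave", 105)],
)

# First poll: every row is newer than the initial high-water mark of 0.
first_poll = fetch_changes_since(conn, 0)
high_water_mark = max(row[2] for row in first_poll)

# A row is updated; the application also bumps its timestamp.
conn.execute(
    "UPDATE customers SET address = ?, updated_at = ? WHERE id = ?",
    ("77 Pine Rd", 110, 1),
)
second_poll = fetch_changes_since(conn, high_water_mark)
high_water_mark = max(row[2] for row in second_poll)

# A deleted row leaves no timestamp behind, so this poll finds nothing.
conn.execute("DELETE FROM customers WHERE id = ?", (2,))
third_poll = fetch_changes_since(conn, high_water_mark)
```

In this sketch the second poll returns only the updated row, while the delete goes unnoticed in the third poll, which is one reason timestamp polling is often combined with other methods.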

Database transaction logs are also a resource for CDC. Log scanners can identify any changes in these transaction logs. As long as the log scanner can interpret the log, this can be an ideal solution for CDC because it has little impact on the underlying database, delivers changes with low latency, and ensures transaction integrity because every change is tracked in order.
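To make the idea of a log scanner concrete, here is a toy sketch that replays an ordered change log onto a replica. The `LogEntry` structure, the `lsn` (log sequence number) field, and the helper functions are hypothetical; real transaction logs use database-specific binary formats that a scanner must know how to decode.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class LogEntry:
    lsn: int                     # hypothetical, strictly increasing sequence number
    op: str                      # "insert", "update", or "delete"
    key: int
    value: Optional[str] = None

def scan_log(log, after_lsn):
    """Yield entries newer than after_lsn, preserving log order."""
    for entry in log:
        if entry.lsn > after_lsn:
            yield entry

def apply_entry(replica, entry):
    """Apply a single change to an in-memory replica."""
    if entry.op in ("insert", "update"):
        replica[entry.key] = entry.value
    elif entry.op == "delete":
        replica.pop(entry.key, None)
    else:
        raise ValueError(f"unknown operation: {entry.op}")

log = [
    LogEntry(1, "insert", 1, "12 Oak St"),
    LogEntry(2, "insert", 2, "9 Elm Ave"),
    LogEntry(3, "update", 1, "77 Pine Rd"),
    LogEntry(4, "delete", 2),
]

replica = {}
last_lsn = 0
for entry in scan_log(log, last_lsn):
    apply_entry(replica, entry)
    last_lsn = entry.lsn  # remember the position so a restart can resume here
```

Because every entry is applied in log order, the replica ends in the same state as the source, and deletes are captured alongside inserts and updates.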

As event streaming has gained popularity, so has the publish/subscribe model of change data capture, in which database triggers log or publish change events to a table, and those changes are then shared with the CDC system. The series of updates that CDC delivers resembles a stream of data, which makes stream processing engines (such as Hazelcast Jet) a suitable technology for consuming CDC data.
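A trigger-based setup might look like the following sketch, which uses SQLite triggers through Python's built-in `sqlite3` module. The `customers` and `customer_changes` tables, the trigger names, and the `read_events` helper are all assumptions for illustration; a production system would publish these events to subscribers rather than read them in the same process.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, address TEXT);

-- Change table that the triggers write to (hypothetical layout).
CREATE TABLE customer_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT NOT NULL,
    id INTEGER NOT NULL,
    address TEXT
);

CREATE TRIGGER customers_ins AFTER INSERT ON customers BEGIN
    INSERT INTO customer_changes (op, id, address)
    VALUES ('insert', NEW.id, NEW.address);
END;

CREATE TRIGGER customers_upd AFTER UPDATE ON customers BEGIN
    INSERT INTO customer_changes (op, id, address)
    VALUES ('update', NEW.id, NEW.address);
END;

CREATE TRIGGER customers_del AFTER DELETE ON customers BEGIN
    INSERT INTO customer_changes (op, id, address)
    VALUES ('delete', OLD.id, NULL);
END;
""")

def read_events(conn, after_seq):
    """Return change events with a sequence number above after_seq."""
    return conn.execute(
        "SELECT seq, op, id, address FROM customer_changes "
        "WHERE seq > ? ORDER BY seq",
        (after_seq,),
    ).fetchall()

conn.execute("INSERT INTO customers VALUES (1, '12 Oak St')")
conn.execute("UPDATE customers SET address = '77 Pine Rd' WHERE id = 1")
conn.execute("DELETE FROM customers WHERE id = 1")

events = read_events(conn, 0)
```

Each subscriber can keep its own `after_seq` position, so several consumers can read the same ordered stream of events independently. Note that the earlier address survives in the change table even after the row itself is gone.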

Other CDC methods inspect version numbers or status values stored on rows to detect which rows have changed.
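As a rough sketch of the version-number idea, assume each row carries a `version` field that the application increments on every update, and the CDC consumer remembers the last version it saw per row. The field name and the `changed_rows` helper are hypothetical.

```python
def changed_rows(source_rows, seen_versions):
    """Return rows whose version is higher than the last version seen."""
    changed = []
    for row in source_rows:
        if row["version"] > seen_versions.get(row["id"], 0):
            changed.append(row)
    return changed

source = [
    {"id": 1, "address": "77 Pine Rd", "version": 2},  # updated once
    {"id": 2, "address": "9 Elm Ave", "version": 1},   # unchanged
]
seen = {1: 1, 2: 1}  # versions recorded during the previous run

delta = changed_rows(source, seen)

# Record the new versions so the next run skips these rows.
for row in delta:
    seen[row["id"]] = row["version"]
```

Only the row whose version advanced is reported, and a second run against the same data reports nothing.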

Is ETL a Method of Change Data Capture?

ETL, the process of extracting, transforming, and loading data, can often bring new or updated data from a source system to a database or other application. However, ETL is not a change data capture process, because ETL is typically used to move data from one location to another and transform it along the way. If an ETL process is used merely to make an exact, up-to-date copy of a data store in another location, CDC can be used instead. CDC can then reduce the resources the ETL process would otherwise consume, because it handles only the data that changed. Rather than pulling all data from a source system and recreating a database table from scratch, for example, a CDC process can identify only the new and changed data and propagate those additions and changes to the destination system.
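The resource difference can be illustrated with a small comparison between rebuilding a copy from scratch and applying only the changes. The function names, the tuple-based change format, and the row counts are assumptions chosen for the example.

```python
def full_reload(source):
    """Full-copy approach: rebuild the target from every source row."""
    target = dict(source)
    return target, len(source)  # second value: number of rows transferred

def apply_changes(target, changes):
    """CDC approach: apply only the inserted, updated, or deleted rows."""
    for op, key, value in changes:
        if op == "delete":
            target.pop(key, None)
        else:  # "insert" or "update"
            target[key] = value
    return len(changes)  # number of rows transferred

source = {i: f"address {i}" for i in range(1, 1001)}
target, rows_moved_full = full_reload(source)

# Two changes happen on the source after the initial copy.
source[42] = "77 Pine Rd"
del source[7]
changes = [("update", 42, "77 Pine Rd"), ("delete", 7, None)]

rows_moved_cdc = apply_changes(target, changes)
in_sync = target == source
```

In this sketch the initial copy moves 1,000 rows, while keeping the target in sync afterward moves only the 2 rows that changed.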
