By Neil Stevenson

Principal Architect

With over 30 years of industry experience, Neil has designed, developed, debugged, and supported software systems for numerous customers large and small. Initially a C and assembler programmer, he has spent most of the last 20 years working in Java, with a focus on distributed systems, data grids, and stream processing. Neil is an occasional committer to the Hazelcast code base, with a special interest in GoLang.

Sep 8, 2022

Design Considerations When Using Transactionality

Database transactions often underpin online business transactions. If we compare business transactions to race cars, database transactions are the brakes. Just as the fastest driver is the one who uses the brakes the least, the quickest business transactions are the ones that depend on database transactions the least.

And, of course, you can’t eliminate brakes, as you still need to worry about safety. So, the goal is to use the right amount of braking (or, in our case, the right amount of database transactions).

This blog post will examine how your application can be faster without sacrificing safety.

Action and transaction

We will follow a familiar scenario, moving money between bank accounts.

The “Action” is the business event, the logical view.

Move $10 from account A to account Z.

The “Transaction” is the technical event, the implementation. This may include commit or rollback, be ACID or BASE, and is a way to ensure correctness.

The coding choice

The outcome of the action is that account A has $10 less, and account Z has $10 more.

A transaction is required if two conditions hold.

1. It is implemented as a two-step operation.
2. You wish it to appear as a one-step operation. (Atomicity)

Condition 1 is just the choice of book-keeping method.

Condition 2 is usually described as a requirement, but it may just be a wish.

Let’s review the alternatives.

Single-entry and double-entry accounting

As background, consider the classic accounting systems.

Single-entry dates to 3000BC; double-entry is newer, from 1494AD.

The single-entry system might have this list of actions for $50 deposits and the $10 transfer from above.

Item 1 : Account A : $50 deposit
Item 2 : Account Z : $50 deposit
Item 3 : Account A : $10 transfer : to : Account Z

In the double-entry system, actions are recorded twice. Each account has its list of actions.

Account A:
Item 1 : $50 deposit
Item 2 : $10 transfer : to : Account Z

Account Z:
Item 1 : $50 deposit
Item 2 : $10 transfer : from : Account A
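
To make the difference concrete, here is a minimal sketch of the two book-keeping styles as data structures. The field names (`item`, `kind`, `to`) and the function `to_double_entry` are hypothetical, chosen only for illustration; the point is that one transfer line in the single-entry journal becomes two lines in the double-entry ledgers.

```python
from collections import defaultdict

# Single-entry: one journal, one line per action (field names are illustrative).
journal = [
    {"item": 1, "account": "A", "kind": "deposit", "amount": 50},
    {"item": 2, "account": "Z", "kind": "deposit", "amount": 50},
    {"item": 3, "account": "A", "kind": "transfer", "amount": 10, "to": "Z"},
]


def to_double_entry(journal):
    """Derive per-account ledgers: a transfer becomes two lines, one per account."""
    ledgers = defaultdict(list)
    for entry in journal:
        if entry["kind"] == "deposit":
            ledgers[entry["account"]].append(("deposit", entry["amount"], None))
        elif entry["kind"] == "transfer":
            # One action, two records: this is the two-step update.
            ledgers[entry["account"]].append(("transfer to", entry["amount"], entry["to"]))
            ledgers[entry["to"]].append(("transfer from", entry["amount"], entry["account"]))
    return dict(ledgers)


ledgers = to_double_entry(journal)
```

Three journal lines turn into four ledger lines, which is where the need to keep two writes in step comes from.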

Double-entry is easier for humans as the volume of actions increases. Computer systems typically mirror business processes, and the implementation where one action requires two data updates naturally follows.

Implementation 1 – double-entry with transactions

Following from above, what is wrong with the classic approach, the ACID transaction?

Here it would be:

Start transaction.
Decrement account A balance by $10.
Increment account Z balance by $10.
Commit.
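
As an illustration only, the four steps above might look like this with Python's built-in `sqlite3` module and an in-memory database. The table name `account`, the helper names, and the choice of SQLite are assumptions for the sketch, not something the scenario prescribes.

```python
import sqlite3


def open_bank():
    """Create an in-memory database holding accounts A and Z with $50 each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 50), ("Z", 50)])
    conn.commit()
    return conn


def transfer(conn, source, target, amount):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
        )
        cursor = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
        )
        if cursor.rowcount != 1:
            # Failing here undoes the decrement above as well: all or nothing.
            raise LookupError(f"unknown account {target}")


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

A call such as `transfer(conn, "A", "Z", 10)` changes both rows together, while a transfer to an account that does not exist leaves both balances untouched.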

It’s an all-or-nothing approach and is very appealing.

Both accounts show their previous balance; then both show the new balance.

In the rare event of some IT crash, the transaction is rolled back.

So let’s review two things that are wrong with this approach.

Speed

While both data records are being updated, everyone else’s update access to account A and account Z must be suspended.

If you were to code this yourself, you’d use locks. Locks block other processing, which slows the application down.

It may be that nothing else is trying to update accounts A or Z at this time, so you might think nothing is delayed. But there is still the time cost to lock and unlock.

A transaction is essentially just locking, handled for you.
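
If you did code the locking yourself, a minimal sketch might look like the following. It assumes one in-process lock per account; the names `shared_balances`, `account_locks`, and `transfer` are hypothetical. Even when no other thread is competing, every transfer still pays for two lock acquisitions and two releases.

```python
import threading

shared_balances = {"A": 50, "Z": 50}
account_locks = {name: threading.Lock() for name in shared_balances}


def transfer(source, target, amount):
    if source == target:
        raise ValueError("source and target must differ")
    # Always lock in the same (sorted) order so that two opposite transfers
    # running at the same time cannot deadlock each other.
    first, second = sorted((source, target))
    with account_locks[first], account_locks[second]:
        # Nobody else can update either account until both writes are done.
        shared_balances[source] -= amount
        shared_balances[target] += amount
```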

Correctness & Isolation

Imagine while we ran the above transaction that someone else ran a query summing account balances, a scan of all account records.

First, you might start your transaction.

Then the query might obtain the balance of account A. Your transaction is incomplete, so the query gets the old value for account A ($50).

Then your transaction completes successfully.

Then the query might obtain the balance of account Z. Your transaction is complete, so the query gets the new value for account Z ($60).

So the query returns $110 instead of $100, even though your update was transactional.

Here the transaction has “write” isolation. Both writes happened atomically, so the transaction itself is correct. But a concurrent read from elsewhere sees an incorrect result. “Read” isolation would block the concurrent read while the transaction runs, meaning the data can only have one user at a time, which is unacceptable.
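
The interleaving described in this section can be replayed deterministically. The sketch below is a simulation, not a real database: `between_reads` is a hypothetical hook that stands in for the concurrent writer, running after the scan has read account A and before it reads account Z.

```python
accounts = {"A": 50, "Z": 50}


def scan_total(accounts, between_reads):
    """Sum balances one record at a time, like a scan with no read isolation."""
    total = accounts["A"]
    between_reads()  # the other party's transaction completes here
    total += accounts["Z"]
    return total


def atomic_transfer():
    # From the writer's point of view both updates land together.
    accounts["A"] -= 10
    accounts["Z"] += 10


skewed_total = scan_total(accounts, atomic_transfer)  # old A (50) + new Z (60)
quiet_total = scan_total(accounts, lambda: None)      # new A (40) + new Z (60)
```

Here `skewed_total` is 110 and `quiet_total` is 100: the same data, read with and without a transfer completing in the middle of the scan.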

There are many other such scenarios exposing logical flaws in transactions.

Implementation 2 – double-entry without transactions

Imagine we removed the transaction wrapper from the above, leaving two independent account updates for the one action.

What do we win? What do we lose?

We obviously win on speed. No locks are needed now.

Concurrent queries are no better or worse than before. The query may still return $100 or $110, depending on how the race plays out.

What we think we have lost is guaranteed consistency, but have we?

Guaranteed Consistency

The worry in the above scenario is that account A is updated and then the system fails, so account Z isn’t updated.

A transaction stops this, despite its other problems, as already noted.

What a transaction guarantees is immediate consistency.

Eventual consistency will frequently be acceptable.

We would expect failures to be rare. We would expect to know about them promptly. So on any such failure, we just run a one-off process to complete any half-done business action.
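
A minimal sketch of such a one-off repair process follows. It rests on an assumption that goes beyond the scenario: each action carries a `done` set recording which of its two steps has completed. The names `apply_transfer`, `repair`, and `fail_after_debit` are hypothetical.

```python
def apply_transfer(balances, action, fail_after_debit=False):
    """Two independent updates with no transaction around them."""
    if "debit" not in action["done"]:
        balances[action["source"]] -= action["amount"]
        action["done"].add("debit")
    if fail_after_debit:
        raise RuntimeError("simulated crash between the two updates")
    if "credit" not in action["done"]:
        balances[action["target"]] += action["amount"]
        action["done"].add("credit")


def repair(balances, actions):
    """One-off process: finish any business action that is only half done."""
    repaired = []
    for action in actions:
        if action["done"] != {"debit", "credit"}:
            apply_transfer(balances, action)  # skips steps already recorded
            repaired.append(action["id"])
    return repaired
```

In a real system the step marker and the balance change would need to be written as a single record update, or the steps made idempotent some other way; the sketch glosses over that.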

Implementation 3 – single-entry

The third approach would be to implement single-entry accounting, as machines can handle this at scale even if humans can’t.

Reviewing what we saw before, this is just an event journal!

Item 1 : Account A : $50 deposit
Item 2 : Account Z : $50 deposit
Item 3 : Account A : $10 transfer : to : Account Z

Each line item is a single line, either written or not.

We have consistency and no need for locks.

If we wish to know the current balance of account A, it’s just a query against the event journal. But if there is a lot of data, this query may take enough time to run that it is noticeable to the human eye. So we might instead go with a materialized view.
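
As a rough illustration, the balance query is a fold over the journal. The tuple layout `(sequence, account, kind, amount, counterparty)` and the function name `balance` are assumptions made for this sketch.

```python
# (sequence, account, kind, amount, counterparty): an illustrative layout.
JOURNAL = [
    (1, "A", "deposit", 50, None),
    (2, "Z", "deposit", 50, None),
    (3, "A", "transfer", 10, "Z"),
]


def balance(journal, account):
    """Compute an account's balance by scanning every journal entry."""
    total = 0
    for _seq, owner, kind, amount, counterparty in journal:
        if kind == "deposit" and owner == account:
            total += amount
        elif kind == "transfer":
            if owner == account:
                total -= amount  # money leaving this account
            if counterparty == account:
                total += amount  # money arriving in this account
    return total
```

The cost of each call grows with the length of the journal, which is what motivates keeping a precomputed balance instead.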

What is stored?

To review, what is stored with the different approaches?

We will store records for each account and all their transactions.

For double-entry, the account holds the current balance.

For single-entry, the account may not hold the current balance; we might calculate it when needed from the transactions.

Or, for single-entry, we might refresh the account with a balance using a materialized view.

Materializing a view

For single-entry, we might refresh the balance continuously or periodically.

A stream processing job could observe the event journal. When a new action is written, the affected accounts can be updated.

Or, a scheduled task could scan the event journal to do a similar thing.

For either mechanism, we need to consider failure. If we replay events, we need to know whether or not to apply them. Events are sequential, so this is as simple as recording the last sequence number that the balance relates to.
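
One possible shape for this, sketched under assumptions: each account record stores its balance together with the sequence number of the last event applied to it, events arrive in sequence order, and sequence numbers start at 1. The class name `BalanceView` and the event tuple layout are hypothetical.

```python
class BalanceView:
    """Materialized balances; each record is (balance, last applied sequence)."""

    def __init__(self):
        self.records = {}

    def _apply_delta(self, account, seq, delta):
        current, last_seq = self.records.get(account, (0, 0))
        if seq <= last_seq:
            return  # this account already reflects the event, so skip on replay
        # Balance and sequence number change together in one record write.
        self.records[account] = (current + delta, seq)

    def apply(self, event):
        seq, owner, kind, amount, counterparty = event
        if kind == "deposit":
            self._apply_delta(owner, seq, amount)
        elif kind == "transfer" and owner != counterparty:
            self._apply_delta(owner, seq, -amount)
            self._apply_delta(counterparty, seq, amount)

    def balance(self, account):
        return self.records.get(account, (0, 0))[0]
```

Because the check is per account, a crash after updating account A but before updating account Z is repaired by replaying the same event: A ignores it and Z applies it.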

Consistency once again

In many of the approaches above, when the action has been applied, the balances for account A and account Z do not update at precisely the same time. One is soon after the other.

From a bank customer’s perspective, this is fine.

The owners of accounts A and Z may well know each other, since one is sending money to the other. Account A’s owner would see the reduced balance straight away. If account Z’s owner doesn’t see the funds arrive immediately but does see them arrive soon afterwards, that’s ok. A delay of more than a few minutes, or even a few seconds, would be poor by modern standards.

Reconciliation is the safety net. One or more processes apply the actions to the accounts. If they fail, we can rerun them. But it’s only software, so there should always be distrust. A diligent bank will have cross-checks running anyway to ensure everything has been applied at least by the end of the working day.
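
Such a cross-check could be as simple as recomputing every balance from the journal and comparing it with what is stored. This is a sketch under assumptions: the tuple layout `(sequence, account, kind, amount, counterparty)` and the names `expected_balances` and `reconcile` are hypothetical.

```python
def expected_balances(journal):
    """Recompute every balance from scratch from the journal."""
    totals = {}
    for _seq, owner, kind, amount, counterparty in journal:
        if kind == "deposit":
            totals[owner] = totals.get(owner, 0) + amount
        elif kind == "transfer":
            totals[owner] = totals.get(owner, 0) - amount
            totals[counterparty] = totals.get(counterparty, 0) + amount
    return totals


def reconcile(journal, stored):
    """Return {account: (stored, expected)} for every account that disagrees."""
    expected = expected_balances(journal)
    return {
        account: (stored.get(account, 0), expected.get(account, 0))
        for account in sorted(set(stored) | set(expected))
        if stored.get(account, 0) != expected.get(account, 0)
    }
```

An empty result means the stored balances agree with the journal; anything else lists the accounts whose actions need to be rerun or investigated.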

Summary

Transactions slow down processing. Transactions do not ensure correctness in the broader sense. A transaction itself runs correctly, but it does not guarantee that others never see inconsistent data.

If you choose a single-entry approach, you don’t need transactions. If you choose a double-entry approach, you might or might not need transactions. It depends on the consistency model you can agree upon with the business user.
