Case Study

Hazelcast IMDG Integrates with Apache Cassandra to Deliver Fast, Scalable IoT Data Platform for Future Grid

About Future Grid

Factoring is a financing model whereby a finance company purchases another company’s outstanding invoices to provide funds quickly without having to collect the debt. Acquiring loans can be onerous and take a long time to set up. Without factoring, small businesses wouldn’t be able to survive.

The global factoring market was valued at approximately $3,235.88 billion (USD) as of 2020, with projected growth to $5,384.00 billion (USD) by 2027.

What’s this got to do with Hazelcast IMDG?

Future Grid has been working with several Australian utility companies to automate the processing of sensor and smart meter data that crosses energy networks. To put this in context, its customers collect approximately 3 billion data points every day. In terms of daily post-processing, this equates to 20 billion records, as each record has multiple individual data points. It is a massive scaling challenge.
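To give a feel for the scale, the daily totals above can be converted into average per-second rates. This is a back-of-the-envelope sketch only: it assumes load is spread evenly across the day, which real meter traffic is unlikely to be, and the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(daily_count: float) -> float:
    """Average events per second for a given daily total (assumes even load)."""
    return daily_count / SECONDS_PER_DAY


data_points_per_day = 3_000_000_000   # figure quoted in the case study
records_per_day = 20_000_000_000      # figure quoted in the case study

print(f"{per_second(data_points_per_day):,.0f} data points per second")  # 34,722
print(f"{per_second(records_per_day):,.0f} records per second")          # 231,481
```

Peak rates would be higher than these averages, so a platform sized for this workload needs headroom well beyond them.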

To make the most of this information, customers need a real-time data aggregation solution which enables them to make complex real-time decisions. When Future Grid first tried to solve this problem it used traditional relational databases. However, it soon became apparent that inherent problems existed. Traditional databases weren’t designed to process huge volumes of data in real-time, the main issue being that they can’t execute algorithms against incoming data fast enough. Therefore, Future Grid decided to build its own solution using Hazelcast IMDG.

The Future Grid Solution

IMDGs are designed to provide high availability and scalability by distributing data across multiple machines. The rise of cloud, social media, and IoT has created demand for applications that need to be extremely fast and capable of processing millions of transactions per second. The Hazelcast IMDG computing platform helps companies manage their data and distribute processing using in-memory storage and parallel execution for breakthrough application speed and scale. It is easy to work with and brings a highly resilient and elastic memory resource to applications. It is also one of the most widely adopted open source solutions. Crucially for Future Grid, Hazelcast IMDG enables organizations to free their data from slow, expensive, and hard to scale relational databases. With Hazelcast IMDG, the database remains the system of record, but bottlenecks disappear.
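The idea of distributing data across multiple machines can be illustrated with a minimal hash-partitioning sketch. It is not Hazelcast's implementation: the hash function, the round-robin assignment of partitions to members, and the member names are all assumptions made for illustration. The partition count of 271 mirrors Hazelcast's default.

```python
import hashlib
from collections import Counter

PARTITION_COUNT = 271  # mirrors Hazelcast's default partition count


def partition_id(key: str, partition_count: int = PARTITION_COUNT) -> int:
    """Map a key to a partition using a stable hash (illustrative choice: SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def owner(key: str, members: list) -> str:
    """Pick the member that owns the key's partition (simplified round-robin)."""
    return members[partition_id(key) % len(members)]


# Hypothetical three-node cluster and meter keys.
members = ["node-a", "node-b", "node-c"]
spread = Counter(owner(f"meter-{i}", members) for i in range(10_000))
print(spread)  # roughly even counts per node
```

Because every key maps to exactly one partition, and every partition to one owner, each node holds and processes only its own share of the data. This is what lets capacity grow by adding machines.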

Many companies that historically would not have considered using in-memory technology because it was cost prohibitive are now changing their core systems’ architectures to take advantage of the low latency transaction processing that in-memory technology offers. In part, this is due to the price of RAM dropping significantly – it has become economically feasible to load the entire operational dataset into memory with performance and speed improvements of over 1000x. Future Grid took the decision to build its platform from the ground up incorporating Hazelcast IMDG.

Chris Law, Co-Founder and Managing Director at Future Grid, explains: “Hazelcast is feature-rich and has a number of key capabilities which were particularly relevant to what we needed. Its ability to offer grid computing and shared nothing architecture were of paramount importance. Alongside that, it’s quick and based on an open source license model. In a pilot it delivered outcomes in ten minutes which were taking two hours with the incumbent technology. This blew the customer away.”

The key customer challenges Future Grid was looking to overcome were:

  • Extreme data volumes and speed
  • Reliability and resilience
  • License costs at scale
  • Real-time time-series

The solution developed by Future Grid combines Hazelcast IMDG with Apache Cassandra’s persistent data store capabilities.

Integrating Apache Cassandra and Hazelcast IMDG into the Future Grid Platform

Law continues: “We implemented Hazelcast IMDG at the core of our product’s in-memory capability, while also integrating it with a range of purpose-built technologies to deliver the platform our customers required. For example, Hazelcast IMDG is integrated with Apache Cassandra, which provides internal data storage for reference data while maintaining a distributed grid architecture. We found integrating Hazelcast with Cassandra was a very straightforward process.”

Cassandra is a distributed database for managing large amounts of structured data across many commodity servers, while providing highly available service and no single point of failure. It offers capabilities that relational databases and other NoSQL databases cannot provide such as: continuous availability, linear scale performance, operational simplicity and easy data distribution across multiple data centers and cloud availability zones.

For Future Grid, Cassandra’s persistence capabilities were pivotal. In the context of storing data in a computer system, persistence means that data survives after the process that created it has ended. Future Grid recognized that in-memory approaches can achieve blazing speed, but they can be limited to a relatively small data set. Therefore, Future Grid combined the strengths of the two open source solutions for the energy use case.

The fundamental limitation of Cassandra is that it is a disk-based database, not an in-memory one. This means that read performance is always capped by disk I/O, which ultimately restricts application performance. Work that an in-memory system completes in a single minute can take far longer on a disk-based system. Integrating Hazelcast IMDG with Cassandra keeps the full data set durable on disk while serving frequently used data from memory. Importantly, the combined solution maintains the high availability and horizontal scalability of Cassandra, while Hazelcast IMDG delivers performance that is 1000x faster than disk-based approaches.
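The division of labor described here, memory for speed and disk for persistence, is commonly implemented as a read-through, write-through cache. The sketch below illustrates that general pattern and is not Future Grid's code: a plain dictionary stands in for a Cassandra table, and the class, method, and key names are hypothetical. Hazelcast exposes a comparable hook through its MapStore interface, though the case study does not say which integration mechanism was used.

```python
from typing import Callable, Dict, Optional


class ReadThroughCache:
    """In-memory map backed by a slower persistent store (illustrative only)."""

    def __init__(self,
                 load: Callable[[str], Optional[str]],
                 store: Callable[[str, str], None]) -> None:
        self._memory: Dict[str, str] = {}
        self._load = load      # reads one key from the persistent store
        self._store = store    # writes one key to the persistent store
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        if key in self._memory:
            self.hits += 1                 # served from memory, no disk I/O
            return self._memory[key]
        self.misses += 1
        value = self._load(key)            # read-through: fall back to the store
        if value is not None:
            self._memory[key] = value      # keep it in memory for next time
        return value

    def put(self, key: str, value: str) -> None:
        self._store(key, value)            # write-through: persist first
        self._memory[key] = value


# A dict stands in for a Cassandra table of reference data (hypothetical keys).
disk_table: Dict[str, str] = {"meter-42": "substation-7"}
cache = ReadThroughCache(load=disk_table.get, store=disk_table.__setitem__)
```

Only the first read of a key touches the backing store. Later reads are served from memory, while every write still reaches the store, so it remains the system of record.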

Contextual Awareness

Future Grid provides context to real-time time series (a series of data points indexed, listed or graphed in time order). Through customer feedback, it became apparent that incumbent technology could not produce visuals fast enough because there was too much data to process. By producing time series graphs via its platform, Future Grid has solved many of the visualization challenges that the energy sector was facing. Engineering teams are now able to drill down into operations and act on data in a timely manner, a feature which has been immensely popular with customers.

Who is using the platform?

Today three utilities in Australia are using Future Grid. Prior to using the platform, the companies considered relational database vendors to perform advanced grid analytics.

One in particular developed several use cases based on both half-hourly interval data and five-minute power quality data from its fleet of smart meters. With the relational database, even simple derivations and substation aggregation proved difficult to deliver due to the volume of data. Investigations revealed that the relational database architecture was difficult to tune and couldn’t meet these simple use cases. Meeting the initial set of use cases would have required an additional $10MM in hardware ($5MM in production and $5MM for disaster recovery).

In a three-month pilot for a utility company, Future Grid was 1200% faster, cutting the time to produce insights from 2 hours to 10 minutes on significantly less hardware. In production, this algorithm ran on $40,000 of hardware, reducing infrastructure costs alone by $10MM.
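The pilot figures can be related with simple arithmetic. The sketch below assumes the 1200% figure is the ratio of the two run times (2 hours against 10 minutes) expressed as a percentage. The source does not spell this out, so treat it as one plausible reading.

```python
baseline_minutes = 2 * 60   # incumbent technology: 2 hours
pilot_minutes = 10          # Future Grid pilot: 10 minutes

speedup = baseline_minutes / pilot_minutes                       # ratio of run times
speed_percent = speedup * 100                                    # ratio as a percentage
time_saved_percent = (1 - pilot_minutes / baseline_minutes) * 100

print(f"{speedup:.0f}x speedup ({speed_percent:.0f}% of baseline speed)")  # 12x, 1200%
print(f"{time_saved_percent:.1f}% less elapsed time")                      # 91.7%
```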

For the three utility companies, use cases include:

  • Power quality, interval and event derivations: cleaning and de-duplicating five-minute power quality data and producing a daily per-device “rollup” that includes pre-calculations to make further analysis faster and more accurate.
  • Loss of Neutral Detection: using machine learning and fast data processing to monitor and predict safety issues, reducing shock incidents significantly.
  • Phase-based substation aggregation: transformer modeling using aggregate meter interval data to provide better visibility of per-phase substation usage. Used for long-term asset planning, phase balancing and alerting when the designed rating is exceeded.
  • Customer Phase Cross-referencing: using machine learning to check the correctness of meter-to-substation mappings, including a responsive, real-time visualization solution.
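The first use case above, de-duplication followed by a daily per-device rollup, can be sketched in a few lines. This is an illustration under stated assumptions, not Future Grid's algorithm: the record fields (meter_id, timestamp, voltage), the "last reading wins" rule, and the chosen statistics are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, NamedTuple


class Reading(NamedTuple):
    meter_id: str        # hypothetical field names
    timestamp: datetime
    voltage: float


def deduplicate(readings: Iterable[Reading]) -> List[Reading]:
    """Keep the last reading seen for each (meter, timestamp) pair."""
    unique: Dict[tuple, Reading] = {}
    for reading in readings:
        unique[(reading.meter_id, reading.timestamp)] = reading
    return list(unique.values())


def daily_rollup(readings: Iterable[Reading]) -> Dict[tuple, dict]:
    """Pre-calculate count, min, max and mean voltage per meter per day."""
    groups: Dict[tuple, List[float]] = defaultdict(list)
    for reading in deduplicate(readings):
        groups[(reading.meter_id, reading.timestamp.date())].append(reading.voltage)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for key, values in groups.items()
    }


sample = [
    Reading("meter-1", datetime(2020, 1, 1, 0, 0), 230.0),
    Reading("meter-1", datetime(2020, 1, 1, 0, 0), 230.0),  # duplicate delivery
    Reading("meter-1", datetime(2020, 1, 1, 0, 5), 240.0),
]
print(daily_rollup(sample))
```

In a data grid, a rollup like this would typically run on each node against the meters that node owns, so only the small per-day summaries need to move across the network.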

For electricity networks, decisions need to happen in real time. Any delay is a potential safety or customer supply issue. Future Grid’s processing technology has increased data transaction rates significantly using minimal hardware.

Final Word

The energy industry is undergoing significant changes. Not only is IoT creating new, connected environments, but renewable technologies are changing grid management. Law describes the result as the “Internet of Energy”, where sensors connect and control energy grids, balancing traditional and off-grid supply for consumers.

Law concluded: “Using Hazelcast IMDG has enabled our customers to realize the dream of real-time data without the significant cost of traditional relational database models. Out of the box speed and resilience have helped our customers deliver operationally critical production systems.”