By Lucas Beeler

Principal Architect

Lucas Beeler is a Principal Architect at Hazelcast, where he wears a wide variety of hats. In addition to helping Hazelcast's most demanding customers architect, design, and operationalize enterprise software systems based around the Hazelcast Platform, he has responsibilities in reference solution development and outbound product management. Lucas holds a B.S.E. in computer science from the University of Michigan.

Jun 16, 2020

Distributed Systems Covering Edge-to-Cloud (Part 2)

Note: This post is part 2 of 2 on edge-to-cloud. You can read part 1 here.

Inside the Gateway: Edge-to-Cloud Stage 1

The edge-to-cloud pipeline begins inside the gateway, where two key tasks have to be performed:

Data from sensors and devices must be captured in its raw form
Sensor and device data must be aggregated and/or canonicalized to reduce its payload size and to make it conform to standardized, semantically meaningful formats

Inside the Gateway: Data Capture

During raw data capture, the first step of the first stage of an edge-to-cloud architecture, the Hazelcast In-Memory Computing Platform already shows its usefulness. First, Hazelcast is small: the entire platform is contained in a single JAR file of about 15 MB, which makes it well suited to deployment on edge gateway devices.

Second, Hazelcast IMDG, the in-memory data grid component of the Hazelcast platform, is well suited to capturing the heterogeneous data that edge devices generate. Unlike a traditional database, Hazelcast IMDG is schema-free and operates as a key-value store with advanced capabilities for indexing and querying. Because Hazelcast imposes no schema on its data, readings from devices made by different manufacturers over several decades can be mixed into a unified store. Data can be keyed on any unique value. Keys can be intrinsic to the data, such as a sample UUID extracted from the payload through text processing, or extrinsic to it, such as a timestamp recording when the data was captured.
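To make the two keying options concrete, here is a minimal sketch in plain Java. A `ConcurrentHashMap` stands in for a Hazelcast IMDG map so that the example runs without a cluster. The class `RawCapture`, its method names, and the `id=<uuid>` payload convention are assumptions made for illustration; none of them come from the Hazelcast API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical capture helper; a ConcurrentHashMap stands in for a distributed map. */
final class RawCapture {
    // Assumed payload convention: some devices embed "id=<uuid>" in their output.
    private static final Pattern ID = Pattern.compile(
            "id=([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})");

    private final Map<String, String> store = new ConcurrentHashMap<>();

    /** Uses an intrinsic key if the payload carries a UUID, otherwise an extrinsic timestamp key. */
    static String keyFor(String payload, long capturedAtMillis) {
        Matcher m = ID.matcher(payload);
        return m.find() ? "uuid:" + m.group(1) : "ts:" + capturedAtMillis;
    }

    /** Stores the payload unchanged (no schema is imposed) and returns the key used. */
    String capture(String payload, long capturedAtMillis) {
        String key = keyFor(payload, capturedAtMillis);
        store.put(key, payload);
        return key;
    }

    String get(String key) {
        return store.get(key);
    }

    int size() {
        return store.size();
    }
}
```

Payloads in entirely different formats share one store here because the value is kept as an opaque string. A timestamp on its own can collide when two samples arrive in the same millisecond, so a real deployment would likely combine it with a device identifier.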

What’s more, IMDGs, being pure in-memory solutions, don’t require flash or SSD access as part of their data write operations. This means that even relatively constrained devices like IoT gateways can cope with high data ingest rates with very low latencies. For initial data capture from edge devices, it’s hard to beat an IMDG.

Inside the Gateway: Aggregation and Canonicalization

But data captured in its raw form, as output from sensors and devices, isn’t terribly useful. First, the data is in whatever format the device produced, which is probably not semantically meaningful to you from a business perspective. Second, there tends to be a lot of it: a sensor that samples at 30 Hz produces 30 data samples every second, even if almost all of these samples show very little change over small time windows.

What’s needed are two further processes. First, we need to aggregate the data, such as by transforming a series of fine-grained samples into coarser-grained (and more manageable) averages. Second, we need to convert the data from the format of the raw sensor or device into a format that is meaningful and useful. This second step goes by various names, such as standardization, normalization, or canonicalization, but it’s essentially an ETL process on a continuous stream of data.
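The aggregation step can be sketched as a fixed-size (tumbling) window average: at 30 Hz, a window of 30 samples turns each second of readings into a single value. The helper below is plain Java written for illustration, with the hypothetical name `WindowAverager`; it does not use the Hazelcast Jet API, which would apply the same idea to a continuous stream rather than an array.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: averages fixed-size, non-overlapping windows of samples. */
final class WindowAverager {
    /**
     * Collapses consecutive groups of windowSize samples into their mean.
     * A trailing partial window is averaged over the samples it contains.
     */
    static List<Double> average(double[] samples, int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        List<Double> out = new ArrayList<>();
        for (int start = 0; start < samples.length; start += windowSize) {
            int end = Math.min(start + windowSize, samples.length);
            double sum = 0.0;
            for (int i = start; i < end; i++) {
                sum += samples[i];
            }
            out.add(sum / (end - start));
        }
        return out;
    }
}
```

Calling `WindowAverager.average(samples, 30)` on two seconds of 30 Hz data returns two values, a thirtyfold reduction in payload before anything leaves the gateway.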

When we think of ETL tools, we tend to think of big, sophisticated, expensive systems like Informatica PowerCenter that are designed to run in corporate data centers. But at the edge, you need something API-driven and code-oriented rather than the GUI-focused ETL tools that you’re accustomed to. The Hazelcast Platform offers a lightweight alternative: packaged inside the same 15MB platform JAR file as Hazelcast IMDG is Hazelcast Jet, a third-generation stream processing tool that runs blazingly fast even in a resource-constrained environment like an edge gateway.

By harnessing Hazelcast Jet, data stored in the underlying Hazelcast IMDG storage layer can be continuously transformed from key-value maps of raw device data into key-value maps of meaningful, canonicalized data.
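As a rough illustration of that raw-to-canonical transformation, the sketch below converts a map of vendor-specific temperature strings into a map of Celsius values under the same keys. The raw formats (`"21.5C"`, `"212F"`) and the class `TemperatureCanonicalizer` are assumptions for the example. It is a one-shot function in plain Java, whereas a Jet pipeline would run the equivalent mapping continuously between two IMDG maps.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical canonicalizer: raw vendor-specific strings in, degrees Celsius out. */
final class TemperatureCanonicalizer {
    /** Parses readings such as "21.5C" or "212F" (an assumed raw format) into Celsius. */
    static double toCelsius(String raw) {
        String s = raw.trim().toUpperCase(Locale.ROOT);
        if (s.length() < 2) {
            throw new IllegalArgumentException("Unrecognized reading: " + raw);
        }
        double value = Double.parseDouble(s.substring(0, s.length() - 1).trim());
        char unit = s.charAt(s.length() - 1);
        switch (unit) {
            case 'C':
                return value;
            case 'F':
                return (value - 32.0) * 5.0 / 9.0;
            default:
                throw new IllegalArgumentException("Unknown unit in: " + raw);
        }
    }

    /** Builds a canonical map from a raw map, keeping the keys unchanged. */
    static Map<String, Double> canonicalize(Map<String, String> rawReadings) {
        Map<String, Double> canonical = new TreeMap<>();
        for (Map.Entry<String, String> e : rawReadings.entrySet()) {
            canonical.put(e.getKey(), toCelsius(e.getValue()));
        }
        return canonical;
    }
}
```

Keeping the keys unchanged means a canonical entry can always be traced back to the raw sample it came from.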

From Edge-to-Cloud: Data Transport

Once device data has been canonicalized into a format that will make sense to business decision makers and data scientists, it’s time to move it from the edge into the cloud or data center. Thankfully, with the Hazelcast Platform, this usually means a simple flick of a switch. Every Hazelcast Platform instance comes with the ability to replicate data in an eventually consistent, asynchronous manner to other, geographically distinct Hazelcast Platform instances, even over slow or unreliable WAN links. This feature, called Hazelcast WAN Replication, comes with myriad configuration options, but in many cases, it’s sufficient simply to turn it on and let the default settings do the rest.
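For readers unfamiliar with the term, "eventually consistent, asynchronous" replication can be modeled in a few lines: local writes complete immediately, updates wait in a queue, and the remote copy catches up whenever the link allows. The class `EdgeReplicator` below is a conceptual model only. It is single-threaded, keeps both maps in one process, and says nothing about how Hazelcast WAN Replication is implemented or configured.

```java
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical model of asynchronous, eventually consistent replication. */
final class EdgeReplicator {
    private final Map<String, Double> edge = new HashMap<>();
    private final Map<String, Double> cloud = new HashMap<>();
    // Updates wait here until the link is available; local writes never block on it.
    private final Queue<Map.Entry<String, Double>> pending = new ArrayDeque<>();

    /** Writes locally and queues the update for later shipment. */
    void put(String key, double value) {
        edge.put(key, value);
        pending.add(new AbstractMap.SimpleImmutableEntry<>(key, value));
    }

    /** Ships queued updates in order; call when the WAN link is up. Returns the count sent. */
    int flush() {
        int sent = 0;
        Map.Entry<String, Double> update;
        while ((update = pending.poll()) != null) {
            cloud.put(update.getKey(), update.getValue());
            sent++;
        }
        return sent;
    }

    Double edgeValue(String key) {
        return edge.get(key);
    }

    Double cloudValue(String key) {
        return cloud.get(key);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Between a `put` and the next `flush`, the edge and cloud copies disagree; after the flush they match. That window of disagreement is what makes the model eventually rather than strongly consistent, and it is also why a slow or flaky link does not stall writes at the edge.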

Hazelcast WAN Replication seamlessly moves data updates from the edge back to the cloud. But you might be asking yourself: can I really use the same data storage technology on beefy servers in the cloud that I use out on resource-constrained edge devices? The answer is an emphatic yes.

Just ask large Hazelcast customers like JPMorgan Chase and UBS. These institutions run Hazelcast clusters that house terabytes of data. What’s more, the Hazelcast platform is built for easy deployment on cloud-native PaaS technologies like OpenShift and Kubernetes.

In the Cloud (or Data Center): Making Sense of Information

Once aggregated and canonicalized device data has made its way from the edge back to the cloud, the Hazelcast Platform becomes an incredible tool for unlocking the value of that data. Since the returned edge data will arrive in a Hazelcast IMDG instance, the Hazelcast IMDG Distributed Query API can be used immediately to begin to make the data available in paged result sets to other enterprise systems. It’s easy to create graphically rich dashboards to visualize and display edge data, as seen in the Hazelcast Edge-to-Cloud Connected Vehicles demo.
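The idea of a paged result set can be sketched with Java streams: filter the entries with a predicate, fix a sort order, then skip and limit to select one page. The class `PagedQuery` and its signature are hypothetical and operate on an ordinary `Map`; the Distributed Query API has its own predicate and paging types, which are not used here.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical paged query over an in-memory map of canonical readings. */
final class PagedQuery {
    /**
     * Returns one page of keys whose values match the filter, sorted by key.
     * Pages are zero-indexed; a page past the end of the results is empty.
     */
    static List<String> page(Map<String, Double> data, Predicate<Double> filter,
                             int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        return data.entrySet().stream()
                .filter(e -> filter.test(e.getValue()))
                .map(Map.Entry::getKey)
                .sorted()
                .skip((long) pageIndex * pageSize)
                .limit(pageSize)
                .collect(Collectors.toList());
    }
}
```

A dashboard could call `PagedQuery.page(readings, v -> v > 90.0, 0, 20)` to fetch the first twenty keys with readings above a threshold. Sorting before paging matters: without a stable order, consecutive pages could overlap or skip entries.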

If you’ve developed ML models to analyze device performance or to predict device failures, manufacturing yield percentages, etc., from your edge data, you can use the Hazelcast Jet Inference Runner to execute those models as part of a Hazelcast Jet pipeline—all within the Hazelcast Platform.

And you needn’t worry about having to do something poorly understood that no one else has done before. Using the Hazelcast Platform to unlock device and sensor data that has been transmitted back to the cloud or data center is a well-trodden path. Airbus, for example, uses Hazelcast to enable access to space science research data collected from sensors on sounding rockets.

Better Together: Hazelcast Industry Partnerships

Hazelcast’s evolving partnerships with industry leaders like Intel and IBM provide further peace of mind that edge-to-cloud transformation projects built atop the Hazelcast Platform will succeed. Hazelcast was a featured partner during the launch of the IBM Edge Application Manager and forms a key part of the IBM Edge Partner Ecosystem. For the cloud and data center components of edge-to-cloud solutions, Hazelcast’s Project Veyron partnership with Intel aims to enable higher in-memory storage densities and greater compute performance, including the execution of AI/ML workloads, on the latest generation of Intel hardware.
