
Ultra-Fast Stream Processing Framework

Hazelcast Jet

Lightweight, embeddable, powerful.


Make faster and smarter business decisions by leveraging event streaming data in real time.

The Hazelcast Jet Stream Processing Framework.

Hazelcast Jet is an application-embeddable stream processing engine designed for fast processing of big data sets. The Hazelcast Jet architecture is built for high performance and low latency, based on a parallel, streaming core engine that enables data-intensive applications to operate at near real-time speeds. Jet is used to develop stream or batch processing applications, with a directed acyclic graph (DAG) model at its core.

Jet supports Apache Beam, a portable API layer for building sophisticated parallel processing pipelines. Use Jet as an execution engine (“runner”) to get the highest-throughput and lowest-latency Beam pipelines. Read the Beam documentation for Jet, and the Hazelcast blog on running Jet with Beam.


The Use of Directed Acyclic Graphs (DAGs) in Jet

Why are DAGs relevant to Jet processing? Jet uses DAGs to represent the execution plan. When running jobs, Jet identifies the available resources (i.e., CPU cores) and deploys tasks across the cluster to achieve the highest levels of performance. Jet determines the optimal configuration of tasks and deploys them for you, freeing you from the otherwise complicated effort of deciding where to run tasks and at what levels.

To accomplish this, Jet models its stream processing pipelines as DAGs, which lay out the tasks of a job and how they interact. It then optimizes the DAG to exploit parallelism, improving the performance and efficiency of jobs. In the DAG representation, vertices represent the computational steps, and edges represent the data flows. Each vertex receives data from its inbound edges, performs a step in the computation, then emits data to its outbound edges. The breakdown of a job into separate vertices is possible thanks to data partitioning, so that subtasks in the overall job can be processed in parallel, independently of each other.
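The vertex-and-edge model described above can be sketched in plain Java. This is a toy illustration, not Jet's internal classes or its public API: the class name DagSketch, its methods, and the vertex names in main are all hypothetical. It records vertices and directed edges, then uses Kahn's algorithm to produce an order in which every vertex comes after the vertices that feed it, and rejects graphs that contain a cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy DAG model, not Jet's own classes.
public class DagSketch {
    // Vertex name -> names of the vertices its outbound edges lead to.
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    // Registers a computational step if it is not already known.
    public DagSketch vertex(String name) {
        edges.putIfAbsent(name, new ArrayList<>());
        return this;
    }

    // Registers a data flow from one step to another.
    public DagSketch edge(String from, String to) {
        vertex(from);
        vertex(to);
        edges.get(from).add(to);
        return this;
    }

    // Kahn's algorithm: repeatedly emit a vertex with no unprocessed inbound edges.
    public List<String> topologicalOrder() {
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String v : edges.keySet()) {
            inDegree.put(v, 0);
        }
        for (List<String> targets : edges.values()) {
            for (String t : targets) {
                inDegree.merge(t, 1, Integer::sum);
            }
        }

        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String v = ready.remove();
            order.add(v);
            for (String t : edges.get(v)) {
                // A downstream vertex becomes ready once all its inputs are emitted.
                if (inDegree.merge(t, -1, Integer::sum) == 0) {
                    ready.add(t);
                }
            }
        }
        if (order.size() != edges.size()) {
            throw new IllegalStateException("graph contains a cycle, so it is not a DAG");
        }
        return order;
    }

    public static void main(String[] args) {
        DagSketch dag = new DagSketch()
                .edge("source", "tokenize")
                .edge("tokenize", "count")
                .edge("count", "sink");
        System.out.println(dag.topologicalOrder()); // prints [source, tokenize, count, sink]
    }
}
```

Vertices that become ready at the same time have no data dependency on each other, which is what allows an engine to run them in parallel.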

The key point here is that you do not have to worry about DAG planning. Jet does it all for you, so simply write your processing code using the Jet Pipeline API, and the Jet engine will take care of the runtime.

In the diagram below, you can see how tasks can be distributed across cores and nodes to run in parallel. The use of cooperative threads (i.e., application-level threads) lets Jet run those parallel tasks more efficiently, without incurring the overhead of context switching between OS-level threads. Jet coordinates the threads to deliver superior performance with no planning required from the application developer.
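The idea of cooperative threading can be sketched in a few lines of plain Java. This is a simplified illustration under stated assumptions, not Jet's scheduler: the names CooperativeSketch, Tasklet, CountingTasklet, and runAll are hypothetical. Each tasklet does a small, non-blocking piece of work and then returns control, so a single OS thread can interleave many of them without any OS-level context switch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative only: a toy cooperative scheduler, not Jet's execution engine.
public class CooperativeSketch {

    /** A unit of work that performs one short, non-blocking step per call. */
    public interface Tasklet {
        /** @return true once the tasklet has finished all of its work */
        boolean step();
    }

    /** Hypothetical tasklet that counts up to a limit, one increment per step. */
    public static final class CountingTasklet implements Tasklet {
        private final String name;
        private final int limit;
        private final List<String> log;
        private int count;

        public CountingTasklet(String name, int limit, List<String> log) {
            this.name = name;
            this.limit = limit;
            this.log = log;
        }

        @Override
        public boolean step() {
            count++;
            log.add(name + ":" + count);
            return count >= limit;
        }
    }

    /** Runs every tasklet on the calling thread, round-robin, until all are done. */
    public static void runAll(List<Tasklet> tasklets) {
        List<Tasklet> active = new ArrayList<>(tasklets);
        while (!active.isEmpty()) {
            for (Iterator<Tasklet> it = active.iterator(); it.hasNext(); ) {
                // Each call returns quickly, voluntarily yielding to the next tasklet.
                if (it.next().step()) {
                    it.remove();
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        runAll(List.of(new CountingTasklet("a", 2, log), new CountingTasklet("b", 3, log)));
        System.out.println(log); // prints [a:1, b:1, a:2, b:2, b:3]
    }
}
```

The trade-off in this style is that a step must never block: a tasklet that waits on I/O inside step() would stall every other tasklet sharing the same thread.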

Below is a visual representation of a DAG for a Jet job, as displayed in Management Center. You can see how a job is broken down into distinct tasks, and the processed data is delivered to separate sinks (i.e., destination repositories).

Architecture & Features

Hazelcast Jet Stream Processing Framework

Industry-Leading Performance

Built on a distributed computing platform, Hazelcast Jet offers a parallel, low-latency core engine that lets data-intensive applications operate at real-time processing speeds, while its cooperative multithreading architecture enables thousands of jobs to run simultaneously.

Flexible Integration

Hazelcast Jet operates as shared application infrastructure or embedded directly in applications. Its lightweight footprint makes it ideal for microservices and makes data manipulation easy for developers and DevOps. Cloud and container ready.

High Productivity

One solution provides stream, batch, and RPC processing, with a variety of connectors to enable easy integration into data processing pipelines. Scaling, failure handling and recovery are all automated for ease of operational management.

Enterprise-Grade Storage

Embed Hazelcast IMDG as operational storage for enterprise-grade reliability and resilience, with high-performance integration that eliminates network transit latency and integration issues between processing and storage.

Oil and Gas

SigmaStream is deploying Hazelcast as a processing backbone to monitor and analyze high-frequency (50,000+ events per second) data streaming in from oil well sensors on drilling rigs operating thousands of feet beneath the North Sea (an extreme edge use case). This is an industry-leading application of streaming technology in the context of oil and gas drilling platforms and has both broad and deep applicability across the entire domain. Initial estimates indicate cost savings of 10% or more in the time required to achieve true vertical depth.

Market Data Ingest

Hazelcast Jet uploads a stream of stock market pricing data from a Kafka topic into an IMDG map. Data is analyzed as part of the upload process, calculating the moving averages to detect buy/sell indicators.
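The moving-average calculation mentioned above can be sketched in plain Java as a sliding window over the most recent prices. This is a minimal illustration, not code from the use case itself: the class name MovingAverageSketch, the window size, and the sample prices are assumptions, and the rule that turns averages into buy/sell indicators is not specified in the source, so it is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a simple moving average over the last N prices.
public class MovingAverageSketch {
    private final int windowSize;
    private final Deque<Double> window = new ArrayDeque<>();
    private double sum;

    public MovingAverageSketch(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /** Adds a price and returns the average of the prices currently in the window. */
    public double add(double price) {
        window.addLast(price);
        sum += price;
        if (window.size() > windowSize) {
            // Evict the oldest price so the window never exceeds windowSize entries.
            sum -= window.removeFirst();
        }
        return sum / window.size();
    }

    public static void main(String[] args) {
        MovingAverageSketch average = new MovingAverageSketch(3);
        for (double price : new double[] {10, 11, 12, 13}) {
            System.out.println(average.add(price)); // prints 10.0, 10.5, 11.0, 12.0
        }
    }
}
```

In a streaming job, one such window would typically be kept per stock symbol, so that prices for different symbols are averaged independently.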

Flight Telemetry

Hazelcast Jet reads a stream of telemetry data from ADS-B on all commercial aircraft flying anywhere in the world, typically 5,000 to 6,000 aircraft at any point in time. This data is filtered and aggregated, and then certain features are enriched and displayed in Grafana.

Get started with Hazelcast Jet

The world's most advanced stream processing framework.

Free Hazelcast Online Training Center

Whether you're interested in learning the basics of in-memory systems, or you're looking for advanced, real-world production examples and best practices, we've got you covered.