This short video explains why companies use Hazelcast for business-critical applications based on ultra-fast in-memory and/or stream processing technologies.
Stream processing is a hot topic right now, especially for any organization looking to provide insights faster. But what does it mean for users of Java applications, microservices, and in-memory computing?
In this webinar, we will cover the evolution of stream processing and in-memory computing in relation to big data technologies, and why stream processing is the logical next step for in-memory processing projects.
Deploying Hazelcast-powered applications in a cloud-native way is now even easier with the introduction of Hazelcast Cloud Enterprise, a fully managed service built on the Enterprise edition of Hazelcast IMDG.
Can't attend the live times? You should still register! We'll be sending out the recording after the webinar to all registrants.
Hazelcast provides the key components to build a real-time stream processing application. It is a powerful processing framework for querying data streams on top of an elastic in-memory storage system, where the process may ultimately store its results.
Hazelcast processing tasks, called jobs, are distributed across the cluster to parallelize the computations. You can elastically and horizontally scale the cluster based on your performance and volume requirements.
For real-time data enrichment, Hazelcast integrates tightly with in-memory computing to deliver very high-speed data access. You can store large amounts of data in memory and join that data to the data stream with microsecond latency. Moreover, you can reduce end-to-end latency by using Hazelcast to store temporary data for stateful stream processing tasks.
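The enrichment join described above can be illustrated with a minimal sketch in plain Java. It is not the Hazelcast API: the `Trade` and `EnrichedTrade` types, the `enrich` method, and the `"UNKNOWN"` fallback are hypothetical names chosen for illustration, and an ordinary `Map` stands in for the distributed in-memory store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustration only: a local Map stands in for distributed in-memory storage.
class EnrichmentSketch {
    // Hypothetical stream event.
    static final class Trade {
        final String ticker;
        final int quantity;
        Trade(String ticker, int quantity) {
            this.ticker = ticker;
            this.quantity = quantity;
        }
    }

    // Hypothetical enriched event: the trade plus reference data looked up by key.
    static final class EnrichedTrade {
        final String ticker;
        final int quantity;
        final String companyName;
        EnrichedTrade(String ticker, int quantity, String companyName) {
            this.ticker = ticker;
            this.quantity = quantity;
            this.companyName = companyName;
        }
    }

    // Joins each event to the reference data with a constant-time key lookup.
    static List<EnrichedTrade> enrich(List<Trade> stream, Map<String, String> referenceData) {
        List<EnrichedTrade> out = new ArrayList<>();
        for (Trade t : stream) {
            String name = referenceData.getOrDefault(t.ticker, "UNKNOWN");
            out.add(new EnrichedTrade(t.ticker, t.quantity, name));
        }
        return out;
    }
}
```

Because the reference data is held in memory and addressed by key, each event costs a single lookup rather than a round trip to an external database, which is where the latency benefit comes from.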
Stream processing presents unique challenges that are not relevant to batch processing frameworks. Below are several key challenges and details on how Hazelcast addresses each of these.
Stream processing is fundamentally different from batch or micro-batch processing because both inputs and outputs are continuous. In many cases, streaming computations look at how values change over time. Typically, we look at streaming data in terms of “windows,” each a specific slice of the data stream constrained to a time period. Hazelcast stream processing supports tumbling, sliding, and session windows. For more information on windows, please read the documentation page on time windowing.
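To make the three window types concrete, here is a small plain-Java sketch of how an event timestamp could be mapped to windows. It does not use the Hazelcast API; the class and method names are hypothetical, timestamps are plain `long` values, and the session boundary convention (a gap equal to or larger than the timeout starts a new session) is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: window assignment for a single event timestamp.
class WindowSketch {
    // Tumbling: fixed-size, non-overlapping windows. Returns the start of
    // the one window that contains ts.
    static long tumblingStart(long ts, long size) {
        return ts - Math.floorMod(ts, size);
    }

    // Sliding: windows of length `size` that advance by `slide`. Returns the
    // starts of all windows containing ts, latest first. Starts may be
    // negative for timestamps smaller than `size`.
    static List<Long> slidingStarts(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        long latest = ts - Math.floorMod(ts, slide); // latest window start <= ts
        for (long s = latest; s > ts - size; s -= slide) {
            starts.add(s);
        }
        return starts;
    }

    // Session: groups timestamps (assumed sorted ascending) so that a gap of
    // at least `timeout` between consecutive events opens a new session.
    static List<List<Long>> sessions(List<Long> sortedTs, long timeout) {
        List<List<Long>> result = new ArrayList<>();
        List<Long> current = null;
        long prev = 0L;
        for (long ts : sortedTs) {
            if (current == null || ts - prev >= timeout) {
                current = new ArrayList<>();
                result.add(current);
            }
            current.add(ts);
            prev = ts;
        }
        return result;
    }
}
```

In this sketch an event belongs to exactly one tumbling window, to `size / slide` sliding windows when `size` is a multiple of `slide`, and to one session whose extent depends on the neighboring events rather than on a fixed grid.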
Hazelcast supports the notion of “event time” in which events can have their own timestamp and may arrive out of order. To handle out-of-order events and especially late-arriving events, stream processors must keep calculations (i.e., aggregations) open until all events have arrived. However, stream processors cannot know if all events have arrived, so they need to discard any extremely late events. To define what constitutes an “extremely late” event, Hazelcast sets a “watermark” that marks a time window in which late-arriving events can still be processed in the appropriate aggregation window. Events arriving from beyond the watermarked time window are discarded.
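The watermark idea can be sketched in a few lines of plain Java. This is an illustrative simplification, not Hazelcast's implementation: it assumes the watermark trails the highest event timestamp seen so far by a fixed `allowedLag`, and the class and method names are hypothetical.

```java
// Illustration only: a single-threaded watermark with a fixed allowed lag.
class WatermarkSketch {
    private final long allowedLag;
    private long maxSeen = Long.MIN_VALUE; // no events observed yet

    WatermarkSketch(long allowedLag) {
        this.allowedLag = allowedLag;
    }

    // Current watermark: events with a timestamp below it count as too late.
    long watermark() {
        return maxSeen == Long.MIN_VALUE ? Long.MIN_VALUE : maxSeen - allowedLag;
    }

    // Returns true if the event is accepted, false if it is discarded as late.
    boolean accept(long eventTs) {
        if (eventTs < watermark()) {
            return false; // arrived after its window could already have closed
        }
        if (eventTs > maxSeen) {
            maxSeen = eventTs; // advancing the maximum also advances the watermark
        }
        return true;
    }
}
```

With an allowed lag of 5, an event stamped 16 that arrives after an event stamped 20 is still accepted, while one stamped 14 is dropped. A larger lag tolerates more disorder but delays the point at which a window's result can be considered final.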
Fault tolerance in stream processing systems must deal with preserving data that is not necessarily stored in any permanent medium. This means that stream processors need to know how to handle failures with data-in-motion, or else data can be lost. Hazelcast is fault-tolerant for issues such as network failures, splits, and node failures. When there is a fault, Hazelcast takes the latest state snapshot and automatically restarts, from that snapshot, every job in which the failed member was a participant. With in-memory snapshots saved to distributed in-memory storage, Hazelcast resumes processing where it left off. Distributed in-memory storage is an integral component of the cluster. Multiple replicas of data are stored in a distributed manner across the cluster to increase the cluster’s resiliency.
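The snapshot-and-restart behavior can be illustrated with a single-process simulation in plain Java. It is a sketch under simplifying assumptions, not Hazelcast's mechanism: the source is a replayable list, the job state is a running sum plus a source offset, the snapshot is an in-process object rather than replicated storage, and `failAt` is a hypothetical parameter that injects one crash.

```java
import java.util.List;

// Illustration only: periodic snapshots plus restart from the latest one.
class SnapshotSketch {
    // Saved state: how far the source was read and the result up to there.
    static final class Snapshot {
        final int offset;
        final long sum;
        Snapshot(int offset, long sum) {
            this.offset = offset;
            this.sum = sum;
        }
    }

    // Sums the source, snapshotting every `snapshotInterval` items. A crash
    // is simulated once, just before the item at index `failAt` (-1 = never).
    static long run(List<Integer> source, int snapshotInterval, int failAt) {
        Snapshot latest = new Snapshot(0, 0L);
        boolean alreadyFailed = false;
        while (true) {
            // (Re)start the job from the latest snapshot.
            int offset = latest.offset;
            long sum = latest.sum;
            boolean crashed = false;
            while (offset < source.size()) {
                if (!alreadyFailed && offset == failAt) {
                    alreadyFailed = true;
                    crashed = true; // in-flight offset and sum are lost
                    break;
                }
                sum += source.get(offset);
                offset++;
                if (offset % snapshotInterval == 0) {
                    latest = new Snapshot(offset, sum);
                }
            }
            if (!crashed) {
                return sum;
            }
        }
    }
}
```

Because the offset and the partial result are saved together, the restarted run replays only the items after the snapshot and ends with the same total as an uninterrupted run.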
Event processing systems have to balance tradeoffs in performance and correctness, and some systems may not allow firm processing guarantees, which can make it difficult to program these systems.
Hazelcast allows you to choose a processing guarantee when you start a job. While there is some performance tradeoff with the higher guarantees, Hazelcast still provides superior processing speed while honoring the chosen guarantee. Hazelcast provides exactly-once processing (the slowest but most correct), at-least-once processing, or no guarantee of correctness (the fastest option).
BNP Paribas Bank Polska increases revenue through real-time offers driven by specific customer needs.
Read this e-book from RTInsights to get an overview of stream processing, including the popular use cases and what to plan for.
This paper is intended for software architects and developers who are planning or building systems that use stream processing, fast batch processing, data processing microservices, or distributed java.util.stream.
While quite simple and robust, the batching approach clearly introduces a large latency between gathering the data and being ready to act upon it. The goal of stream processing is to overcome this latency. It processes the live, raw data immediately as it arrives and meets the challenges of incremental processing, scalability, and fault tolerance.
Whether you're interested in learning the basics of in-memory systems, or you're looking for advanced, real-world production examples and best practices, we've got you covered.