Hazelcast Jet provides a powerful programming model that allows you to scale up your processing tasks to utilize the full resources of your hardware. Internally, the programs you write using the Jet APIs will be represented as a Directed Acyclic Graph (DAG). A DAG is a set of processing tasks (nodes) connected by data flows (edges). By representing different steps within a job as individual nodes, Jet can replicate these nodes across processor cores on one or multiple computer systems to scale up the workload and concurrently process events.
Although it’s possible to program directly to the DAG API, in most cases it is simpler, and just as powerful, to program to the easy-to-use Pipeline API. Here is a simple pipeline that implements a word count algorithm. (The word count algorithm is the “Hello, World” of stream processing.)
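The following is a minimal sketch of such a word count pipeline, assuming the Jet 4.x Pipeline API; the IMap names "lines" and "counts" are illustrative placeholders:

```java
import com.hazelcast.jet.Jet;
import com.hazelcast.jet.JetInstance;
import com.hazelcast.jet.Traversers;
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;

public class WordCount {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create();
        p.readFrom(Sources.<Long, String>map("lines"))   // input source: IMap of line number -> text
         .flatMap(e -> Traversers.traverseArray(         // tokenize: split each line into words
                 e.getValue().toLowerCase().split("\\W+")))
         .filter(word -> !word.isEmpty())
         .groupingKey(word -> word)                      // group identical words together
         .aggregate(AggregateOperations.counting())      // accumulate: count occurrences per word
         .writeTo(Sinks.map("counts"));                  // output sink: IMap of word -> count

        JetInstance jet = Jet.newJetInstance();          // start an embedded Jet member
        try {
            jet.newJob(p).join();                        // submit the pipeline and wait for completion
        } finally {
            jet.shutdown();
        }
    }
}
```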
All pipelines have an input source (which we readFrom) and one or more output sinks (which we writeTo); between the read and the write are the pipeline stages, the processing steps we want to apply to items coming from the input source. In this case we have a simple linear pipeline with a single path, but more complex DAGs that fork into multiple paths, join back together, and enrich the incoming data with supplemental data from other sources are common, as sketched below.
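As a hedged illustration of the enrichment case, a stage can look up supplemental data from a Hazelcast IMap as items flow through. The "events" and "profiles" map names below are hypothetical:

```java
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;
import java.util.Map.Entry;

public class EnrichmentSketch {
    public static Pipeline build() {
        Pipeline p = Pipeline.create();
        p.readFrom(Sources.<String, String>map("events"))          // events keyed by userId
         .mapUsingIMap("profiles",                                 // hypothetical IMap of userId -> profile
                 Entry::getKey,                                    // look up by the event's key
                 (Entry<String, String> event, String profile) ->  // profile is null if no match
                         event.getValue() + " [user: " + profile + "]")
         .writeTo(Sinks.logger());
        return p;
    }
}
```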
For execution, the word count pipeline above is transformed into a DAG whose vertices correspond to the source, the tokenizing step, the accumulation step, and the sink.
The Tokenize and Accumulate vertices are replicated by Jet’s runtime engine, so multiple lines of text can be tokenized concurrently and the totals for different words can be accumulated in parallel.
One of the powerful features of Jet is its ability to interoperate with the sources and sinks you are likely already using, such as Apache Kafka, JDBC, JMS, MongoDB, Elasticsearch, Debezium, sockets and files, and of course Hazelcast IMDG. This ability to integrate with your existing software stack makes Jet a powerful ETL engine for bringing data into the systems that need it while filtering and transforming the data along the way.
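For example, a pipeline can consume a Kafka topic and land the results in an IMDG map. This sketch assumes the hazelcast-jet-kafka module; the broker address and the "trades" topic are placeholders:

```java
import com.hazelcast.jet.kafka.KafkaSources;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import java.util.Properties;

public class KafkaToImdg {
    public static Pipeline build() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");   // placeholder broker address
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("auto.offset.reset", "earliest");

        Pipeline p = Pipeline.create();
        p.readFrom(KafkaSources.<String, String>kafka(props, "trades")) // hypothetical topic
         .withoutTimestamps()
         .filter(entry -> entry.getValue() != null)                     // simple filtering step
         .writeTo(Sinks.map("trades"));                                 // land the data in IMDG
        return p;
    }
}
```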
If you have a data source or sink for which there isn’t an existing connector, it’s a straightforward process to create your own using the SourceBuilder and SinkBuilder classes (see here).
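Here is a minimal sketch of a custom batch source built with SourceBuilder, emitting the lines of a file; the source name and the one-line-per-call buffer handling are illustrative:

```java
import com.hazelcast.jet.pipeline.BatchSource;
import com.hazelcast.jet.pipeline.SourceBuilder;
import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CustomSources {
    // A custom batch source that emits the lines of a single file.
    public static BatchSource<String> fileLines(String path) {
        return SourceBuilder
                .batch("file-lines", ctx -> Files.newBufferedReader(Paths.get(path)))
                .<String>fillBufferFn((reader, buf) -> {
                    String line = reader.readLine();
                    if (line != null) {
                        buf.add(line);        // emit one item; Jet calls this function again
                    } else {
                        buf.close();          // signal that the batch source is exhausted
                    }
                })
                .destroyFn(BufferedReader::close) // clean up the reader when the job ends
                .build();
    }
}
```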
Many of the use cases where Jet is deployed today involve analyzing events as they arrive and making business-critical decisions on the results. In some cases the processing is simple and straightforward and can easily be coded as a Java method; other cases, such as fraud detection or stock trade analysis, may require far more complex processing.
These sorts of complex analyses are a natural fit for machine learning. Jet is an ideal platform for operationalizing the machine learning models that have been developed and trained by data scientists. The role of Jet is not in developing the model or doing the initial training, but in allowing the model to be executed in a pipeline that operates with the high throughput and low latency the business requires.
An example of executing a TensorFlow model from Hazelcast Jet can be found here.
Jet pipelines are coded in Java, which is familiar to many developers but is not the preferred language of data scientists. The Jet 4.0 release added the ability to call functions implemented in Python. Jet 4.1 generalized this external-function support to the gRPC calling convention, enabling C++ and other languages as well.
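Below is a hedged sketch of calling into Python from a pipeline stage, assuming the hazelcast-jet-python module; the base directory and the score module are hypothetical, and the handler module is expected to expose a transform_list function:

```java
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.StreamStage;
import com.hazelcast.jet.pipeline.test.TestSources;
import com.hazelcast.jet.python.PythonServiceConfig;
import com.hazelcast.jet.python.PythonTransforms;

public class PythonStageSketch {
    public static Pipeline build() {
        Pipeline p = Pipeline.create();
        StreamStage<String> input = p
                .readFrom(TestSources.itemStream(10))      // test source: 10 events per second
                .withIngestionTimestamps()
                .map(event -> String.valueOf(event.sequence()));

        // score.py (hypothetical) must define the handler function, e.g.:
        //   def transform_list(input_list):
        //       return ['scored:' + item for item in input_list]
        input.apply(PythonTransforms.mapUsingPython(new PythonServiceConfig()
                     .setBaseDir("src/main/python")        // hypothetical directory holding score.py
                     .setHandlerModule("score")))
             .writeTo(Sinks.logger());
        return p;
    }
}
```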
Hazelcast Jet is an open-core software product licensed under the Apache 2 license. The home of the open source core is here, and the project sources can be viewed or forked on GitHub.
Hazelcast Jet Enterprise adds features important to enterprise deployments, including the security suite, support for OpenShift deployments, and the ability to upgrade running jobs with no downtime or data loss.
You can connect with the Jet community through the Google Group, Stack Overflow, and Gitter Chat.