Open Source

In-Memory Computing and Stream Processing

In-Memory Computing

Store your data in RAM, spread and replicate it across a cluster of machines, and perform data-local computation on it. Replication gives you resilience to failures of cluster nodes.

Hazelcast is an open-source, distributed in-memory object store that supports a wide variety of data structures, such as Map, Set, List, MultiMap, RingBuffer and HyperLogLog. It is cloud- and Kubernetes-friendly.


Stream Processing

Build data pipelines that process streams of events from sources such as message queues and database changelogs. The processing state is replicated, so you can scale the computation up and down without losing data.

Hazelcast provides open-source distributed stream and batch processing with embedded in-memory storage and a variety of connectors, such as Kafka, Amazon S3, Hadoop, JMS and JDBC.


Hazelcast Platform 5.3

Unified Real-Time Data Platform

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

// Start an embedded Hazelcast cluster member.
HazelcastInstance hz = Hazelcast.newHazelcastInstance();
// Get the distributed map from the cluster.
IMap<String, String> map = hz.getMap("my-distributed-map");
// Standard put and get.
map.put("key", "value");
map.get("key");
// ConcurrentMap methods for optimistic updating.
map.putIfAbsent("somekey", "somevalue");
map.replace("key", "value", "newvalue");
// Shut down the Hazelcast cluster member.
hz.shutdown();

Distributed Map

Distributed Map is the most widely used data structure in Hazelcast. You can store objects directly from your application and get them back by key or through SQL-like queries. Everything is stored in memory, with replicas spread across the cluster. Adding cluster members expands the space available for data. The example shown here is in Java, but the API is similar in other languages. The Distributed Map also recognises JSON values and lets you query their elements.
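The optimistic-updating calls in the example (putIfAbsent and replace) come from java.util.concurrent.ConcurrentMap, which IMap extends. The sketch below shows the usual retry loop built on them. It runs against a plain ConcurrentHashMap so that it needs no cluster; the class and method names are illustrative assumptions, not part of the Hazelcast API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class OptimisticUpdate {

    // Increments the counter stored under `key` without locking.
    // The loop retries whenever another writer changed the value
    // between our read and our write.
    static long increment(ConcurrentMap<String, Long> map, String key) {
        while (true) {
            Long current = map.get(key);
            if (current == null) {
                // No entry yet: putIfAbsent returns null if we created it.
                if (map.putIfAbsent(key, 1L) == null) {
                    return 1L;
                }
            } else {
                long next = current + 1;
                // Succeeds only if the stored value still equals `current`.
                if (map.replace(key, current, next)) {
                    return next;
                }
            }
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Long> hits = new ConcurrentHashMap<>();
        increment(hits, "page");
        increment(hits, "page");
        System.out.println(hits.get("page")); // prints 2
    }
}
```

Because the method only depends on the ConcurrentMap interface, the same loop should work when the map argument is an IMap<String, Long> obtained from a cluster member.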

Stream Processing

Hazelcast can apply continuous transforms to a stream of data, such as filtering, mapping, aggregation over windows, or joining multiple data sources. It can handle events that arrive out of order and can detect patterns in an event stream. Jet supports many different data sources and sinks, such as Apache Kafka, message brokers, relational databases, Amazon S3, Hadoop and its own built-in distributed map structure.
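To make "aggregation over windows" concrete, here is a minimal plain-Java sketch of a sliding-window average. It is not how Jet computes windows internally; it only shows which events contribute to each result. The Tick class, the window size of 200 ms and the slide step of 100 ms are assumptions chosen to keep the numbers small.

```java
import java.util.List;
import java.util.OptionalDouble;

final class SlidingAverage {

    // A timestamped price, standing in for a trade event.
    static final class Tick {
        final long timestampMs;
        final long price;

        Tick(long timestampMs, long price) {
            this.timestampMs = timestampMs;
            this.price = price;
        }
    }

    // Average price of the ticks whose timestamp lies in
    // [windowEnd - windowSize, windowEnd). Empty if no tick qualifies.
    static OptionalDouble average(List<Tick> ticks, long windowEnd, long windowSize) {
        return ticks.stream()
                .filter(t -> t.timestampMs >= windowEnd - windowSize
                        && t.timestampMs < windowEnd)
                .mapToLong(t -> t.price)
                .average();
    }

    public static void main(String[] args) {
        List<Tick> ticks = List.of(new Tick(50, 10), new Tick(120, 20), new Tick(260, 60));
        long windowSize = 200;
        long slideBy = 100;
        // One result per slide step: windows end at 100, 200, 300 and 400.
        for (long end = slideBy; end <= 400; end += slideBy) {
            System.out.println("[" + (end - windowSize) + ", " + end + ") -> "
                    + average(ticks, end, windowSize));
        }
    }
}
```

Because membership depends only on each event's timestamp, the order in which the events arrive does not change the result, which is the property that makes out-of-order handling possible.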

import com.hazelcast.jet.Util;
import com.hazelcast.jet.aggregate.AggregateOperations;
import com.hazelcast.jet.kafka.KafkaSources;
import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.WindowDefinition;
import com.hazelcast.map.IMap;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Trade, props and STOCKS are assumed to be defined elsewhere.
IMap<String, Double> averagePrices = jet.getMap("current-avg-trade-price");

Pipeline p = Pipeline.create();
// Stream Trade records from the Kafka topic "trades", keeping only the record value.
p.readFrom(KafkaSources.<String, Trade, Trade>kafka(props, ConsumerRecord::value, "trades"))
 .withTimestamps(Trade::getTime, 0L)
 .filter(trade -> STOCKS.contains(trade.getSymbol()))
 .groupingKey(Trade::getSymbol)
 // 10-second sliding window, updated every 100 ms.
 .window(WindowDefinition.sliding(10_000, 100))
 .aggregate(AggregateOperations.averagingLong(Trade::getPrice))
 // Write the results to a distributed map.
 .map(window -> Util.entry(window.getKey(), window.getValue()))
 .writeTo(Sinks.map(averagePrices));

<map name="customers">
    <backup-count>1</backup-count>
    <eviction eviction-policy="NONE"
              max-size-policy="PER_NODE" size="0"/>
    <map-store enabled="true" initial-mode="LAZY">
        <class-name>com.examples.DummyStore</class-name>
        <write-delay-seconds>60</write-delay-seconds>
        <write-batch-size>1000</write-batch-size>
        <write-coalescing>true</write-coalescing>
        <properties>
            <property name="jdbc_url">my.jdbc.com</property>
        </properties>
    </map-store>
</map>

Database Caching

Use Hazelcast IMDG to speed up applications that read from and write to disk-based stores, such as relational databases and NoSQL stores. Hazelcast IMDG supports several cache patterns: Read-Through, Write-Through, Write-Behind and Cache-Aside. With the first three patterns, the application need not know anything about the backing store; it deals only with data structure APIs such as Map. Write-Behind solves the problem of slow data stores, where the application would otherwise wait for an acknowledgment. The example here shows a cluster configuration for a Write-Behind store.
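The effect of Write-Behind with write coalescing can be sketched in plain Java. This is a conceptual illustration only, not Hazelcast's implementation: the WriteBehindBuffer class is hypothetical, the backing store is an ordinary Map, and flush() is called by hand where a real cluster would flush after the configured write delay.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class WriteBehindBuffer {
    private final Map<String, String> cache = new HashMap<>();
    // Pending writes by key: a later put to the same key overwrites
    // the earlier one, so only the latest value reaches the store.
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Stand-in for the slow backing store.
    private final Map<String, String> store;

    WriteBehindBuffer(Map<String, String> store) {
        this.store = store;
    }

    // Returns immediately: the store is not touched on the write path.
    void put(String key, String value) {
        cache.put(key, value);
        pending.put(key, value);
    }

    String get(String key) {
        return cache.get(key);
    }

    // Writes all pending entries to the store in one batch and
    // returns how many entries were written.
    int flush() {
        int written = pending.size();
        store.putAll(pending);
        pending.clear();
        return written;
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        WriteBehindBuffer buffer = new WriteBehindBuffer(database);
        buffer.put("a", "1");
        buffer.put("a", "2");
        buffer.put("b", "1");
        System.out.println(database.isEmpty()); // prints true
        System.out.println(buffer.flush());     // prints 2
        System.out.println(database.get("a"));  // prints 2
    }
}
```

Three puts result in two store writes because the two updates to "a" were coalesced before the flush.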

Distributed Compute

Use Hazelcast to speed up your MapReduce, Spark, or custom Java data processing jobs. Load data sets to a cluster cache and perform compute jobs on top of the cached data. You get significant performance gains by combining an in-memory approach and co-location of jobs and data with parallel execution.

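import com.hazelcast.jet.pipeline.Pipeline;
import com.hazelcast.jet.pipeline.Sinks;
import com.hazelcast.jet.pipeline.Sources;
import com.hazelcast.map.IMap;
import static com.hazelcast.jet.datamodel.Tuple2.tuple2;

IMap<Integer, Product> productMap = jet.getMap(PRODUCTS);
Pipeline p = Pipeline.create();
// Read trades from files, join each with the product map, then write the result back to files.
p.readFrom(Sources.files("trades"))
 // Sources.files emits text lines; Trade.parse is a hypothetical helper that parses one.
 .map(Trade::parse)
 .mapUsingIMap(productMap,
       trade -> trade.productId(),
       (t, product) -> tuple2(t, product.name()))
 .writeTo(Sinks.files("joined"));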

Hazelcast Guides are in!

10-15 min bite-sized tutorials.

Why Hazelcast?

Build Distributed Applications

Hazelcast provides tools for building distributed applications. Use Hazelcast for distributed coordination and in-memory data storage and Hazelcast Jet for building streaming data pipelines. Using Hazelcast allows developers to focus on solving problems rather than data plumbing.

Create a Cluster within Seconds

It’s easy to get started with Hazelcast. The nodes automatically discover each other to form a cluster, both in a cloud environment and on your laptop. This is great for quick testing and simplifies deployment and maintenance. No additional dependencies.

Store Data In-Memory Resiliently

Hazelcast automatically partitions and replicates data in the cluster and tolerates node failures. You can add new nodes to increase storage capacity immediately. You can use it as a cache or to store transactional state, and perform data-local computations or queries. Because all data is stored in memory, you can access it with sub-millisecond latency. Clients are available for Java, Python, .NET, C++ and Go.
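As a rough illustration of partitioning and replication, the sketch below maps a key to one of 271 partitions (Hazelcast's default partition count) and picks an owner and a backup member for it. It is a simplification under stated assumptions: Hazelcast hashes the serialized key rather than calling hashCode(), and the cluster maintains its own partition table instead of the round-robin assignment used here. All names are hypothetical.

```java
final class PartitionSketch {
    static final int PARTITION_COUNT = 271;

    // Same key always lands in the same partition.
    // hashCode() is a stand-in for hashing the serialized key.
    static int partitionId(Object key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // Round-robin owner assignment (an assumption for illustration).
    static int ownerIndex(int partitionId, int memberCount) {
        return partitionId % memberCount;
    }

    // The backup lives on the next member, so with two or more
    // members it never shares a node with the owner.
    static int backupIndex(int partitionId, int memberCount) {
        return (partitionId + 1) % memberCount;
    }

    public static void main(String[] args) {
        int members = 3;
        for (String key : new String[] {"alice", "bob", "carol"}) {
            int partition = partitionId(key);
            System.out.println(key + " -> partition " + partition
                    + ", owner member " + ownerIndex(partition, members)
                    + ", backup member " + backupIndex(partition, members));
        }
    }
}
```

Keeping the backup on a different member than the owner is what lets the data survive the loss of a single node.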

Get Started with In-Memory Computing

Build Fault-Tolerant Data Pipelines

Use Hazelcast to build massively parallel data pipelines. You can process data using a rich library of transforms such as windowing, joins and aggregations. Hazelcast keeps processing data without loss even when a node fails, and as soon as you add another node, it starts sharing the computation load. First-class support for Apache Kafka, Hadoop and many other data sources and sinks.

Get Started with Stream Processing

Easy Distributed Coordination

Hazelcast includes a full implementation of the Raft consensus algorithm, exposed through a simple API for building linearizable distributed systems. Use tools like FencedLock, Semaphore and AtomicReference to simplify coordination between distributed applications.
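The idea behind a fenced lock is that each acquisition yields a monotonically increasing fencing token, which the protected resource can use to reject writes from a holder whose lock has since passed to someone else. The plain-Java sketch below shows only the resource side of that contract. The FencedResource class is hypothetical and the token values are hand-picked; in a real application they would come from the lock.

```java
final class FencedResource {
    private long highestFence = -1;
    private String value;

    // Accepts the write only if the token is at least as new as
    // the newest token seen so far; stale holders are rejected.
    synchronized boolean write(long fence, String newValue) {
        if (fence < highestFence) {
            return false;
        }
        highestFence = fence;
        value = newValue;
        return true;
    }

    synchronized String read() {
        return value;
    }

    public static void main(String[] args) {
        FencedResource resource = new FencedResource();
        System.out.println(resource.write(1, "from first holder"));  // prints true
        System.out.println(resource.write(2, "from second holder")); // prints true
        // The first holder wakes up late and retries with its old token.
        System.out.println(resource.write(1, "stale write"));        // prints false
        System.out.println(resource.read()); // prints from second holder
    }
}
```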

Who is using Hazelcast?

Apache Drill

Apache TomEE

Atlassian

JHipster

MuleSoft

Payara

Sonatype

Vert.x

Pitney Bowes

Reserve Bank of Australia