In-Memory Computing and Stream Processing
Store your data in RAM, spread and replicate it across a cluster of machines, and perform data-local computation on it. Replication gives you resilience to failures of cluster nodes.
Hazelcast is an open-source distributed in-memory object store supporting a wide variety of data structures such as Map, Set, List, MultiMap, RingBuffer and HyperLogLog. It is cloud- and Kubernetes-friendly.
Build data pipelines that process streams of events, such as those from message queues and database changelogs. The processing state is replicated, allowing you to scale the computation up and down without any loss of data.
Hazelcast enables open-source distributed stream and batch processing with embedded in-memory storage and a variety of connectors such as Kafka, Amazon S3, Hadoop, JMS and JDBC.
    // Start an embedded Hazelcast cluster member.
    HazelcastInstance hz = Hazelcast.newHazelcastInstance();

    // Get the distributed map from the cluster.
    IMap<String, String> map = hz.getMap("my-distributed-map");

    // Standard put and get.
    map.put("key", "value");
    map.get("key");

    // ConcurrentMap methods for optimistic updating.
    map.putIfAbsent("somekey", "somevalue");
    map.replace("key", "value", "newvalue");

    // Shut down the Hazelcast cluster member.
    hz.shutdown();
Distributed Map is the most widely used data structure in Hazelcast. You can store objects directly from your application and get them back by key or via SQL-like queries. Everything is stored in memory, with replicas spread across the cluster. Adding cluster members expands the space available for data. The example shown here is in Java, but the API is similar in other languages. The Distributed Map can also recognise JSON values and allows querying on their elements.
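The optimistic updating mentioned in the map example can be sketched as a retry loop over putIfAbsent and replace. This is a minimal illustration, not Hazelcast code: it runs against a JDK ConcurrentHashMap, on the assumption that the same loop applies to a distributed map because IMap exposes the same ConcurrentMap methods. The class name OptimisticUpdateSketch and the helper update are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.UnaryOperator;

class OptimisticUpdateSketch {

    // Applies fn to the current value without locking. If another writer changes
    // the value between the read and the replace, the replace fails and we retry.
    // Returns the value that was finally written.
    static <K, V> V update(ConcurrentMap<K, V> map, K key, V initial, UnaryOperator<V> fn) {
        while (true) {
            V current = map.get(key);
            if (current == null) {
                // No value yet: try to install the initial one.
                if (map.putIfAbsent(key, initial) == null) {
                    return initial;
                }
                continue; // another writer got there first, so re-read
            }
            V next = fn.apply(current);
            // Succeeds only if the stored value is still equal to 'current'.
            if (map.replace(key, current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        // Local stand-in for a distributed map (assumption for this sketch).
        ConcurrentMap<String, Integer> visits = new ConcurrentHashMap<>();
        update(visits, "home", 1, v -> v + 1);
        update(visits, "home", 1, v -> v + 1);
        System.out.println(visits);
    }
}
```

The loop never blocks other writers; the cost is that a heavily contended key may need several attempts.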
Hazelcast can apply continuous transforms to a stream of data, such as filtering, mapping, aggregation over windows or joining multiple data sources. It can deal with events arriving out of order and can be used for detecting patterns in an event stream. Jet supports many different data sources and sinks, such as Apache Kafka, message brokers, relational databases, Amazon S3, Hadoop and its own built-in distributed map structure.
    IMap<String, Double> averagePrices = jet.getMap("current-avg-trade-price");

    Pipeline p = Pipeline.create();
    // Stream trade records from Kafka. KafkaSources.kafka takes the consumer
    // properties first, then a projection of each record, then the topic names.
    // Assumes the record values deserialize to Trade.
    p.readFrom(KafkaSources.kafka(props,
                    (ConsumerRecord<String, Trade> record) -> record.value(), "trades"))
     .withTimestamps(Trade::getTime, 0L)
     .filter(trade -> STOCKS.contains(trade.getSymbol()))
     .groupingKey(Trade::getSymbol)
     // 10-second sliding window, updated every 100 ms.
     .window(WindowDefinition.sliding(10_000, 100))
     .aggregate(AggregateOperations.averagingLong(Trade::getPrice))
     // Write results to a distributed map.
     .map(window -> Util.entry(window.getKey(), window.getValue()))
     .writeTo(Sinks.map(averagePrices));
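To make the sliding-window aggregation concrete, here is a plain-Java sketch of what one window result contains: the average price per symbol over the interval [windowEnd - size, windowEnd). It is an illustration only and does not use the Jet API. It recomputes every window from scratch and ignores out-of-order handling, which a streaming engine would do incrementally. The Trade class and the averages method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SlidingWindowSketch {

    // Hypothetical event type holding only the fields the sketch needs.
    static final class Trade {
        final String symbol;
        final long time;  // event time in milliseconds
        final long price;

        Trade(String symbol, long time, long price) {
            this.symbol = symbol;
            this.time = time;
            this.price = price;
        }
    }

    // Average price per symbol over the window [windowEnd - size, windowEnd).
    static Map<String, Double> averages(List<Trade> trades, long windowEnd, long size) {
        Map<String, long[]> sumAndCount = new TreeMap<>(); // symbol -> {sum, count}
        for (Trade t : trades) {
            if (t.time >= windowEnd - size && t.time < windowEnd) {
                long[] acc = sumAndCount.computeIfAbsent(t.symbol, k -> new long[2]);
                acc[0] += t.price;
                acc[1]++;
            }
        }
        Map<String, Double> result = new TreeMap<>();
        sumAndCount.forEach((symbol, acc) -> result.put(symbol, (double) acc[0] / acc[1]));
        return result;
    }

    public static void main(String[] args) {
        List<Trade> trades = new ArrayList<>();
        trades.add(new Trade("ACME", 10, 100));
        trades.add(new Trade("ACME", 150, 200));
        trades.add(new Trade("ACME", 250, 300));

        long size = 200;  // window length in ms
        long slide = 100; // a new result every 100 ms
        for (long end = slide; end <= 400; end += slide) {
            System.out.println("window ending at " + end + ": " + averages(trades, end, size));
        }
    }
}
```

Because the window is longer than the slide step, each trade contributes to more than one consecutive result; that overlap is what distinguishes a sliding window from a tumbling one.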
Use Hazelcast IMDG to speed up applications that read from and write to disk-based stores, such as relational databases and NoSQL stores. Hazelcast IMDG supports several cache patterns: Read-Through, Write-Through, Write-Behind and Cache-Aside. With the first three patterns, the application need not know anything about the backing store; it deals only with data structure APIs such as Map. Write-Behind addresses slow data stores, where the application would otherwise wait for an acknowledgment of each write. The following example shows cluster configuration for a Write-Behind store.

    <map name="customers">
        <backup-count>1</backup-count>
        <eviction eviction-policy="NONE" max-size-policy="PER_NODE" size="0"/>
        <map-store enabled="true" initial-mode="LAZY">
            <class-name>com.examples.DummyStore</class-name>
            <write-delay-seconds>60</write-delay-seconds>
            <write-batch-size>1000</write-batch-size>
            <write-coalescing>true</write-coalescing>
            <properties>
                <property name="jdbc_url">my.jdbc.com</property>
            </properties>
        </map-store>
    </map>
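The idea behind Write-Behind with coalescing can be sketched in plain Java. This is a single-threaded illustration under simplifying assumptions, not Hazelcast's implementation: the backing store is an in-memory map, and flush is called by hand where a real cluster would flush on a timer after the configured write delay. All names (WriteBehindSketch, flush, storedValue) are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class WriteBehindSketch {
    private final Map<String, String> cache = new HashMap<>();
    // Dirty entries waiting to be persisted. Keyed by map key, so a second
    // write to the same key replaces the first one (coalescing).
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Stand-in for a slow backing store such as a relational database.
    private final Map<String, String> backingStore = new HashMap<>();

    // Returns immediately: the caller never waits for the backing store.
    void put(String key, String value) {
        cache.put(key, value);
        pending.put(key, value);
    }

    String get(String key) {
        return cache.get(key);
    }

    // Persists all pending entries in one batch and returns how many were written.
    int flush() {
        int writes = pending.size();
        backingStore.putAll(pending);
        pending.clear();
        return writes;
    }

    String storedValue(String key) {
        return backingStore.get(key);
    }

    public static void main(String[] args) {
        WriteBehindSketch customers = new WriteBehindSketch();
        customers.put("c1", "Alice");
        customers.put("c1", "Alice Smith"); // coalesced with the previous write
        customers.put("c2", "Bob");
        System.out.println("store writes on flush: " + customers.flush());
        System.out.println("stored c1: " + customers.storedValue("c1"));
    }
}
```

Three puts result in two store writes, because only the latest value per key is persisted. The trade-off is that entries still pending at the time of a failure have not yet reached the backing store.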
Use Hazelcast to speed up your MapReduce, Spark, or custom Java data processing jobs. Load data sets into a cluster cache and run compute jobs on top of the cached data. You get significant performance gains by combining an in-memory approach with co-location of jobs and data and parallel execution.
    IMap<Integer, Product> productMap = jet.getMap(PRODUCTS);

    Pipeline p = Pipeline.create();
    // Read a list of trades, join with the product map, then write back to files.
    p.readFrom(Sources.files("trades"))
     .mapUsingIMap(productMap,
             trade -> trade.productId(),
             (t, product) -> tuple2(t, product.name()))
     .writeTo(Sinks.files("joined"));
Build Distributed Applications
Hazelcast provides tools for building distributed applications. Use Hazelcast for distributed coordination and in-memory data storage and Hazelcast Jet for building streaming data pipelines. Using Hazelcast allows developers to focus on solving problems rather than data plumbing.
Create a Cluster within Seconds
It’s easy to get started with Hazelcast. The nodes automatically discover each other to form a cluster, both in a cloud environment and on your laptop. This is great for quick testing and simplifies deployment and maintenance. No additional dependencies.
Store Data In-Memory Resiliently
Hazelcast automatically partitions and replicates data in the cluster and tolerates node failures. You can add new nodes to increase storage capacity immediately. You can use it as a cache or to store transactional state, and perform data-local computations or queries. Because all data is stored in memory, you can access it with sub-millisecond latency. Clients for Java, Python, .NET, C++ and Go are available.
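Partitioning and replication can be pictured with a small plain-Java sketch: each key maps to a partition, and each partition has an owner member and a backup member. The assignment rules below are simplifying assumptions made for illustration. The sketch hashes with hashCode and assigns partitions round-robin, whereas a real cluster hashes the serialized key and rebalances as members join and leave. The constant 271 is Hazelcast's default partition count; all other names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class PartitionSketch {
    static final int PARTITION_COUNT = 271; // Hazelcast's default partition count

    // Maps a key to a partition. Simplification: uses hashCode directly.
    static int partitionId(Object key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // Round-robin owner assignment (an assumption of this sketch).
    static String owner(int partitionId, List<String> members) {
        return members.get(partitionId % members.size());
    }

    // The backup lives on the next member, so it differs from the owner
    // whenever the cluster has at least two members.
    static String backup(int partitionId, List<String> members) {
        return members.get((partitionId + 1) % members.size());
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("member-A", "member-B", "member-C");
        for (String key : Arrays.asList("alice", "bob", "carol")) {
            int p = partitionId(key);
            System.out.println(key + " -> partition " + p
                    + ", owner " + owner(p, members)
                    + ", backup " + backup(p, members));
        }
    }
}
```

If the owner of a partition fails, the backup copy on another member still holds the data, which is the property the paragraph above describes as tolerating node failures.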
Build Fault-Tolerant Data Pipelines
Use Hazelcast to build massively parallel data pipelines. You can process data using a rich library of transforms such as windowing, joins and aggregations. Hazelcast keeps processing data without loss even when a node fails, and as soon as you add another node, it starts sharing the computation load. First-class support for Apache Kafka, Hadoop and many other data sources and sinks.
Easy Distributed Coordination
Hazelcast includes a full implementation of the Raft consensus algorithm, exposed through a simple API for building linearizable distributed systems. Use tools like FencedLock, Semaphore and AtomicReference to simplify coordination between distributed applications.
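The idea behind a fenced lock can be illustrated without a cluster. Each lock acquisition yields a monotonically increasing token, and the protected resource rejects any write that carries an older token than one it has already seen. The sketch below is plain Java and simulates token issuance with an AtomicLong; it is not the FencedLock API, and the names TokenSource and FencedResource are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

class FencingSketch {

    // Stand-in for the lock service: every acquisition yields a larger token.
    static final class TokenSource {
        private final AtomicLong next = new AtomicLong(1);

        long acquire() {
            return next.getAndIncrement();
        }
    }

    // A shared resource that rejects writes carrying a token older than
    // the newest one it has already accepted.
    static final class FencedResource {
        private long highestToken = 0;
        private String value;

        synchronized boolean write(long token, String newValue) {
            if (token < highestToken) {
                return false; // stale lock holder
            }
            highestToken = token;
            value = newValue;
            return true;
        }

        synchronized String read() {
            return value;
        }
    }

    public static void main(String[] args) {
        TokenSource lock = new TokenSource();
        FencedResource resource = new FencedResource();

        long first = lock.acquire();  // holder 1 acquires, then stalls (e.g. a long pause)
        long second = lock.acquire(); // the lock is later granted to holder 2

        System.out.println("holder 2 write accepted: " + resource.write(second, "from holder 2"));
        System.out.println("holder 1 write accepted: " + resource.write(first, "from holder 1"));
        System.out.println("final value: " + resource.read());
    }
}
```

The late write from the first holder is rejected, so a client that wrongly believes it still holds the lock cannot overwrite newer data.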
Hazelcast is a single Java archive (JAR) of less than 15 MB. It’s lightweight enough to run on small devices, and you can embed it into your application as just another dependency or deploy it as a standalone cluster. First-class support for Kubernetes is included.
Who is using Hazelcast?
Reserve Bank of Australia
Compare Redis with Hazelcast
Redis and Hazelcast solve many similar use cases, most commonly caching. They are quite different in how they approach things such as cache patterns, clustering & querying.
Build Cloud-Native Microservices
Set up a Hazelcast cluster in Kubernetes, and make use of Hazelcast storage and messaging capabilities in your microservices architectures.