
What Is a Distributed Hash Table?

A distributed hash table is a decentralized data store that stores and looks up data as key-value pairs. Every node in a distributed hash table is responsible for a set of keys and their associated values. Each key is a unique identifier for its associated value and is typically produced by a hash function, for example by hashing the value itself or a name chosen for it. The values can be any form of data.
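As a minimal sketch of the idea of deriving a key from a value with a hash function, the Python snippet below stores values in an ordinary dictionary under the SHA-256 digest of their bytes. The names `key_for`, `put`, and `table` are hypothetical, and the choice of SHA-256 is an assumption made for illustration; a real distributed hash table may derive keys differently.

```python
import hashlib

# A single in-memory dict stands in for the whole table in this sketch.
table = {}


def key_for(value: bytes) -> str:
    """Derive a fixed-format key by hashing the value (SHA-256 is an assumption)."""
    return hashlib.sha256(value).hexdigest()


def put(value: bytes) -> str:
    """Store the value under its derived key and return that key."""
    key = key_for(value)
    table[key] = value
    return key
```

Because the key depends only on the value, any participant that holds the value can recompute the same key without coordinating with anyone else.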

Distributed hash tables are decentralized, so all nodes form the collective system without any centralized coordination. They are generally fault-tolerant because data is replicated across multiple nodes. Distributed hash tables can scale for large volumes of data across many nodes.

Why Is a Distributed Hash Table Used?

Distributed hash tables make it easy to find information in a large collection of data. All keys share a consistent format, and the full set of keys can be partitioned so that the node holding any given key-value pair can be identified quickly. The nodes in a distributed hash table act as peers: each node stores the key partitioning scheme, so when it receives a request for a given key, it can quickly map the key to the node that stores the data and forward the request to that node.
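The lookup path described above can be sketched in Python as follows. This is an illustration under stated assumptions, not any particular product's implementation: the fixed `PARTITION_COUNT`, the round-robin partition assignment in `build_cluster`, and the `Node` class are all hypothetical, and a plain dictionary of nodes stands in for the network.

```python
import hashlib

PARTITION_COUNT = 8  # hypothetical fixed number of partitions


def partition_for(key: str) -> int:
    """Map a key to a partition id using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % PARTITION_COUNT


class Node:
    def __init__(self, name, partition_table, cluster):
        self.name = name
        self.partition_table = partition_table  # partition id -> owner node name
        self.cluster = cluster  # node name -> Node; stands in for the network
        self.store = {}  # key-value pairs this node owns

    def owner_of(self, key):
        """Every node can resolve the owner locally from the shared scheme."""
        return self.partition_table[partition_for(key)]

    def put(self, key, value):
        owner = self.owner_of(key)
        if owner == self.name:
            self.store[key] = value
        else:
            # Forward the request to the node that owns the key.
            self.cluster[owner].put(key, value)

    def get(self, key):
        owner = self.owner_of(key)
        if owner == self.name:
            return self.store.get(key)
        return self.cluster[owner].get(key)


def build_cluster(names):
    """Assign partitions to nodes round-robin and give each node the table."""
    table = {p: names[p % len(names)] for p in range(PARTITION_COUNT)}
    cluster = {}
    for name in names:
        cluster[name] = Node(name, table, cluster)
    return cluster
```

A client can send a request to any node, for example `build_cluster(["a", "b", "c"])["a"].put("user:42", "Ada")`, and a later `get("user:42")` issued against any other node returns the same value, because every node resolves the same owner from the shared partitioning scheme.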

Nodes can also be added to or removed from a distributed hash table without forcing a significant amount of rebalancing of the data in the cluster. Cluster rebalancing, especially for large data sets, is often time-consuming and can degrade performance while it runs. A quick and easy way to grow or shrink a cluster helps ensure that changes in data size do not disrupt the applications that access data in the distributed hash table.
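One common technique for limiting rebalancing is consistent hashing, sketched below in Python. The text above does not say which scheme a given system uses, so treat this as one possible approach; the `HashRing` class, its method names, and the default of 64 virtual nodes per member are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(text: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes  # virtual nodes per member, to even out the load
        self._points = []     # sorted ring positions
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        """Place the node's virtual nodes on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owners[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        """Drop all ring positions owned by the node."""
        self._points = [p for p in self._points if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def node_for(self, key):
        """Return the owner of the first ring position after the key's hash."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]
```

With this structure, adding a node only reassigns the keys that fall just before the new node's ring positions, and every reassigned key moves to the new node; keys owned by the other nodes stay where they are. Removing the node restores the previous assignment.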
