Big Data Processing with Hazelcast IMDG & Apache Spark

Hazelcast IMDG can accelerate Big Data processing in Apache Spark. Modern Java and JVM (Java Virtual Machine) applications often require data storage and compute capabilities beyond what a single JVM can provide.

New technologies are needed that can distribute processing and storage across machines while maintaining a flexible and convenient programming interface for developers.

Two key technologies to help teams build applications for this new era are:

  • Hazelcast IMDG – the open source data grid designed to give Java developers a simple “on ramp” to scalable, production-grade distributed caching and processing
  • Apache Spark – emerging as a leading “Big Data” technology with fast and powerful analysis capabilities and a convenient API

Hazelcast IMDG is well known for its interoperability and is integrated with dozens of other software technologies, including Spark. Hazelcast has clients for several programming languages (currently Java, C#/.NET, C/C++, Python, Node.js and Scala), while Spark supports Java, Scala, Python and R out of the box.

This means the combination of Hazelcast IMDG and Spark can easily be used across stacks that comprise multiple languages.

In the emerging world of large-scale applications and growing processing and data volumes, it has never been more important to choose an open-source architecture that can rise to the new challenges. With production-grade engineering, high performance as standard, and vibrant open source communities backed by enterprise support, Hazelcast IMDG and Apache Spark make a compelling combination.

Products in this Use Case:

Hazelcast IMDG
