In-Memory Processing
In-memory processing is the practice of operating on data entirely in computer memory (e.g., in RAM). This contrasts with techniques that rely on reading and writing data to and from slower media such as disk drives. In-memory processing typically implies large-scale environments where multiple computers are pooled together so their collective RAM can be used as a large and fast storage medium. Since the storage appears as one big, single allocation of RAM, large data sets can be processed all at once, rather than being limited to data sets that fit into the RAM of a single computer.
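The idea of pooled RAM appearing as a single store can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the `PooledStore` class, its methods, and the hash-based placement rule are hypothetical names invented here, each Python dictionary stands in for the RAM of one machine, and no real networking or product API is involved.

```python
import zlib


class PooledStore:
    """Toy model: the memory of several nodes presented as one key-value store."""

    def __init__(self, node_count):
        # Each dict stands in for the RAM of one (hypothetical) machine.
        self.nodes = [dict() for _ in range(node_count)]

    def _owner(self, key):
        # A stable hash decides which node holds a given key.
        return zlib.crc32(key.encode("utf-8")) % len(self.nodes)

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)

    def __len__(self):
        # The caller sees one store, whatever the number of nodes behind it.
        return sum(len(node) for node in self.nodes)


store = PooledStore(node_count=4)
for i in range(1000):
    store.put(f"order-{i}", i * 10)
```

The caller reads and writes through a single interface and never needs to know which node holds a given key, which is the sense in which the pooled memory behaves like one large allocation.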
How Does In-Memory Processing Work?
In-memory processing works by eliminating all slow data accesses and relying exclusively on data stored in RAM. Overall processing performance is not slowed by the latency commonly seen when accessing hard disk drives or SSDs. Software running on one or more computers manages the processing work as well as the data in memory. In the case of multiple computers, the software divides the processing into smaller tasks, which are distributed to each computer to run in parallel. In-memory processing is often done using a technology known as an in-memory data grid (IMDG). One such example is Hazelcast IMDG, which lets users run complex data processing jobs on large data sets across a cluster of hardware servers while maintaining extreme speed.
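The divide-and-distribute step described above can be sketched on a single machine. This is a simplified simulation, not how any particular IMDG is implemented: the function names are hypothetical, worker threads stand in for separate computers, and each chunk is an ordinary in-memory list.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, node_count):
    """Split records into one in-memory chunk per (simulated) node."""
    return [records[i::node_count] for i in range(node_count)]


def local_task(chunk):
    """Work each node does on its own chunk: a partial sum and a count."""
    return sum(chunk), len(chunk)


def distributed_average(records, node_count=4):
    """Combine per-node partial results into one overall average."""
    chunks = partition(records, node_count)
    # Threads stand in for separate machines working in parallel.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_task, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else 0.0


amounts = list(range(1, 101))
average = distributed_average(amounts)
```

Each simulated node touches only its own chunk, and only the small partial results (a sum and a count) are combined at the end. In a real cluster the same pattern keeps most of the data from having to move between machines.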
In-memory processing is extremely popular today because of its huge performance advantage over processing techniques that require reading and writing to slower media. Reads and writes to slower media often result in a data access bottleneck (a phenomenon known as “I/O bound” where “I/O” refers to input/output). Because of its speed, in-memory processing is often described as real-time or at least applied to real-time use cases. And as RAM prices continue to fall, in-memory processing becomes economically compelling for more use cases.
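A back-of-envelope calculation shows why slower media become the bottleneck. The latency figures below are assumed order-of-magnitude values chosen for illustration, not measurements of any particular hardware, and the model ignores caching, batching, and parallel I/O.

```python
# Assumed order-of-magnitude latency per random access, in nanoseconds.
# Illustrative figures only; real hardware varies widely.
LATENCY_NS = {
    "RAM": 100,
    "SSD": 100_000,
    "HDD": 10_000_000,
}


def total_seconds(accesses, latency_ns):
    """Time spent waiting if the accesses happen one after another."""
    return accesses * latency_ns / 1_000_000_000


accesses = 1_000_000
for medium, latency_ns in LATENCY_NS.items():
    print(f"{medium}: {total_seconds(accesses, latency_ns):,.1f} s")
```

Under these assumed figures, one million sequential random accesses cost about 0.1 seconds in RAM, about 100 seconds on an SSD, and about 10,000 seconds on a hard disk drive. A job dominated by such accesses spends nearly all of its time waiting on I/O rather than computing.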
Example Use Cases
There are use cases where the in-memory performance advantage is essential to meeting business requirements. Use cases such as payment processing, fraud detection, predictive maintenance, algorithmic trading, and self-driving cars all need high-speed processing. There is also a set of use cases where the speed increase drives competitive advantage. In other words, high speed is not a strict requirement but provides an edge. Use cases such as business intelligence dashboards, ad hoc data querying, data discovery, and extract-transform-load (ETL) workloads can all be augmented with in-memory performance to drive significant competitive advantage.
In-memory processing is not limited to any specific technology. Example technologies include databases, query engines, and data grids. These technologies differ in their main focus: databases are about storing and retrieving data, query engines are about retrieving data from a variety of sources, and data grids are about running custom applications that perform actions on data. In-memory data grids (IMDGs) like Hazelcast IMDG are especially useful for running many different types of processing in memory to attain the highest speed possible.