Interview with Hazelcast CTO Talip Ozturk: About Hazelcast 3.0
By Hazelcast Intern Mario Khosla
Hazelcast 3.0
Who cares about the new version of Hazelcast? I do, and you should, too! You may be wondering what Hazelcast even is, but fret not: Talip Ozturk, the founder and CTO, answers that and many other pressing questions buzzing through your head. As a brief roadmap, Talip covers what his software and company are all about, what the previous versions of Hazelcast looked like, and what has changed in Hazelcast 3.0.
What is Hazelcast?
Talip: Hazelcast is an In-Memory NoSQL solution that enables you to scale your application with the help of scalable data structures and processing units, such as MapReduce. Because it can also be embedded, you don’t have to run extra services to scale the data.
Hazelcast 1.0? What did it look like?
The very first idea was to take the well-known Java interfaces and turn various services, such as queues and maps, into distributed versions. That was the initial goal. We would essentially be using the well-known interfaces to make things easier for the users. There would be no new API to learn. I believe the website originally said, “Hazelcast: Distributed Data Structures”.
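To make the "no new API to learn" idea concrete, here is a minimal sketch in plain Java. The class `VisitTracker` and its methods are hypothetical names invented for this illustration; the code depends only on the standard `java.util.Map` and `java.util.concurrent.BlockingQueue` interfaces. The assumption is that a distributed map or queue implementing those same interfaces could be passed in without touching the application code. Local JDK collections stand in for them here, so the example runs without any cluster.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical application class: it only knows the standard JDK interfaces,
// so it does not care whether the collections are local or distributed.
class VisitTracker {
    private final Map<String, String> lastPage;
    private final BlockingQueue<String> events;

    VisitTracker(Map<String, String> lastPage, BlockingQueue<String> events) {
        this.lastPage = lastPage;
        this.events = events;
    }

    // Remember the user's latest page and queue an event describing the visit.
    void visit(String user, String page) {
        lastPage.put(user, page);
        events.offer(user + " -> " + page);
    }

    String lastPageOf(String user) {
        return lastPage.get(user);
    }

    // Returns the oldest queued event, or null if the queue is empty.
    String nextEvent() {
        return events.poll();
    }

    public static void main(String[] args) {
        // Local stand-ins; a distributed Map/BlockingQueue implementation
        // could be injected here instead (assumption, not shown).
        VisitTracker tracker =
                new VisitTracker(new ConcurrentHashMap<>(), new LinkedBlockingQueue<>());
        tracker.visit("ann", "/home");
        System.out.println(tracker.lastPageOf("ann"));
        System.out.println(tracker.nextEvent());
    }
}
```

The design point is the constructor: because it accepts interfaces rather than concrete classes, moving from a single JVM to a cluster would be a change at the call site only.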
How did you get the idea for Hazelcast?
Around 2007, there were similar solutions out there, but they were either very expensive or difficult to use. Also, a lot of them didn’t perform well or scale well, so naturally there were many limitations. I started thinking about what the ideal product should be like. I thought it should be open source, since I wanted it to be available for everyone. I also noticed that people don’t like configuration, so ideally, it would be almost configuration free, with no dependencies. The requirement was a two-minute evaluation: in that time you should be able to download it, run it, and start a four-node cluster.
What makes Hazelcast 3.0 different from its predecessors? What has improved or stayed the same? How does the old architecture compare with the new design? Why are the changes important?
Versions 1 and 2 used the same kind of internal architecture, which was single threaded and had no separation of components. It was one whole thing that didn’t have any modules, abstractions, or components you could replace. It was also very hard to add new features without breaking things. Another technical limitation was that you could not run multiple threads on a map. That didn’t scale well, because you were not able to use the full CPU power of the machine. We also wanted to support other languages like C++ and C#. We had the vision of making everything modular, with several abstraction layers separating things. The biggest thing we did right in version 3.0 was separating partitioning from everything else. Now, you can have a plain Hazelcast with no services like map or queue, and still have a partitioning system that manages the migrations of the partitions. Because the software is now cleanly separated, you can customize how you interact with the whole system. Your Map can be different tomorrow. You can create a hierarchical map service, and it is no longer one giant thing where everything is connected.
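The separation Talip describes can be pictured with a small sketch. Everything below is a hypothetical illustration, not Hazelcast's actual code: `PartitionCore` and `SimpleMapService` are invented names, key placement by `hashCode()` modulo the partition count is a simplifying assumption, and round-robin ownership stands in for whatever assignment and migration logic the real system uses. The point is only that the core knows about partitions and members, while a map service is one replaceable layer on top of it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical partitioning core: knows partitions and members, nothing about maps or queues.
class PartitionCore {
    private final int partitionCount;
    private final List<String> members = new ArrayList<>();

    PartitionCore(int partitionCount, List<String> initialMembers) {
        this.partitionCount = partitionCount;
        this.members.addAll(initialMembers);
    }

    // Assumption for this sketch: a key's partition is its hash modulo the partition count.
    int partitionIdFor(Object key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    // Assumption for this sketch: partitions are assigned to members round-robin.
    String ownerOf(int partitionId) {
        return members.get(partitionId % members.size());
    }

    // A new member changes ownership of some partitions; a real system would migrate their data.
    void addMember(String member) {
        members.add(member);
    }
}

// Hypothetical service layered on the core: it stores entries per partition
// and asks the core where each partition lives.
class SimpleMapService {
    private final PartitionCore core;
    private final Map<Integer, Map<String, String>> dataByPartition = new HashMap<>();

    SimpleMapService(PartitionCore core) {
        this.core = core;
    }

    void put(String key, String value) {
        dataByPartition
                .computeIfAbsent(core.partitionIdFor(key), id -> new HashMap<>())
                .put(key, value);
    }

    String get(String key) {
        Map<String, String> partition = dataByPartition.get(core.partitionIdFor(key));
        return partition == null ? null : partition.get(key);
    }

    String memberHolding(String key) {
        return core.ownerOf(core.partitionIdFor(key));
    }
}
```

Under these assumptions, a queue service or a different map implementation could be written against the same `PartitionCore` without changing it, which is the kind of swap the answer above refers to with "Your Map can be different tomorrow."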
What are the benefits of these changes?
Now we’re able to add any number of services, and we can support any number of languages. Also, the code is more testable and customizable.
Why were the changes necessary?
Our customers themselves were asking for C++ clients, multi-threading, and continuous queries, just to name a few. Changing the architecture meant rewriting 70% of the code. It also meant a whole new set of bugs and annoyances that we hadn’t thought of before.
Any plans for Hazelcast 4.0?
In the future, persistence should be a huge part of Hazelcast. We also want to make upgrades more manageable, both for the code and for the Hazelcast version itself.
In which direction is your company headed? Any other products?
For the short term, we’re working on Hazelcast only. There is still room for improvement. Among NoSQL databases, in-memory databases, and processing engines, Hazelcast touches a lot of stuff already. It has streaming features, but it isn’t a stream processing solution. It has the potential to address these use cases. It is possible to have a separate product for that, too. Making the partitioning system abstract is already a huge move in the right direction for us, because then you can build any partitionable system on top.
To summarize this interview with Talip, Hazelcast 3.0 takes a lot of steps in the right direction. More of the architecture is now abstracted and modularized. Additionally, Hazelcast 3.0 incorporates multi-threading, giving applications access to more of a machine’s full power. Other updates in Hazelcast 3.0 include EntryProcessor (a function allowing faster in-memory operations on a Map, without having to worry about concurrency issues or locks), Lazy Indexing (indexes can be added to entries at any point), and continuous queries, just to name a few. Neat, huh?
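For readers who want a feel for the EntryProcessor idea, here is a rough single-JVM analogue. It is not the Hazelcast API. The helper `executeOnKey` and the class `EntryProcessorSketch` are hypothetical names for this sketch, and the JDK's `ConcurrentHashMap.compute` is swapped in to play the role of "run this function on the entry, atomically, where the entry lives". The caller passes a function instead of doing a get, a modify, and a put under its own lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

class EntryProcessorSketch {

    // Hypothetical helper: applies the processor to the current value of the key
    // (null if absent) as one atomic step and returns the new value.
    static <K, V> V executeOnKey(ConcurrentHashMap<K, V> map, K key, UnaryOperator<V> processor) {
        return map.compute(key, (k, current) -> processor.apply(current));
    }

    // Several threads bump the same counter; no explicit locks in caller code.
    static int concurrentIncrements(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counters = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    executeOnKey(counters, "hits", v -> v == null ? 1 : v + 1);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counters.getOrDefault("hits", 0);
    }

    public static void main(String[] args) {
        // Each update is atomic, so no increments are lost.
        System.out.println(concurrentIncrements(4, 1000)); // prints 4000
    }
}
```

In a distributed setting the appeal would be the same plus one more thing: sending the function to the data avoids shipping the value back and forth, which is presumably where the "faster in-memory operations" in the summary comes from.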