Kryo Serializer
A week ago I was called in to help a large online webshop with a problem. They use Hazelcast as a large cache, because Hazelcast can distribute the data over multiple machines, which is a lot more complicated to do with a database. The problem was that they could not keep enough products in memory, and as a consequence their users were suffering from high latencies.
After inspecting their code and JVM settings I found a few improvements, and I’ll discuss one of them in this blog post. The main data structure they keep in memory is the product, which is stored in an IMap. This map is connected to the database using a MapLoader, which loads missing products into memory.
The product is quite a deep object graph with relevant information about the product. When a product is put in an IMap, it is serialized and the resulting byte array is stored in the map entry. They relied on standard Java serialization to serialize the product, but Java serialization doesn’t produce small byte arrays: among other things, it writes class descriptors with fully qualified class names and field names into the stream.
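To get a feel for that overhead, here is a minimal sketch using only the JDK. The `Price` class is a hypothetical stand-in (the real product is a much deeper graph); it holds a `long` and an `int`, so its raw payload is 12 bytes, and the sketch compares that with what `ObjectOutputStream` produces.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class JavaSerializationSizeDemo {

    // Hypothetical stand-in for the real Product class.
    static final class Price implements Serializable {
        private static final long serialVersionUID = 1L;
        final long productId;
        final int cents;

        Price(long productId, int cents) {
            this.productId = productId;
            this.cents = cents;
        }
    }

    // Number of bytes produced by standard Java serialization.
    static int javaSerializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Number of bytes needed for just the two field values (8 + 4).
    static int rawFieldSize(Price price) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(price.productId);
            out.writeInt(price.cents);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        Price price = new Price(42L, 1999);
        System.out.println("Java serialization: " + javaSerializedSize(price) + " bytes");
        System.out.println("Raw field values:   " + rawFieldSize(price) + " bytes");
    }
}
```

The gap is relatively largest for small objects like this one, but the per-class metadata is paid again for every distinct class in a deep graph.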
So I switched to Kryo to do the actual serialization. The Kryo output can also be compressed, to reduce the size of the byte array even more. So I made a Kryo product serializer with a configurable compression setting:
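    import com.esotericsoftware.kryo.Kryo;
    import com.esotericsoftware.kryo.io.Input;
    import com.esotericsoftware.kryo.io.Output;
    import com.hazelcast.nio.ObjectDataInput;
    import com.hazelcast.nio.ObjectDataOutput;
    import com.hazelcast.nio.serialization.StreamSerializer;

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.zip.DeflaterOutputStream;
    import java.util.zip.InflaterInputStream;

    public class ProductKryoSerializer implements StreamSerializer<Product> {

        private final boolean compress;

        // One Kryo instance per thread, because Kryo is not thread safe.
        private static final ThreadLocal<Kryo> kryoThreadLocal = new ThreadLocal<Kryo>() {
            @Override
            protected Kryo initialValue() {
                Kryo kryo = new Kryo();
                kryo.register(Product.class);
                return kryo;
            }
        };

        public ProductKryoSerializer(boolean compress) {
            this.compress = compress;
        }

        // Must be positive and unique among the registered serializers.
        @Override
        public int getTypeId() {
            return 2;
        }

        @Override
        public void write(ObjectDataOutput objectDataOutput, Product product) throws IOException {
            Kryo kryo = kryoThreadLocal.get();

            if (compress) {
                // Serialize into a deflated in-memory buffer, then copy the bytes out.
                ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(16384);
                DeflaterOutputStream deflaterOutputStream = new DeflaterOutputStream(byteArrayOutputStream);
                Output output = new Output(deflaterOutputStream);
                kryo.writeObject(output, product);
                output.close(); // also finishes the deflater stream

                byte[] bytes = byteArrayOutputStream.toByteArray();
                objectDataOutput.write(bytes);
            } else {
                // Write straight to the Hazelcast stream; flush, but do not close it.
                Output output = new Output((OutputStream) objectDataOutput);
                kryo.writeObject(output, product);
                output.flush();
            }
        }

        @Override
        public Product read(ObjectDataInput objectDataInput) throws IOException {
            InputStream in = (InputStream) objectDataInput;
            if (compress) {
                in = new InflaterInputStream(in);
            }
            Input input = new Input(in);
            Kryo kryo = kryoThreadLocal.get();
            return kryo.readObject(input, Product.class);
        }

        @Override
        public void destroy() {
        }
    }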
A Kryo instance is not thread safe and is quite expensive to build, so storing it in a ThreadLocal is a recommended way to make sure that the ProductKryoSerializer is thread safe without creating a new instance for every call.
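The effect of the ThreadLocal can be illustrated with a small JDK-only sketch. `ExpensiveCodec` is a hypothetical stand-in for Kryo, and `ThreadLocal.withInitial` (Java 8 and later) is used instead of the anonymous subclass above; the behaviour is the same: each thread lazily gets its own instance and keeps reusing it.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ThreadLocalDemo {

    // Hypothetical stand-in for an expensive, non-thread-safe object such as Kryo.
    static final class ExpensiveCodec {
        static final AtomicInteger CREATED = new AtomicInteger();

        ExpensiveCodec() {
            CREATED.incrementAndGet();
        }
    }

    // Created lazily, once per thread, on the first call to get().
    static final ThreadLocal<ExpensiveCodec> CODEC =
            ThreadLocal.withInitial(ExpensiveCodec::new);

    // Repeated calls on one thread return the same instance.
    static boolean sameInstanceWithinThread() {
        return CODEC.get() == CODEC.get();
    }

    // A second thread gets its own instance, so nothing is shared.
    static boolean differentInstanceAcrossThreads() {
        ExpensiveCodec mine = CODEC.get();
        ExpensiveCodec[] other = new ExpensiveCodec[1];
        Thread thread = new Thread(() -> other[0] = CODEC.get());
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return other[0] != null && other[0] != mine;
    }

    public static void main(String[] args) {
        System.out.println("same within thread:       " + sameInstanceWithinThread());
        System.out.println("different across threads: " + differentInstanceAcrossThreads());
        System.out.println("instances created:        " + ExpensiveCodec.CREATED.get());
    }
}
```

One thing to keep in mind with this pattern is that the instance lives as long as its thread does, which is usually what you want for long-lived worker threads.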
The ProductKryoSerializer can be configured in Hazelcast 3.0 like this:
    boolean compress = ...;

    Config config = new Config();
    SerializerConfig productSerializer = new SerializerConfig()
            .setTypeClass(Product.class)
            .setImplementation(new ProductKryoSerializer(compress));
    config.getSerializationConfig().addSerializerConfig(productSerializer);

    HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
So switching to a different serialization mechanism is quite easy: the original Product class isn’t changed at all, and only the Hazelcast configuration needs to know about the new serializer.
I received a set of serialized products to measure how much smaller the product byte array becomes. With compression disabled:
Java Serialization (bytes) | Kryo Uncompressed (bytes) | Size reduction (%) |
---|---|---|
12121 | 4263 | 65 |
14361 | 5569 | 61 |
14361 | 5179 | 64 |
19167 | 9800 | 49 |
13559 | 4572 | 66 |
On average, that is a 61% reduction in size compared to Java serialization.
And with compression enabled:
Java Serialization (bytes) | Kryo Compressed (bytes) | Size reduction (%) |
---|---|---|
12121 | 2165 | 82 |
14361 | 3025 | 79 |
14361 | 2755 | 81 |
19167 | 5063 | 74 |
13559 | 2453 | 82 |
On average, that is an 80% reduction in size compared to Java serialization.
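The percentages in both tables are the relative size reduction, (original - new) / original. A small sketch of that arithmetic, using the measured sizes from the tables; the class and method names are only illustrative:

```java
final class SizeReduction {

    // Relative size reduction in percent: how much smaller newBytes is than originalBytes.
    static double reductionPercent(long originalBytes, long newBytes) {
        return 100.0 * (originalBytes - newBytes) / originalBytes;
    }

    // How many serialized values of the new size fit in the space of one original value.
    static double capacityFactor(long originalBytes, long newBytes) {
        return (double) originalBytes / newBytes;
    }

    public static void main(String[] args) {
        // Sizes in bytes, copied from the two tables.
        long[] javaSizes = {12121, 14361, 14361, 19167, 13559};
        long[] kryoPlain = {4263, 5569, 5179, 9800, 4572};
        long[] kryoDeflated = {2165, 3025, 2755, 5063, 2453};

        for (int i = 0; i < javaSizes.length; i++) {
            System.out.printf("%d bytes: uncompressed %.0f%%, compressed %.0f%% (x%.1f capacity)%n",
                    javaSizes[i],
                    reductionPercent(javaSizes[i], kryoPlain[i]),
                    reductionPercent(javaSizes[i], kryoDeflated[i]),
                    capacityFactor(javaSizes[i], kryoDeflated[i]));
        }
    }
}
```

With the rounded average, an 80% reduction corresponds to roughly 1 / (1 - 0.8) = 5 times as many serialized products in the same space. That ignores keys and any per-entry overhead of the map, so the real gain in entries will be somewhat lower.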
As you can see, switching to Kryo resulted in much smaller byte arrays, so in the same amount of memory one can keep many more map entries. So if you get the chance, try switching to Kryo, or one of the many other serialization libraries, to see how much your application can benefit from it. Also make sure that you have some performance tests in place to see how changing the serialization influences performance: compression in particular trades CPU time for memory.
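As a starting point for such a test, here is a rough JDK-only sketch that round-trips a payload through the same `DeflaterOutputStream` / `InflaterInputStream` pair used in the serializer and times it. The payload is a hypothetical, repetitive stand-in for a serialized product, and the timing loop is deliberately crude; for numbers you want to rely on, use real product data and a proper benchmark harness such as JMH.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

final class CompressionCostDemo {

    // Compress a byte array with the default deflate settings.
    static byte[] deflate(byte[] raw) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bytes)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decompress a byte array produced by deflate().
    static byte[] inflate(byte[] compressed) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                bytes.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Hypothetical payload: repetitive text standing in for a serialized product.
    static byte[] samplePayload() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("sku=").append(i).append(";category=books;currency=EUR;");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = samplePayload();
        int rounds = 10_000;

        long start = System.nanoTime();
        byte[] compressed = deflate(raw);
        for (int i = 1; i < rounds; i++) {
            compressed = deflate(raw);
        }
        long compressNanos = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            inflate(compressed);
        }
        long inflateNanos = System.nanoTime() - start;

        System.out.println("raw bytes:        " + raw.length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("avg deflate (us): " + compressNanos / rounds / 1_000.0);
        System.out.println("avg inflate (us): " + inflateNanos / rounds / 1_000.0);
    }
}
```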