Tech Talk Series

Tech conferences and meetups have been canceled or postponed across the world. To make the situation a little more bearable for everybody who misses them, Hazelcast has started a series of virtual tech meetups.

Please join us on Thursdays, starting April 2nd, always at 3:30pm CET / 7:30am PDT / 10:30am EDT / 2:30pm GMT.

The list of topics:

Streaming in the world of legacy applications (Vladimir Schreiner)

Date: Thursday, April 2, 2020

Recording: https://youtu.be/LzuRPXUrQZA

A practical introduction to CDC (Change Data Capture). Architecture, trade-offs, tooling, and demos.

People give common reasons for rearchitecting legacy business applications. At a technical level: speed and scalability. At a business level: the need to gain new real-time insights. These legacy applications commonly center around a central datastore, such as a relational database. Moving away from this architecture requires a massive migration effort. The costs and risks associated with such an effort can be prohibitive for business owners: you can’t just rip out your relational database.

A lower-risk, gradual transition to a target architecture often wins the day. Streaming, caching, and CDC technologies are vital tools for this journey. CDC (Change Data Capture) can turn your legacy data stores into streaming sources. Modern caching technologies can host data in a way that provides speed and scalability. Finally, streaming acts as the glue that can drive new use cases as well as bridge to the old ones.
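As a rough illustration of the idea (not code from the talk), the sketch below replays a stream of row-level change events into an in-memory cache so that the cache mirrors the source table. The `ChangeEvent` shape, the `apply_changes` helper, and the sample data are assumptions made up for this example. Real CDC tools emit richer events, and a plain Python dict stands in for a distributed cache.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change (hypothetical shape, for illustration only)."""
    op: str               # "insert", "update" or "delete"
    key: int              # primary key of the affected row
    row: Optional[dict]   # new row image; None for deletes


def apply_changes(cache: Dict[int, dict], events: Iterable[ChangeEvent]) -> None:
    """Replay change events in order so the cache mirrors the source table."""
    for event in events:
        if event.op == "delete":
            cache.pop(event.key, None)
        else:
            # Inserts and updates both carry the full new row image.
            cache[event.key] = event.row


# A stand-in for the changes captured from the legacy database.
events = [
    ChangeEvent("insert", 1, {"name": "Alice", "balance": 100}),
    ChangeEvent("insert", 2, {"name": "Bob", "balance": 50}),
    ChangeEvent("update", 1, {"name": "Alice", "balance": 75}),
    ChangeEvent("delete", 2, None),
]

cache: Dict[int, dict] = {}
apply_changes(cache, events)
print(cache)  # {1: {'name': 'Alice', 'balance': 75}}
```

The legacy application keeps writing to its database as before, while new consumers read from the cache or subscribe to the same event stream.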

Machine Learning at Scale using distributed stream processing (Marko Topolnik)

Date: Thursday, April 9, 2020

Recording: https://youtu.be/acDl6_c44ro

The capabilities of machine learning are now pretty well understood, and there are great tools to do data science and construct models that answer nontrivial questions about your data. These tools are mostly used in Python.

The key new challenge is making the trained prediction model usable in real time, while the user is interacting with your software. Getting answers from an ML model (this is called inference) takes a lot of CPU and must be done at serious scale. The ML tools are optimized mainly for batch-processing a lot of data at once, and often the implementations aren’t parallelized.

In this talk, I will show an approach that allows you to write a low-latency, auto-parallelized, and distributed stream processing pipeline in Java that seamlessly integrates with a data scientist’s work taken in almost unchanged form from their Python development environment.

The talk includes a live demo using the command line and going through some Python and Java code snippets.

3 Easy Improvements in Your Microservices Architecture (Nicolas Frankel)

Date: Thursday, April 16, 2020

Recording: https://youtu.be/snR2JpTTX4I

While a microservices architecture is more scalable than a monolith, it takes a direct hit on performance.

To cope with that, one performance improvement is to set up a cache. It can be configured for database access, for REST calls or just to store session state across a cluster of server nodes. In this demo-based talk, I’ll show how Hazelcast In-Memory Data Grid can help you in each one of those areas and how to configure it. Hint: it’s much easier than one would expect.
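To make the caching idea concrete, here is a minimal read-through cache sketch, assuming a slow backend call that is safe to memoize. It is not Hazelcast code: `ReadThroughCache` and `slow_lookup` are hypothetical names invented for this example, and a local dict stands in for a cache shared across a cluster.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Serve repeated reads from memory and call the loader only on a miss."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader
        self._entries: Dict[str, str] = {}
        self.misses = 0  # counts how often the slow backend was hit

    def get(self, key: str) -> str:
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._loader(key)
        return self._entries[key]


def slow_lookup(key: str) -> str:
    """Stand-in for a database query or a REST call (hypothetical)."""
    return key.upper()


cache = ReadThroughCache(slow_lookup)
first = cache.get("user-42")   # miss: goes to the backend
second = cache.get("user-42")  # hit: served from memory
print(first, second, cache.misses)  # USER-42 USER-42 1
```

The sketch ignores eviction, expiry, and invalidation, which matter for any real deployment.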

Distributed Snapshots (Viliam Ďurina)

Date: Thursday, April 23, 2020

Recording: https://youtu.be/z5XspIKOI4I

Fault tolerance can be a reason to choose a distributed system even if a single machine can handle the expected load: a distributed system can tolerate failures of its parts, while a system running on a single machine cannot. How can a stream-processing engine guarantee exactly-once semantics?

I’ll describe the Chandy-Lamport algorithm, which can be used to consistently snapshot the global state of a distributed system. I’ll also describe the simplified special case of it that’s used in Jet.

Advanced Kubernetes: Lesson Learned From Building a Managed Service (Hüseyin BABAL)

Date: Thursday, May 7, 2020

Recording: https://youtu.be/qPPe7O5KvI8

In this session, I will explain how to create a multi-tenant environment on Kubernetes to build a managed service.
I will share golden rules for building a managed service on top of Kubernetes, with real-life examples from my experience developing Hazelcast Cloud:

  • Environment isolation
  • Microservice Architecture
  • Monitoring
  • Logging
  • Tracing

Embedded Time Series Storage: A Cookbook (Andrey Pechkurov)

Date: Thursday, May 21, 2020

Recording: https://youtu.be/1fzae--iHYU

Recently, the Hazelcast Management Center team had to build an embedded Java time series storage on top of existing well-known components. In this (very) practical talk, we are going to discuss the technical challenges and design decisions made during the process. The talk should be helpful for those who want to learn more about time series storage and databases.