
By Vladimir Schreiner

Vladimir is a product manager with an engineering background and deep expertise in stream processing and real-time data pipelines. Ten years of building internal software platforms and development infrastructure have made him passionate about new technologies and finding ways to simplify data processing. Vladimir co-authored two white papers on the topic: Understanding Stream Processing: Fast Processing of Infinite and Big Data, and A Reference Guide to Stream Processing. His tutorial video on stream processing and real-time data pipelines discusses the building blocks of a stream processing pipeline and demonstrates how developers can write a full-blown streaming pipeline in less than a hundred lines of Java code for a variety of applications. Vladimir is also a lecturer with the Czechitas Foundation, whose mission is to inspire women and girls to explore the world of information technology. Czechitas Foundation teaches coding in various programming languages, software testing, and data analysis.


Apr 1, 2020


Tech Talk Series

Tech conferences and meetups have been canceled or postponed across the world. To make the situation a little easier for everybody who misses them, Hazelcast has started a series of virtual tech meetups.

Please join us on Thursdays, starting April 2nd. The talks always start at 3:30pm CET / 7:30am PDT / 10:30am EDT / 2:30pm GMT.

The list of topics:

Streaming in the world of legacy applications (Vladimir Schreiner)

Date: Thursday, April 2, 2020

Recording: https://youtu.be/LzuRPXUrQZA

A practical introduction to CDC (Change Data Capture). Architecture, trade-offs, tooling, and demos.

People cite common reasons for rearchitecting legacy business applications. At the technical level, it is speed and scalability. At the business level, it is the need to gain new real-time insights. These legacy applications commonly revolve around a central datastore, such as a relational database. Moving away from this architecture requires a massive migration effort. The costs and risks of such an effort can be prohibitive for business owners: you can’t just rip out your relational database.

A lower-risk, gradual transition to the target architecture often wins the day. Streaming, caching, and CDC technologies are vital tools for this journey. CDC (Change Data Capture) can turn your legacy data stores into streaming sources. Modern caching technologies can host data in a way that provides speed and scalability. Finally, streaming acts as the glue that can drive new use cases as well as bridge the old ones.
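To make the idea of CDC feeding a cache more concrete, here is a minimal sketch in plain Java. It assumes a CDC tool delivers row-level change events in log order; the `ChangeEvent` shape, the `Op` enum, and the class name are hypothetical and do not come from any specific CDC product or from the talk. A `ConcurrentHashMap` stands in for a distributed cache.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the event shape is illustrative, not a real CDC tool's API.
class CdcCacheSketch {

    enum Op { INSERT, UPDATE, DELETE }

    // One row-level change, as a CDC tool might emit it from the database log.
    static final class ChangeEvent {
        final Op op;
        final long key;
        final String value; // new row value; null for DELETE

        ChangeEvent(Op op, long key, String value) {
            this.op = op;
            this.key = key;
            this.value = value;
        }
    }

    // Stands in for a distributed cache; a local map keeps the sketch self-contained.
    private final Map<Long, String> cache = new ConcurrentHashMap<>();

    // Apply events in log order so the cache converges to the table's state.
    void apply(ChangeEvent e) {
        switch (e.op) {
            case INSERT:
            case UPDATE:
                cache.put(e.key, e.value);
                break;
            case DELETE:
                cache.remove(e.key);
                break;
        }
    }

    String get(long key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        CdcCacheSketch sketch = new CdcCacheSketch();
        List<ChangeEvent> stream = List.of(
                new ChangeEvent(Op.INSERT, 1L, "alice"),
                new ChangeEvent(Op.INSERT, 2L, "bob"),
                new ChangeEvent(Op.UPDATE, 1L, "alice-v2"),
                new ChangeEvent(Op.DELETE, 2L, null));
        stream.forEach(sketch::apply);
        System.out.println(sketch.get(1L)); // prints alice-v2
        System.out.println(sketch.get(2L)); // prints null
    }
}
```

The legacy application keeps writing to its database unchanged; only the consumer of the change stream is new, which is what makes the transition gradual.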

Machine Learning at Scale using distributed stream processing (Marko Topolnik)

Date: Thursday, April 9, 2020

Recording: https://youtu.be/acDl6_c44ro

The capabilities of machine learning are now pretty well understood, and there are great tools to do data science and construct models that answer nontrivial questions about your data. These tools are mostly used in Python.

The key new challenge is making the trained prediction model usable in real time, while the user is interacting with your software. Getting answers from an ML model (this is called inference) takes a lot of CPU and must be done at serious scale. The ML tools are optimized mainly for batch-processing a lot of data at once, and often the implementations aren’t parallelized.

In this talk, I will show an approach that allows you to write a low-latency, auto-parallelized, and distributed stream processing pipeline in Java that seamlessly integrates with a data scientist’s work taken in almost unchanged form from their Python development environment.

The talk includes a live demo using the command line and going through some Python and Java code snippets.

3 Easy Improvements in Your Microservices Architecture (Nicolas Frankel)

Date: Thursday, April 16, 2020

Recording: https://youtu.be/snR2JpTTX4I

While a microservices architecture is more scalable than a monolith, it takes a direct hit on performance.

To cope with that, one performance improvement is to set up a cache. It can be configured for database access, for REST calls or just to store session state across a cluster of server nodes. In this demo-based talk, I’ll show how Hazelcast In-Memory Data Grid can help you in each one of those areas and how to configure it. Hint: it’s much easier than one would expect.
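As a rough illustration of caching in front of a slow call, the following sketch shows a read-through cache in plain Java. It is not the Hazelcast API and not code from the talk: `ReadThroughCacheSketch`, its `loader` function, and `loadCount` are hypothetical names, and a local `ConcurrentHashMap` stands in for a cache shared across a cluster.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of a read-through cache; names are illustrative only.
class ReadThroughCacheSketch<K, V> {

    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;           // the slow call: database query, REST request, ...
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Return the cached value, invoking the loader only on a miss.
    V get(K key) {
        return store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Drop an entry, for example after the underlying record changed.
    void invalidate(K key) {
        store.remove(key);
    }

    // How many times the slow call actually ran.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        ReadThroughCacheSketch<String, Integer> cache =
                new ReadThroughCacheSketch<>(String::length); // stand-in for a slow lookup
        cache.get("hazelcast");
        cache.get("hazelcast");                // served from the cache
        System.out.println(cache.loadCount()); // prints 1
        cache.invalidate("hazelcast");
        cache.get("hazelcast");                // reloaded after invalidation
        System.out.println(cache.loadCount()); // prints 2
    }
}
```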

Distributed Snapshots (Viliam Ďurina)

Date: Thursday, April 23, 2020

Recording: https://youtu.be/z5XspIKOI4I

Fault tolerance can be a reason to choose a distributed system even if a single machine can handle the expected load: a distributed system can tolerate failures of its parts, while a system running on a single machine cannot. How can a stream-processing engine guarantee exactly-once semantics?

I’ll describe the Chandy-Lamport algorithm, which can be used to take a consistent snapshot of the global state of a distributed system. I’ll also describe the simplified special case of it that’s used in Jet.
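For readers who have not met the algorithm before, here is a small single-threaded simulation of the general Chandy-Lamport marker protocol, reduced to two nodes with one FIFO channel in each direction. It is a sketch under those assumptions, not the variant used in Jet; the class and method names are hypothetical. Each node records its own balance when it starts the snapshot or sees its first marker, and captures messages that arrive on the inbound channel between recording its state and receiving the marker.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical two-node simulation of the Chandy-Lamport marker protocol.
class SnapshotSketch {

    static final int MARKER = -1; // sentinel; real transfers are positive amounts

    static final class Node {
        int balance;
        Integer recordedBalance;                 // null until local state is recorded
        boolean recordingInbound;                // true while capturing in-flight messages
        final List<Integer> inFlight = new ArrayList<>();
        Deque<Integer> out;                      // FIFO channel to the peer

        Node(int balance) {
            this.balance = balance;
        }

        void send(int amount) {
            balance -= amount;
            out.addLast(amount);
        }

        // Initiator: record own state, send a marker, start recording the inbound channel.
        void startSnapshot() {
            recordedBalance = balance;
            recordingInbound = true;
            out.addLast(MARKER);
        }

        void receive(int msg) {
            if (msg == MARKER) {
                if (recordedBalance == null) {
                    // First marker seen: record state and forward the marker.
                    // The channel the marker arrived on is recorded as empty.
                    recordedBalance = balance;
                    out.addLast(MARKER);
                }
                recordingInbound = false;
            } else {
                balance += msg;
                if (recordingInbound) {
                    inFlight.add(msg);           // sent before the peer's marker: part of the snapshot
                }
            }
        }

        // Recorded local state plus recorded channel state.
        int snapshotTotal() {
            return recordedBalance + inFlight.stream().mapToInt(Integer::intValue).sum();
        }
    }

    public static void main(String[] args) {
        Node p = new Node(100);
        Node q = new Node(50);
        Deque<Integer> pToQ = new ArrayDeque<>();
        Deque<Integer> qToP = new ArrayDeque<>();
        p.out = pToQ;
        q.out = qToP;

        q.send(20);                    // in flight when the snapshot starts
        p.startSnapshot();
        p.send(10);                    // sent after the marker: not part of the snapshot
        p.receive(qToP.pollFirst());   // the 20, captured as channel state
        q.receive(pToQ.pollFirst());   // marker: q records its balance of 30
        q.receive(pToQ.pollFirst());   // the 10
        p.receive(qToP.pollFirst());   // marker: p stops recording

        System.out.println(p.snapshotTotal() + q.snapshotTotal()); // prints 150
    }
}
```

The snapshot total equals the 150 units the system started with, even though the two balances were recorded at different moments and a transfer was in flight.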

Advanced Kubernetes: Lesson Learned From Building a Managed Service (Hüseyin BABAL)

Date: Thursday, May 7, 2020

Recording: https://youtu.be/qPPe7O5KvI8

In this session, I will show how to create a multi-tenant environment on Kubernetes to build a managed service.
I will share golden rules for building a managed service on top of Kubernetes, with real-life examples from the experience I gained during Hazelcast Cloud development:

Environment isolation
Microservice Architecture
Monitoring
Logging
Tracing

Embedded Time Series Storage: A Cookbook (Andrey Pechkurov)

Date: Thursday, May 21, 2020

Recording: https://youtu.be/1fzae--iHYU

Recently, the Hazelcast Management Center team had to build embedded Java time series storage on top of existing, well-known components. In this (very) practical talk, we are going to discuss the technical challenges and the design decisions made during the process. The talk should be helpful for those who want to learn more about time series storage and databases.
