Kafka Streams builds on stream processing concepts and claims a low barrier to entry. This may be correct, but some of those concepts are trickier to get your head around than others. We will take a look at windows, one of the streaming fundamentals, and see what windowing means in Kafka Streams, how we can implement windows, and how we can test them.
The source code for the Kafka examples can be found here.
If you’re new to Kafka, there are many articles on the core concepts on the Lydtech blog.
And not forgetting The greatest introductory course to Kafka ever! - An Introduction To Kafka with Spring Boot
If we’re talking about windows it makes sense to define what we mean by a window from the outset. A window is an isolated timeframe for events. The duration and behaviour of that window can be specified for your needs, but the window allows you to process the events during that timeframe in relative isolation. The following diagram illustrates the relationship between time, events, and the window.
Here we can see events A-E and a window that covers the timeframe from time t1 to just before time t3. Only the events B, C, & D are within the window, so any processing can be isolated to just those events. The simplest example is a count of events within the window, which in this case would be 3.
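To make the count concrete, here is a minimal plain-Java sketch of the diagram above. It does not use the Kafka Streams API, and the timestamps are assumptions chosen purely for illustration (one time unit t is taken to be 100, so the window from t1 to just before t3 becomes [100, 300)). The names `WindowCountSketch`, `sampleEvents` and `countInWindow` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class WindowCountSketch {

    // Hypothetical timestamps for events A-E, assuming one time unit t = 100.
    static Map<String, Long> sampleEvents() {
        Map<String, Long> events = new LinkedHashMap<>();
        events.put("A", 50L);
        events.put("B", 110L);
        events.put("C", 160L);
        events.put("D", 240L);
        events.put("E", 320L);
        return events;
    }

    // Counts the events whose timestamp falls in [start, end):
    // the start is inclusive, the end is exclusive.
    static long countInWindow(Map<String, Long> events, long start, long end) {
        return events.values().stream()
                .filter(ts -> ts >= start && ts < end)
                .count();
    }

    public static void main(String[] args) {
        // Window from t1 (100) to just before t3 (300): only B, C and D qualify.
        System.out.println(countInWindow(sampleEvents(), 100, 300)); // prints 3
    }
}
```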
What an event means, and its relevance (if any) to other events within a window depends on the type of window selected and the aggregation applied. It could be that an event only ever appears in one window, or can appear in multiple windows.
Windowing requires events to be grouped prior to aggregation, and it is usually best to groupByKey. These examples generally assume that all the events have the same key; in reality, you could end up with hundreds of keys. Each key gets its own window aggregation.
Kafka Streams offers four window types: tumbling, hopping, sliding, and session.
Before we kick off there are a couple of terms we should define as they pop up in most of the window explanations that follow.
Grace period: The grace period determines how long to wait for out-of-order records. Any record that arrives after the window end plus the grace period will be excluded from the window.
Aggregate: Aggregate allows us to perform calculations on events during the window timeframe. Unlike reduce (used to combine multiple values into a single value of the same type), aggregate allows us to return a different type from the input type.
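Both terms can be illustrated with a small plain-Java sketch. This is a simplified model rather than the Kafka Streams API: it assumes "stream time" means the highest record timestamp seen so far, and that a window stops accepting records once stream time reaches the window end plus the grace period. The class and method names (`WindowTermsSketch`, `isAccepted`, `reduceConcat`, `aggregateCount`) are hypothetical.

```java
import java.util.List;

class WindowTermsSketch {

    // Simplified grace check (an assumption for illustration): a late record is still
    // accepted while stream time has not yet reached windowEnd + grace.
    static boolean isAccepted(long windowEnd, long grace, long streamTime) {
        return streamTime < windowEnd + grace;
    }

    // reduce-style: combines the values into a single value of the SAME type
    // (String in, String out).
    static String reduceConcat(List<String> values) {
        return values.stream().reduce((a, b) -> a + b).orElse("");
    }

    // aggregate-style: starts from an initial value and may return a DIFFERENT type
    // (String in, Long out). Here the aggregate is a simple count.
    static Long aggregateCount(List<String> values) {
        Long count = 0L;                 // the initial value
        for (String ignored : values) {
            count = count + 1;           // the aggregation step
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isAccepted(100, 10, 109)); // prints true
        System.out.println(isAccepted(100, 10, 110)); // prints false
        System.out.println(reduceConcat(List.of("B", "C", "D")));   // prints BCD
        System.out.println(aggregateCount(List.of("B", "C", "D"))); // prints 3
    }
}
```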
A Tumbling window is a sequence of consecutive timeframes which do not overlap. As for the name, think of a tumbling window like a tyre-flip exercise, where the tyre represents the window.
Figure 2: Tumbling Window
Because the timeframes in a tumbling window do not overlap, each event will appear in only 1 window. In the example above you can see that events A, B, C are in window w1, whereas events D, E are in window w2. Note that no events occur in w3, but the window still exists.
Tumbling windows start from t0 and are inclusive of the start time but exclusive of the end time, meaning that if we had 2 consecutive windows of duration 100, and the start time was 0, then an event at t99 would be captured by window 1, whereas an event at t100 would be captured in the subsequent window.
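The boundary behaviour described above comes down to integer division. The following is a minimal plain-Java sketch, not the Kafka Streams implementation; it assumes non-negative timestamps and windows aligned to time 0, and the names `TumblingWindowSketch`, `windowStart` and `windowEnd` are hypothetical.

```java
class TumblingWindowSketch {

    // Start of the tumbling window that owns the timestamp (inclusive bound).
    // Assumes timestamp >= 0 and size > 0.
    static long windowStart(long timestamp, long size) {
        return (timestamp / size) * size;
    }

    // End of that window (exclusive bound).
    static long windowEnd(long timestamp, long size) {
        return windowStart(timestamp, size) + size;
    }

    public static void main(String[] args) {
        long size = 100;
        // An event at t99 lands in the first window, [0, 100).
        System.out.println(windowStart(99, size));  // prints 0
        // An event at t100 lands in the subsequent window, [100, 200).
        System.out.println(windowStart(100, size)); // prints 100
        System.out.println(windowEnd(100, size));   // prints 200
    }
}
```

Because each timestamp maps to exactly one start value, each event belongs to exactly one window.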
Let’s take a look at an example of the tumbling window and introduce a couple of other window related concepts such as aggregation and suppression, all exercised using the TopologyTestDriver, a great utility that allows you to push data through a topology (provided as part of the kafka streams test-utils package).
Tests provide great insight into the actual behaviour, and testing topologies is also best practice, something we would want to see in our codebases.
Deeper Dive -> Tumbling Window Demo/Exploration
A hopping window is a window with a fixed duration like the tumbling window, but unlike the tumbling window a new window is created before the previous window closes. This means that events can appear in multiple windows.
Hopping windows, like Tumbling windows, are lower bound inclusive and upper bound exclusive. Unlike Tumbling windows, the Hopping window requires an advance period. The advance defines the period from the beginning of the current window until the next window is created. Fortunately the API stops you doing weird things like specifying an advance greater than the window duration, which would lead to missed events.
Let’s walk through an explanation.
Figure 3: Hopping Window
In the above diagram we can see that we have a window duration of 2t and an advance of 1t. Window w1 is created at t1 and contains events B, C, & D. Window w2 is created at t2, 1t after the creation of window w1, and contains events D & E. Because the windows overlap, event D can be seen in both windows w1 and w2. We can see that window w3 would contain only event E. Again, because of the overlap, event E would appear in windows w2 & w3.
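The overlap can be worked out with a little arithmetic. The sketch below is plain Java and only approximates how a hopping window assigns an event to windows; it is not the Kafka Streams implementation. It assumes non-negative timestamps, windows aligned to time 0, and one time unit t = 100, so a duration of 2t is 200 and an advance of 1t is 100. The event timestamps (250 for D, 350 for E) and the names `HoppingWindowSketch` and `windowStartsFor` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class HoppingWindowSketch {

    // Returns the start times of every window [start, start + size) that contains
    // the timestamp. Start is inclusive, end is exclusive.
    static List<Long> windowStartsFor(long timestamp, long size, long advance) {
        if (advance <= 0 || advance > size) {
            // Mirrors the restriction described in the text: an advance larger than
            // the window duration would leave gaps and miss events.
            throw new IllegalArgumentException("advance must be in (0, size]");
        }
        List<Long> starts = new ArrayList<>();
        // Earliest aligned window start that can still contain the timestamp.
        long start = (Math.max(0, timestamp - size + advance) / advance) * advance;
        while (start <= timestamp) {
            starts.add(start);
            start += advance;
        }
        return starts;
    }

    public static void main(String[] args) {
        // Event D at 250 falls in w1 = [100, 300) and w2 = [200, 400).
        System.out.println(windowStartsFor(250, 200, 100)); // prints [100, 200]
        // Event E at 350 falls in w2 = [200, 400) and w3 = [300, 500).
        System.out.println(windowStartsFor(350, 200, 100)); // prints [200, 300]
    }
}
```

Setting the advance equal to the duration yields exactly one window per event, which is the tumbling window case.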
Let’s take a look at an example of the hopping window and introduce a couple of other window related concepts such as aggregation and suppression, all exercised using our friend, the TopologyTestDriver.
Testing is not only an essential element of any quality implementation, but it is also a great learning environment. Let’s look at the tests…
Deeper Dive -> Hopping Window Demo/Exploration
A Sliding window is a more complex proposition and differs considerably from the other windows we have visited. Whilst Sliding windows also work with a fixed duration, they do not use fixed intervals. Instead, the creation of a window is tied to events entering and leaving the window duration.
Sliding windows will also give you only 1 window per unique set of contents - you will never get 2 windows with the exact same events.
The diagrams for sliding windows may look busy, and depending on your event load you could have lots of active windows, but with correct usage this windowing mechanism can end up being a lot more efficient than alternative window strategies as the windows are only created when they are needed.
Another difference with sliding windows is that unlike the Tumbling and Hopping windows which are forward looking, the sliding window looks backwards.
Figure 4: Sliding Window
w1 is created when event A is processed. The window effectively looks backwards, and A is the only event within that window duration, so w1 contains only event A.
w2 is created when event B is processed. The events within the window duration are A & B.
w3 is created when event A becomes older than the window duration. This window includes event B only.
w4 is created when event C is processed, and will contain events B & C.
w5 is created when event B drops out of the window duration. Window w5 will contain only event C.
Despite the above diagram looking busy, the Sliding window can be an efficient choice as the windows are only created on the processing of an event. If you attempted to recreate this logic with a hopping window and a tiny interval, you would have to process all those windows, and you would have many windows with the same content.
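The five windows from the walkthrough can be reproduced with a small batch simulation. This is a simplified plain-Java model and not how Kafka Streams computes sliding windows internally (which happens incrementally as records arrive). It assumes distinct event timestamps, treats both window bounds as inclusive, and considers two candidate windows per event: one ending at the event and one starting just after it. The timestamps (A = 10, B = 15, C = 22), the duration of 10, and the names `SlidingWindowSketch` and `windowsFor` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SlidingWindowSketch {

    // Returns window start -> timestamps of the events inside [start, start + size],
    // with both bounds inclusive. Empty windows are never created.
    static Map<Long, List<Long>> windowsFor(List<Long> eventTimes, long size) {
        Map<Long, List<Long>> windows = new TreeMap<>();
        for (long t : eventTimes) {
            // Candidate 1: the window that ends at this event (looks backwards).
            // Candidate 2: the window that starts just after this event, i.e. the
            // window that exists once this event has dropped out.
            for (long start : new long[] {t - size, t + 1}) {
                List<Long> contents = new ArrayList<>();
                for (long e : eventTimes) {
                    if (e >= start && e <= start + size) {
                        contents.add(e);
                    }
                }
                if (!contents.isEmpty()) {
                    windows.put(start, contents);
                }
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        // A = 10, B = 15, C = 22, window duration = 10.
        Map<Long, List<Long>> windows = windowsFor(List.of(10L, 15L, 22L), 10);
        // prints {0=[10], 5=[10, 15], 11=[15], 12=[15, 22], 16=[22]}
        System.out.println(windows);
    }
}
```

Reading the output in order gives {A}, {A, B}, {B}, {B, C} and {C}, matching w1 to w5, and no window is created for the empty period after C.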
Let’s take a look at some examples of the sliding window in action. We’ll continue to use the aggregation and suppression in our examples to keep things realistic, but simple. We will again rely on the TopologyTestDriver to demonstrate the topology behaviour.
Deeper Dive -> Sliding Window Demo/Exploration
As well as exercising the tests within the IDE, it is straightforward to view all the windows being exercised in a Kubernetes environment.
The following article guides you through the steps required to achieve this: Kubernetes Streams Deployments