Although interceptors aren’t typically your first choice for debugging, they can be useful for observing the behavior of your Kafka Streams application, and they’re a valuable addition to your toolbox.
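A minimal sketch of wiring interceptors into a Streams application through its embedded clients; the interceptor class names below are hypothetical placeholders, but the prefix helpers and config keys are standard Kafka APIs:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

import java.util.Properties;

Properties props = new Properties();
// Interceptors are plain consumer/producer configs; consumerPrefix()/producerPrefix()
// route them to the clients embedded inside the Kafka Streams application.
props.put(StreamsConfig.consumerPrefix(ConsumerConfig.INTERCEPTOR_CLASSES_CONFIG),
          "com.example.LoggingConsumerInterceptor");   // hypothetical class name
props.put(StreamsConfig.producerPrefix(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG),
          "com.example.LoggingProducerInterceptor");   // hypothetical class name
```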
2. Application metrics
3. More Kafka Streams debugging techniques
8. Testing a Kafka Streams application
1. Testing a topology
ProcessorTopologyTestDriver
The critical point to keep in mind with this test is that you now have a repeatable test running a record through your entire topology, without the overhead of running Kafka.
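As a rough sketch of what such a test looks like with the publicly released test driver (TopologyTestDriver from kafka-streams-test-utils, the public successor to the internal ProcessorTopologyTestDriver; the topology and topic names here are invented):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.*;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Properties;

import static org.junit.Assert.assertEquals;

public class UppercaseTopologyTest {

    @org.junit.Test
    public void shouldUppercaseValues() {
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
               .mapValues(v -> v.toUpperCase())
               .to("output-topic", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "topology-test");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // never contacted

        // The driver pushes records through the topology synchronously -- no broker needed.
        try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
            TestInputTopic<String, String> input =
                driver.createInputTopic("input-topic", new StringSerializer(), new StringSerializer());
            TestOutputTopic<String, String> output =
                driver.createOutputTopic("output-topic", new StringDeserializer(), new StringDeserializer());

            input.pipeInput("key", "hello");
            assertEquals("HELLO", output.readValue());
        }
    }
}
```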
2. Integration testing
Integration tests with the EmbeddedKafkaCluster should be used sparingly, and only when you have interactive behavior that can be verified only with a live, running Kafka broker.
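When you do need one, the setup looks roughly like this (a sketch assuming the EmbeddedKafkaCluster class from Kafka's own integration-test sources; it isn't a published artifact and its API has shifted across versions, so treat the details as illustrative):

```java
import org.apache.kafka.streams.integration.utils.EmbeddedKafkaCluster;
import org.junit.BeforeClass;
import org.junit.ClassRule;

public class PurchaseFlowIntegrationTest {

    private static final int NUM_BROKERS = 1;

    // Starts real, in-process brokers before the test class runs and tears them down after.
    @ClassRule
    public static final EmbeddedKafkaCluster EMBEDDED_KAFKA = new EmbeddedKafkaCluster(NUM_BROKERS);

    @BeforeClass
    public static void setUpTopics() throws Exception {
        EMBEDDED_KAFKA.createTopic("purchases");
        EMBEDDED_KAFKA.createTopic("patterns");
    }

    // Tests then point StreamsConfig.BOOTSTRAP_SERVERS_CONFIG at
    // EMBEDDED_KAFKA.bootstrapServers() and produce/consume real records.
}
```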
The state stores provided by Kafka Streams meet both the locality and fault-tolerance requirements. They’re local to the defined processors and don’t share access across processes or threads. State stores also use topics for backup and quick recovery.
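A sketch of declaring such a store through the public Stores factory (the store name and serdes are made up; change-logging to a backing topic is on by default):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

import java.util.Collections;

// Persistent key-value store, local to the processor(s) it is attached to.
StoreBuilder<KeyValueStore<String, Long>> storeBuilder =
    Stores.keyValueStoreBuilder(
              Stores.persistentKeyValueStore("share-volume-store"),
              Serdes.String(),
              Serdes.Long())
          // Enabling logging explicitly documents that the store is backed by a
          // compacted changelog topic, which is what allows quick recovery on failure.
          .withLoggingEnabled(Collections.singletonMap("cleanup.policy", "compact"));
```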
6. Joining streams for added insight
Generating keys containing customer IDs to perform joins
Constructing the join
With an inner join, if either record isn’t present, the join doesn’t occur. An outer join always outputs a record: if one side is missing, the joiner receives null for that side.
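A sketch of the difference in code, assuming two keyed purchase streams and a recent Kafka Streams version (the topic names, string value types, and the 20-minute window are invented for illustration):

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.*;

import java.time.Duration;

StreamsBuilder builder = new StreamsBuilder();

KStream<String, String> coffeePurchases = builder.stream("coffee-purchases");
KStream<String, String> electronicsPurchases = builder.stream("electronics-purchases");

ValueJoiner<String, String, String> joiner =
    (coffee, electronics) -> coffee + "/" + electronics;

// Inner join: emits a result only when both sides have a record with the
// same key whose timestamps fall within the 20-minute window.
KStream<String, String> inner = coffeePurchases.join(
    electronicsPurchases, joiner,
    JoinWindows.of(Duration.ofMinutes(20)),
    StreamJoined.with(Serdes.String(), Serdes.String(), Serdes.String()));

// Outer join: emits a result for every record on either side; the missing
// side arrives at the joiner as null.
KStream<String, String> outer = coffeePurchases.outerJoin(
    electronicsPurchases, joiner,
    JoinWindows.of(Duration.ofMinutes(20)),
    StreamJoined.with(Serdes.String(), Serdes.String(), Serdes.String()));
```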
7. Timestamps in Kafka Streams
5. The KTable API
1. The relationship between streams and tables
The record stream
Updates to records or the changelog
2. Record updates and KTable configuration
3. Aggregations and windowing operations
Aggregating share volume by industry
Windowing operations
There are a couple of key points to remember from this section: sessions are not fixed-size windows; rather, the size of a session is driven by the amount of activity within a given time frame. Timestamps in the data determine whether an event fits into an existing session or falls into an inactivity gap.
If you don’t specify how long to retain the window, you get the default retention period of 24 hours.
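A sketch of counting events per session, assuming a keyed transaction stream and a 20-second inactivity gap (both invented for illustration):

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.*;

import java.time.Duration;

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> transactions = builder.stream("stock-transactions");

// Records for the same key whose timestamps fall within the 20-second inactivity
// gap are merged into one session; a larger gap starts a new session.
KTable<Windowed<String>, Long> transactionsPerSession = transactions
    .groupByKey()
    .windowedBy(SessionWindows.with(Duration.ofSeconds(20)))
    .count();
    // Window retention defaults to 24 hours unless overridden,
    // e.g. via Materialized#withRetention.
```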
Session windows aren’t fixed by time but are driven by user activity. Tumbling windows give you a fixed picture of events within the specified time frame. Hopping windows are also of fixed length, but they advance by a smaller interval, so successive windows can contain overlapping records.
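For comparison, a sketch of tumbling and hopping windows over the same hypothetical stream; only the advance interval differs:

```java
import org.apache.kafka.streams.kstream.TimeWindows;

import java.time.Duration;

// Tumbling window: fixed size, advance == size, so windows never overlap.
TimeWindows tumbling = TimeWindows.of(Duration.ofMinutes(1));

// Hopping window: same fixed size, but it advances by a smaller interval,
// so a single record can land in several overlapping windows.
TimeWindows hopping = TimeWindows.of(Duration.ofMinutes(1))
                                 .advanceBy(Duration.ofSeconds(10));

// Used the same way as the session window above:
// transactions.groupByKey().windowedBy(hopping).count();
```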
In conclusion, the key thing to remember is that you can combine event streams (KStream) and update streams (KTable), using local state. Additionally, when the lookup data is of a manageable size, you can use a GlobalKTable. GlobalKTables replicate all partitions to each node in the Kafka Streams application, making all data available, regardless of which partition the key maps to.
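A sketch of a KStream-GlobalKTable join under those assumptions (topic names and string value types are made up); because every instance holds the full table, no co-partitioning or repartitioning is required:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;

StreamsBuilder builder = new StreamsBuilder();

// Lookup data small enough to replicate in full to every application instance.
GlobalKTable<String, String> customers = builder.globalTable("customers");

KStream<String, String> purchases = builder.stream("purchases");

KStream<String, String> enriched = purchases.join(
    customers,
    (purchaseKey, purchaseValue) -> purchaseKey,               // map each stream record to the table key
    (purchaseValue, customerValue) -> customerValue + ":" + purchaseValue); // combine the two values
```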
6. The Processor API
1. The trade-offs of higher-level abstractions vs. more control
The Kafka Streams DSL allows developers to create robust applications with minimal code. The ability to quickly put together processing topologies is an important feature of the DSL.
It allows you to iterate quickly to flesh out ideas for working on your data without getting bogged down in the intricate setup details that some other frameworks may need.
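As a rough sense of scale, a complete (if trivial) topology takes only a few DSL calls; the topic names and the transformation here are invented:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

import java.util.Properties;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "quick-dsl-app");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

StreamsBuilder builder = new StreamsBuilder();
builder.stream("purchases", Consumed.with(Serdes.String(), Serdes.String()))
       .mapValues(value -> value.toUpperCase())               // the entire "business logic"
       .to("patterns", Produced.with(Serdes.String(), Serdes.String()));

KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();
```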
What the Processor API lacks in ease of development, it makes up for in power. You can write custom processors to do almost anything you want.
2. Working with sources, processors, and sinks to create a topology
3. Digging deeper into the Processor API with a stock analysis processor
4. The co-group processor
The Processor API gives you more flexibility at the cost of more code.
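A sketch of what that extra code buys you, using the newer processor interface (Kafka 2.7+); the threshold logic, node names, and topics are invented:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;

public class ThresholdTopology {

    // Every detail -- naming, wiring, forwarding -- is spelled out by hand.
    static class ThresholdProcessor implements Processor<String, Double, String, Double> {
        private ProcessorContext<String, Double> context;

        @Override
        public void init(ProcessorContext<String, Double> context) {
            this.context = context;
        }

        @Override
        public void process(Record<String, Double> record) {
            // Forward only transactions above an arbitrary threshold.
            if (record.value() != null && record.value() > 100.0) {
                context.forward(record);
            }
        }
    }

    public static Topology build() {
        Topology topology = new Topology();
        topology.addSource("Source",
                           Serdes.String().deserializer(), Serdes.Double().deserializer(),
                           "stock-transactions")
                .addProcessor("Threshold", ThresholdProcessor::new, "Source")
                .addSink("Sink", "high-value-transactions",
                         Serdes.String().serializer(), Serdes.Double().serializer(),
                         "Threshold");
        return topology;
    }
}
```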
3. When to use stream processing, and when not to use it
If you need to report on or take action immediately as data arrives, stream processing is a good approach. If you need to perform in-depth analysis or are compiling a large repository of data for later analysis, a stream-processing approach may not be a good fit.
4. Deconstructing the requirements into a graph
5. Applying Kafka Streams to the purchase transaction flow