Soby Chacko

Spring Cloud Stream/Data Flow

Philadelphia, PA

Blog Posts by Soby Chacko

Case Study: Relational Database Source and File Sink

This article is part of a blog series that explores the newly redesigned Spring Cloud Stream applications based on Java functions. In this episode, we explore the JDBC supplier and the Spring Cloud Stream source built from it. We will see how we can export data from a relational database and write it to a file by using the file consumer and the corresponding Spring Cloud Stream file sink. We will also look at a few different ways to run the JDBC source and send the data to a file.

Here are all the previous parts of this blog series.

Read more...

Case Study: Reading from a file and writing to MongoDB

This article is part of a blog series that explores the newly redesigned Spring Cloud Stream applications based on Java functions. In this episode, we take a deeper look at the file supplier and its Spring Cloud Stream file source counterpart. We will also see a MongoDB consumer and its corresponding Spring Cloud Stream sink. Finally, we will demonstrate how the file source and the MongoDB sink can be orchestrated together on Spring Cloud Data Flow as a pipeline.

Here are all the previous parts of this blog series.

Read more...

Creating a function for consuming data and generating Spring Cloud Stream Sink applications

This is part 4 of the blog series in which we are introducing Java functions for Spring Cloud Stream applications.

Other parts in the series.

In the last blog in this series, we saw how we can use a java.util.function.Supplier to generate a Spring Cloud Stream source. In this new edition, we will see how a consuming function can be developed and tested using java.util.function.Consumer and java.util.function.Function. Later on, we will briefly explain the generation of a Spring Cloud Stream sink application from this consumer.
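To make the shape of such a consumer concrete, here is a minimal sketch (the bean name and payload type are illustrative, not the ones developed in the post) of a java.util.function.Consumer bean that Spring Cloud Stream can bind as a sink:

    import java.util.function.Consumer;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class LoggingConsumerConfiguration {

        @Bean
        public Consumer<String> logPayload() {
            // Every message arriving on this bean's input binding is handed to the lambda.
            return payload -> System.out.println("Received: " + payload);
        }
    }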

Read more...

Creating a Supplier Function and generating Spring Cloud Stream Source

This is part 3 of the blog series in which we are introducing Java functions for Spring Cloud Stream applications.

Other parts in the series.

In the last two blogs in this series, we provided a general introduction to this new initiative of migrating all the existing Spring Cloud Stream App Starters to functions and the various ways in which we can compose them. In this blog, we continue the series, showing how these functions are developed, tested, and used to generate Spring Cloud Stream applications. In particular, here we are focusing on how to write a supplier function (implementing java.util.function.Supplier) and then generate the corresponding source application for Spring Cloud Stream.
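As a preview of the shape of such a supplier, here is a minimal sketch (the bean name and payload are illustrative, not the post's actual supplier) of a java.util.function.Supplier bean that Spring Cloud Stream polls and publishes from as a source:

    import java.time.Instant;
    import java.util.function.Supplier;

    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class TimeSupplierConfiguration {

        @Bean
        public Supplier<String> timeSupplier() {
            // The binder invokes this supplier on a schedule and sends each result to the output binding.
            return () -> Instant.now().toString();
        }
    }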

Read more...

Stream Processing with Spring Cloud Stream and Apache Kafka Streams. Part 6 - State Stores and Interactive Queries

Part 1 - Programming Model
Part 2 - Programming Model Continued
Part 3 - Data deserialization and serialization
Part 4 - Error Handling
Part 5 - Application Customizations

In this part (the sixth and final one of this series), we are going to look at the ways in which the Spring Cloud Stream binder for Kafka Streams supports state stores and interactive queries in Kafka Streams.

Named State Stores

When you need to maintain state in the application, Kafka Streams lets you materialize that state information into a named state store. Several operations in Kafka Streams require state to be tracked, such as count, aggregate, reduce, and the various windowing operations. In most cases, Kafka Streams uses an embedded key-value store called RocksDB to maintain this state store (unless you explicitly change the store type). By default, the information in the state store is also backed up to a changelog topic in Kafka for fault tolerance.
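As an illustration (not taken from the post), here is a minimal word-count processor that materializes its aggregate into a named, RocksDB-backed state store; the name given to Materialized.as can later be used for interactive queries:

    import java.util.Arrays;
    import java.util.function.Function;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.kstream.Grouped;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Materialized;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class WordCountConfiguration {

        @Bean
        public Function<KStream<String, String>, KStream<String, Long>> wordCount() {
            return input -> input
                    .flatMapValues(value -> Arrays.asList(value.toLowerCase().split("\\W+")))
                    .groupBy((key, word) -> word, Grouped.with(Serdes.String(), Serdes.String()))
                    // "word-counts" is the name of the state store backing this aggregate.
                    .count(Materialized.as("word-counts"))
                    .toStream();
        }
    }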

Read more...

Stream Processing with Spring Cloud Stream and Apache Kafka Streams. Part 5 - Application Customizations

Part 1 - Programming Model
Part 2 - Programming Model Continued
Part 3 - Data deserialization and serialization
Part 4 - Error Handling

In this blog post, we continue our discussion on the support for Kafka Streams in Spring Cloud Stream. We are going to elaborate on the ways in which you can customize a Kafka Streams application.

Customizing the StreamsBuilderFactoryBean

The Kafka Streams binder uses the StreamsBuilderFactoryBean, provided by the Spring for Apache Kafka project, to build the StreamsBuilder object that is the foundation of a Kafka Streams application. This factory bean is a Spring lifecycle bean, and it often needs to be customized before it is started, for various reasons. As described in the previous blog post on error handling, you need to customize the StreamsBuilderFactoryBean if you want to register a production exception handler. Let's say that you have this production exception handler:
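As an illustration of where this customization happens, here is a minimal sketch that registers a production exception handler through a customizer for the factory bean; the handler class and its skip-on-too-large behavior are hypothetical (not the post's example), and the StreamsBuilderFactoryBeanCustomizer hook is assumed to be the one exposed by the Kafka Streams binder:

    import java.util.Map;

    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.errors.RecordTooLargeException;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.errors.ProductionExceptionHandler;
    import org.springframework.cloud.stream.binder.kafka.streams.StreamsBuilderFactoryBeanCustomizer;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class FactoryBeanCustomizerConfiguration {

        // Hypothetical handler: skip records that are too large to produce, fail on anything else.
        public static class IgnoreRecordTooLargeHandler implements ProductionExceptionHandler {

            @Override
            public void configure(Map<String, ?> configs) {
            }

            @Override
            public ProductionExceptionHandlerResponse handle(ProducerRecord<byte[], byte[]> record,
                    Exception exception) {
                return exception instanceof RecordTooLargeException
                        ? ProductionExceptionHandlerResponse.CONTINUE
                        : ProductionExceptionHandlerResponse.FAIL;
            }
        }

        // Assumes the binder's StreamsBuilderFactoryBeanCustomizer hook; the customizer runs before
        // the factory bean is started, so the handler becomes part of the StreamsConfig used to
        // build the Kafka Streams topology.
        @Bean
        public StreamsBuilderFactoryBeanCustomizer streamsBuilderFactoryBeanCustomizer() {
            return factoryBean -> factoryBean.getStreamsConfiguration().put(
                    StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG,
                    IgnoreRecordTooLargeHandler.class);
        }
    }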

Read more...

Stream Processing with Spring Cloud Stream and Apache Kafka Streams. Part 4 - Error Handling

Part 1 - Programming Model
Part 2 - Programming Model Continued
Part 3 - Data deserialization and serialization

Continuing the series on the Spring Cloud Stream binder for Kafka Streams, in this blog post we look at the various error-handling strategies that are available in the Kafka Streams binder.

Error handling in Kafka Streams is largely centered on errors that occur during deserialization on the inbound side and during production on the outbound side.

Handling Deserialization Exceptions
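As a minimal sketch of the plain Kafka Streams setting that this handling builds on (not the binder-level options discussed in the post), the following registers the LogAndContinueExceptionHandler that ships with Kafka Streams, so that records failing deserialization are logged and skipped rather than failing the application:

    import java.util.Properties;

    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;

    public class DeserializationHandlerConfig {

        static Properties streamsProperties() {
            Properties props = new Properties();
            // On a record that cannot be deserialized, log the error and move on to the next record.
            props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
                    LogAndContinueExceptionHandler.class);
            return props;
        }
    }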

Read more...

Stream Processing with Spring Cloud Stream and Apache Kafka Streams. Part 3 - Data deserialization and serialization

Part 1 - Programming Model
Part 2 - Programming Model Continued

Continuing from the previous two blog posts in this series on writing stream processing applications with Spring Cloud Stream and Kafka Streams, we now look at the details of how these applications handle deserialization on the inbound side and serialization on the outbound side.

All three major higher-level types in Kafka Streams - KStream<K,V>, KTable<K,V> and GlobalKTable<K,V> - work with a key and a value.

With the Spring Cloud Stream Kafka Streams support, keys are always deserialized and serialized by using the native Serde mechanism. A Serde is a container object that provides both a deserializer and a serializer.
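For example, here is a small, self-contained sketch showing the two halves of a Serde at work, using the built-in String Serde from the Serdes factory class:

    import org.apache.kafka.common.serialization.Serde;
    import org.apache.kafka.common.serialization.Serdes;

    public class SerdeExample {

        public static void main(String[] args) {
            Serde<String> stringSerde = Serdes.String();

            // The serializer half turns a value into bytes for the wire...
            byte[] bytes = stringSerde.serializer().serialize("some-topic", "hello");

            // ...and the deserializer half turns those bytes back into a value.
            String value = stringSerde.deserializer().deserialize("some-topic", bytes);

            System.out.println("hello".equals(value)); // prints true
        }
    }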

Read more...

Stream Processing with Spring Cloud Stream and Apache Kafka Streams. Part 2 - Programming Model Continued

On the heels of the previous blog in which we introduced the basic functional programming model for writing streaming applications with Spring Cloud Stream and Kafka Streams, in this part, we are going to further explore that programming model.

Let’s look at a few scenarios.

Scenario 1: Single input and output binding

If your application consumes data from a single input binding and produces data to a single output binding, you can use Java's Function interface to do that. Keep in mind that a binding in this sense is not necessarily mapped to a single input Kafka topic, because topics can be multiplexed and attached to a single input binding (with multiple comma-separated topics configured on a single binding - see below for an example). On the outbound side, the binding maps to a single topic here.
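A minimal sketch of this scenario (the bean name and types are illustrative): a single java.util.function.Function bean whose input and output are KStreams, giving the application one input binding and one output binding:

    import java.util.function.Function;

    import org.apache.kafka.streams.kstream.KStream;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class UppercaseProcessorConfiguration {

        @Bean
        public Function<KStream<String, String>, KStream<String, String>> uppercase() {
            // Records from the input binding flow in; the transformed stream goes to the output binding.
            return input -> input.mapValues(value -> value.toUpperCase());
        }
    }

Under the functional binding naming convention, the bindings for this bean would be named uppercase-in-0 and uppercase-out-0, and multiple topics can be multiplexed into the input binding by configuring a comma-separated list as its destination.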

Read more...

Stream Processing with Spring Cloud Stream and Apache Kafka Streams. Part 1 - Programming Model

This is the first in a series of blog posts in which we will look at how stream processing applications are written using Spring Cloud Stream and Kafka Streams.

The Spring Cloud Stream Horsham release (3.0.0) introduces several changes to the way applications can leverage Apache Kafka using the binders for Kafka and Kafka Streams.
One of the major enhancements that this release brings to the table is first-class support for writing applications by using a fully functional programming paradigm. This blog post gives an introduction to how this functional programming model can be used to develop stream processing applications with Spring Cloud Stream and Kafka Streams. In the subsequent blog posts in this series, we will go into more detail.

Read more...