Mark Pollack

Mark Pollack is a software engineer with Pivotal and is the lead of the Spring Cloud Data Flow project. He has been a contributor to many Spring projects dating back to the Spring Framework in 2003 as well as founding the Spring.NET and Spring Data projects.

Recent Blog posts by Mark Pollack

Spring XD 1.1 M1 and 1.0.2 released

Engineering | November 19, 2014 | ...

On behalf of the Spring XD team, I am very pleased to announce the first milestone release of Spring XD 1.1 and the 1.0.2 maintenance release.

Download Links:

1.0.2.RELEASE: zip, 1.1.0.M1 RELEASE: zip

In addition to bug fixes, Spring XD 1.0.2 now supports Apache Hadoop 2.5.1, Pivotal PHD 2.1, and Cloudera CDH 5.1.3.

The 1.1 M1 release includes bug fixes and enhancements as well as several new features:

Run a Spark Application as a batch job
Python based processors and sinks
Kafka source and sink
HDFS sink supports writing to a Kerberized Hadoop Cluster
LDAP Authentication
Modules can be created using Java @Configuration classes as an alternative to XML
Redis sink and JDBC Source
Hadoop distribution updates. Support for Apache Hadoop 2.4.1/2.5.1 (default), Pivotal PHD 2.0/2.1, Cloudera CDH 5.2, and Hortonworks HDP 2.1. Apache Hadoop 2.2 support was removed
Configurable location of Spring XD’s top-level node in ZooKeeper

Spring XD 1.0.1 released

Releases | October 02, 2014 | ...

On behalf of the Spring XD team, I am very pleased to announce the general availability of Spring XD 1.0.1!

This release includes bug fixes and enhancements as well as some new features:

HTTPS access and Authentication to Admin Server
Cluster and Stream views in UI
Configure a location for custom modules
Null sink

You can download the zip distribution or install on OS X using Homebrew. On RHEL/CentOS you can install using yum.

Feedback is very important, so please get in touch with questions and comments via

StackOverflow spring-xd tag
Spring JIRA or GitHub Issues

Spring XD 1.0 GA Released

Releases | July 30, 2014 | ...

On behalf of the Spring XD team, I am very pleased to announce the general availability of Spring XD 1.0! You can download the zip distribution. You can also install on OS X using Homebrew and on RHEL/CentOS using yum.

Spring XD's goal is to be your one stop shop for developing and deploying Big Data Applications. Such applications require a wide range of technologies to address different use-cases while interoperating as a cohesive process. The steps in this process include:

Data collection
Real-time streaming and analytics
Data cleansing
Batch processing (both on and off Hadoop)
Machine learning and exploratory data analysis
Visualization and Reporting
Closed loop analytics between real-time and batch processing

Spring XD 1.0.0.RC1 Released

Releases | July 18, 2014 | ...

The Spring XD team is pleased to announce that Spring XD Release Candidate 1 is now available for download. You can also install Spring XD on OS X using Homebrew and on RHEL/CentOS using yum.

Highlights of this release

Direct binding: Deployments can be configured so that modules co-located in the same container do not send data over the Message Bus. Using this option increases throughput and lowers latency, but it cannot be applied to all deployment topologies.
Stream Deployment State: The state of a stream is calculated throughout the lifetime of the deployment. For example, if a subset of the modules that make up a stream has failed, the overall state of the stream changes from Deployed to Incomplete. Once the failures have been addressed, the state of the stream returns to Deployed.
Improved REST API…
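The stream deployment state described above can be illustrated with a small sketch. This is not Spring XD's implementation: the `ModuleState` enum, the `stream_state` function, and the rule that a stream with no deployed modules counts as "failed" are assumptions made for illustration. Only the Deployed and Incomplete transitions come from the release notes.

```python
from enum import Enum


class ModuleState(Enum):
    """Hypothetical per-module states (not Spring XD's actual types)."""
    DEPLOYED = "deployed"
    FAILED = "failed"


def stream_state(module_states):
    """Derive an overall stream state from the states of its modules.

    Assumed rule: all modules deployed -> "deployed"; some deployed and
    some failed -> "incomplete"; none deployed -> "failed" (assumption).
    """
    states = list(module_states)
    deployed = sum(1 for s in states if s is ModuleState.DEPLOYED)
    if states and deployed == len(states):
        return "deployed"
    if deployed > 0:
        return "incomplete"
    return "failed"


healthy = [ModuleState.DEPLOYED, ModuleState.DEPLOYED]
degraded = [ModuleState.DEPLOYED, ModuleState.FAILED]

print(stream_state(healthy))   # all modules up
print(stream_state(degraded))  # a subset has failed
```

Recomputing the state from the module states each time, rather than storing it, is what lets the stream return to Deployed as soon as the failed modules are redeployed.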

Spring XD 1.0.0.M7 Released

Releases | June 03, 2014 | ...

The Spring XD team is pleased to announce that Spring XD Milestone 7 is now available for download.

Highlights of this release

Transport Data Partitioning: By default, messages are delivered to multiple instances of a stream module in a round-robin manner. However, if a module performs operations that prevent it from consuming arbitrary messages from the stream, you can partition the stream based on its content so that similar messages are always delivered to the same module instance. For example, if a processing module is performing stateful operations on a per-customer basis, the stream…
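The difference between round-robin delivery and content-based partitioning can be sketched as follows. This is a minimal illustration, not Spring XD's partitioning code: the function names, the use of CRC32 as the hash, and the sample messages are all assumptions.

```python
import zlib
from itertools import count


def round_robin_selector(instance_count):
    """Return a function that cycles through instance indexes, ignoring content."""
    counter = count()
    return lambda message: next(counter) % instance_count


def partition_selector(instance_count, key_of):
    """Return a function that maps a message to an instance index by its key."""
    def select(message):
        key = str(key_of(message)).encode("utf-8")
        # crc32 is stable across runs, unlike Python's built-in hash() for str
        return zlib.crc32(key) % instance_count
    return select


# Hypothetical per-customer messages
messages = [
    {"customer": "alice", "amount": 10},
    {"customer": "bob", "amount": 5},
    {"customer": "alice", "amount": 7},
    {"customer": "alice", "amount": 1},
]

rr = round_robin_selector(3)
by_customer = partition_selector(3, lambda m: m["customer"])

round_robin_targets = [rr(m) for m in messages]
partitioned_targets = [by_customer(m) for m in messages]

print(round_robin_targets)   # alice's messages land on different instances
print(partitioned_targets)   # alice's messages always share one instance
```

With the partitioned selector, a module instance that keeps per-customer state sees every message for the customers assigned to it, which round-robin delivery cannot guarantee.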

Spring XD 1.0.0.M6 Released

Releases | April 16, 2014 | ...

The Spring XD team is pleased to announce that Spring XD Milestone 6 is now available for download.

This is our biggest release yet! The team has been hard at work, and Milestone 6 contains a wealth of new features that meet enterprise requirements in terms of reliability, performance, and user experience. Below is a quick Top Ten (in no particular order), but if you check out the release notes you will realize how difficult it is to pick out 10 from the list of 299.

Distributed and Fault Tolerant Runtime: Leader election among multiple xd-admin servers and automatic redeployment of modules to other xd-containers in the case of failure. ZooKeeper is introduced to manage the cluster and its deployment state.
Support for running XD on YARN: Run admin and container nodes on a Hadoop YARN cluster rather than on VMs or physical servers that you need to manage. There are simple configuration and shell scripts that make this process very easy.
Deployment Manifests: When deploying a stream you can provide a deployment manifest that describes how to transform the logical stream definition (e.g. http | hdfs) to a physical deployment on the cluster. You can specify the number of instances of each module to deploy and also a criteria expression (using SpEL) that evaluates each of the available containers in the cluster to determine the best matches for those module instances. This will be an area of active development for the next release as we extend the manifest to include support for data partitioning strategies.
…
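The deployment manifest idea above (an instance count plus a criteria expression evaluated against each container) can be sketched roughly as follows. This is an illustration only: plain Python predicates stand in for SpEL expressions, and the container attributes, the `plan_deployment` function, the manifest layout, and the rule of at most one instance per matching container are all assumptions rather than Spring XD's actual format or behavior.

```python
def plan_deployment(containers, module_count, criteria=lambda c: True):
    """Pick target containers for up to `module_count` instances of one module.

    `criteria` is a Python predicate standing in for the SpEL criteria
    expression; `containers` is a list of attribute dictionaries.
    Assumption: at most one instance is placed on each matching container.
    """
    matches = [c for c in containers if criteria(c)]
    return [c["id"] for c in matches[:module_count]]


# Hypothetical cluster of three containers
containers = [
    {"id": "c1", "groups": {"hadoop"}},
    {"id": "c2", "groups": {"web"}},
    {"id": "c3", "groups": {"hadoop", "web"}},
]

# Hypothetical manifest for the logical stream "http | hdfs"
manifest = {
    "http": {"count": 1, "criteria": lambda c: "web" in c["groups"]},
    "hdfs": {"count": 2, "criteria": lambda c: "hadoop" in c["groups"]},
}

plan = {
    module: plan_deployment(containers, spec["count"], spec["criteria"])
    for module, spec in manifest.items()
}

print(plan)
```

The logical definition stays the same (`http | hdfs`); only the manifest changes how many instances run and where they may be placed.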

Spring Shell 1.1 RC1 Released

Releases | April 03, 2014 | ...

We are pleased to announce the release of Spring Shell 1.1 RC1. Spring Shell is an interactive shell that can be easily extended with commands using a Spring-based programming model.

This is a small bug fix release, but it includes two important improvements: an upgrade to the JLine2 library and a rewrite of the command parser. Check the release notes for more information. Special thanks to Eric Bottard and to those who submitted pull requests.

Downloads | JavaDocs | Reference Documentation | Changelog

Spring XD 1.0.0.M5 Released

Engineering | January 10, 2014 | ...

The Spring XD team is pleased to announce that Spring XD 1.0.0 Milestone 5 is now available for download.

Spring XD makes it easy to solve common big data problems such as data ingestion and export, real-time analytics, and batch workflow orchestration. This release includes several notable new features:

Pre-defined batch jobs for JDBC to HDFS
Many additional job shell commands for controlling and reporting on batch jobs.
Hadoop Dataset Avro sink that uses the Kite SDK
HDFS sink now supports codecs (gzip, snappy, bzip2, lzo) and fine-grained control over file naming
JMS source module now supports Topics in addition to Queues
Improved support for use of Gemfire Locators
Aggregator module for batching that also supports a backing message store for checkpointing the event stream to memory, Redis, or a relational database.
Support for separate control and data transport protocols
XD servers are now built on top of …

Spring XD 1.0 Milestone 2 Released

Releases | August 14, 2013 | ...

Today we are pleased to announce the 1.0 M2 release of Spring XD (download). Spring XD is a unified, distributed, and extensible system for data ingestion, real-time analytics, batch processing, and data export. The project’s goal is to simplify the development of big data applications.

The second milestone release of Spring XD introduces several new features that make it even easier to ingest and process real-time streams of data as well as orchestrate Hadoop-based batch jobs. In this blog post we will cover:

Shell
New sources, sinks and transports
DSL improvements
Batch Jobs

Shell

The most noticeable new feature is the introduction of the interactive shell. The shell gives you an easy way to create new streams and jobs, view metrics, interact with Hadoop, and more. As an introduction to the shell I will redo some of the examples from the M1 blog post.

Start…

Spring Shell 1.1.0.M1 Released

Releases | July 26, 2013 | ...

Dear Spring Community,

I am pleased to announce the first milestone release of Spring Shell 1.1. Spring Shell is an interactive shell that can be easily extended with commands using a Spring-based programming model. This release adds support for testing of commands as well as several bug fixes and general improvements. Many thanks to those who submitted pull requests.