
SpringOne2GX 2015 replay: Hadoop Workflows, Distributed YARN Apps and Spring

Recorded at SpringOne2GX 2015
Presenter: Thomas Risberg
Big Data Track

The Hadoop ecosystem is getting bigger and more complex. When you use multiple projects from this ecosystem, you have to deal with the differences in philosophy and usage patterns that these projects promote. The “Spring for Apache Hadoop” project uses several Spring projects, such as Data, Integration, Batch, and Boot, to resolve many of these issues. It simplifies developing for Apache Hadoop by providing a unified configuration model and easy-to-use APIs for HDFS, MapReduce, Pig, and Hive, so you can leverage your existing Java and Spring skills when you start writing applications and workflows for Apache Hadoop. In this presentation we will see how it can make developing workflows with MapReduce, Spark, Hive, and Pig jobs easier, while providing portability across Apache, Cloudera, Hortonworks, and Pivotal distros.

We will also show how Spring Cloud helps when building distributed apps that run on Hadoop YARN and rely on centralized configuration, leader election, distributed locks, and distributed state.
