We are pleased to announce the Spring for Apache Hadoop 2.0 M5 milestone release. We are moving closer to a release candidate, so this is a good time to highlight what is new in the 2.0 version and how it compares to 1.0.
Spring for Apache Hadoop 1.0 primarily targets HDFS and MapReduce, running on either MapReduce v1 or MapReduce v2 (YARN). The default distribution is Apache Hadoop 1.2.1, with additional "flavors" provided for other distributions: Hadoop 2.2.0, Pivotal HD 1.1, Cloudera CDH4 (MR1 or MR2 YARN), and Hortonworks HDP 1.3.
The main focus of Spring for Apache Hadoop 2.0 is to add YARN application development support while continuing to improve the HDFS and MapReduce support. The default distribution for the 2.0 releases going forward is Apache Hadoop 2.2.0.
We continue to provide version-specific artifacts with their respective transitive dependencies in the Spring IO milestone repository:
The most important enhancements in the Spring for Apache Hadoop 2.0 version:
You can see many of these new YARN features in use in the YARN samples.
We have also added a spring-data-hadoop-store sub-project to provide better support for writing data to HDFS. It uses DataWriter and DataReader implementations that support formats such as text files and SequenceFiles, with or without compression. This new sub-project also integrates with the Dataset support from the Kite SDK.
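To make the writer/reader idea concrete, here is a minimal sketch of writing and reading line-oriented text with optional compression. It is only an illustration: it uses plain Java I/O against a local temporary file rather than HDFS, and the names `StoreSketch`, `Codec`, `writeLines`, `readLines` and `roundTrip` are hypothetical. None of them are part of the spring-data-hadoop-store API, whose actual DataWriter and DataReader types are documented on the project page.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; not a Spring for Apache Hadoop class.
final class StoreSketch {

    // Hypothetical switch between uncompressed and gzip-compressed text.
    enum Codec { NONE, GZIP }

    // Writes each entry as one UTF-8 line, optionally through a gzip stream.
    static void writeLines(Path file, Codec codec, List<String> lines) {
        try (OutputStream raw = Files.newOutputStream(file);
             OutputStream out = codec == Codec.GZIP ? new GZIPOutputStream(raw) : raw;
             BufferedWriter writer = new BufferedWriter(
                     new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the file back line by line, decompressing when the codec asks for it.
    static List<String> readLines(Path file, Codec codec) {
        List<String> result = new ArrayList<>();
        try (InputStream raw = Files.newInputStream(file);
             InputStream in = codec == Codec.GZIP ? new GZIPInputStream(raw) : raw;
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    // Writes the lines to a temporary file, reads them back, and cleans up.
    static List<String> roundTrip(Codec codec, List<String> lines) {
        try {
            Path file = Files.createTempFile("store-sketch", ".dat");
            try {
                writeLines(file, codec, lines);
                return readLines(file, codec);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("first entry", "second entry");
        System.out.println(roundTrip(Codec.NONE, lines));
        System.out.println(roundTrip(Codec.GZIP, lines));
    }
}
```

The point of the sketch is the shape of the design: the caller works with entries, while the choice of format and compression is made once when the writer or reader is set up.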
For more project-specific information, please see the project page.