Hadoop Changelog

New in version 2.5.1 (September 15th, 2014)

  • Changes since Hadoop 2.5.0:
  • MAPREDUCE-6033. Major bug reported by Yu Gao and fixed by Yu Gao. Users are not allowed to view their own jobs, denied by JobACLsManager.
  • HADOOP-11065. Blocker bug reported by Karthik Kambatla and fixed by Karthik Kambatla. Rat check should exclude **/build/**.
  • HADOOP-11001. Blocker bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scripts). Fix test-patch to work with the git repo.
  • HADOOP-10957. Blocker bug reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe. The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard.
  • HADOOP-10956. Blocker bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scripts). Fix create-release script to include docs and necessary txt files.

New in version 2.5.0 (September 12th, 2014)

  • Major features and improvements:
  • Authentication improvements when using an HTTP proxy server.
  • A new Hadoop Metrics sink that allows writing directly to Graphite.
  • Specification for Hadoop Compatible Filesystem effort.
  • Support for POSIX-style filesystem extended attributes.
  • OfflineImageViewer to browse an fsimage via the WebHDFS API.
  • Supportability improvements and bug fixes to the NFS gateway.
  • Modernized web UIs (HTML5 and JavaScript) for HDFS daemons.
  • YARN's REST APIs support submitting and killing applications.
  • Kerberos integration for YARN's timeline store.
  • FairScheduler allows creating user queues at runtime under any specified parent queue.

New in version 2.4.1 (July 9th, 2014)

  • Highlights:
  • CVE-2014-0229: Add privilege checks to HDFS admin sub-commands refreshNamenodes, deleteBlockPool and shutdownDatanode.

New in version 2.4.0 (April 23rd, 2014)

  • Significant enhancements:
  • Support for Access Control Lists in HDFS
  • Native support for Rolling Upgrades in HDFS
  • Usage of protocol-buffers for HDFS FSImage for smooth operational upgrades
  • Complete HTTPS support in HDFS
  • Support for Automatic Failover of the YARN ResourceManager
  • Enhanced support for new applications on YARN with Application History Server and Application Timeline Server
  • Support for strong SLAs in YARN CapacityScheduler via Preemption

New in version 2.3.0 (February 28th, 2014)

  • Highlights:
  • Support for Heterogeneous Storage hierarchy in HDFS.
  • In-memory cache for HDFS data with centralized administration and management.
  • Simplified distribution of MapReduce binaries via HDFS in YARN Distributed Cache.

New in version 2.2.0 (October 22nd, 2013)

  • This release has a number of significant highlights compared to Hadoop 1.x:
  • YARN - A general-purpose resource management system for Hadoop that supports MapReduce and other data processing frameworks and services
  • High Availability for HDFS
  • HDFS Federation
  • HDFS Snapshots
  • NFSv3 access to data in HDFS
  • Support for running Hadoop on Microsoft Windows
  • Binary Compatibility for MapReduce applications built on hadoop-1.x
  • Substantial amount of integration testing with the rest of the projects in the ecosystem
  • A couple of important points to note while upgrading to hadoop-2.2.0:
  • HDFS - The HDFS community decided to push the symlinks feature out to a future 2.3.0 release; the feature is currently disabled.
  • YARN/MapReduce - Users need to change ShuffleHandler service name from mapreduce.shuffle to mapreduce_shuffle.
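
The ShuffleHandler rename above can be illustrated with a small sketch. It assumes the service name appears in a comma-separated list of auxiliary service names (in YARN this is normally the yarn.nodemanager.aux-services property, which this changelog does not name). The class and method names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ShuffleServiceName {
    // Names taken from the upgrade note above.
    static final String OLD_NAME = "mapreduce.shuffle";
    static final String NEW_NAME = "mapreduce_shuffle";

    // Hypothetical helper: rewrites a comma-separated list of auxiliary
    // service names, replacing only exact matches of the old name.
    static String migrate(String auxServices) {
        List<String> out = new ArrayList<>();
        for (String name : auxServices.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as a trailing comma
            }
            out.add(trimmed.equals(OLD_NAME) ? NEW_NAME : trimmed);
        }
        return String.join(",", out);
    }

    public static void main(String[] args) {
        // "other_service" is a made-up name used to show that unrelated
        // entries are left untouched.
        System.out.println(migrate("mapreduce.shuffle, other_service"));
    }
}
```

Any configuration key derived from the service name would need the same rename; check the configuration reference for the release you are upgrading to.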

New in version 2.1.0 Beta (September 2nd, 2013)

  • HDFS Snapshots
  • Support for running Hadoop on Microsoft Windows
  • YARN API stabilization
  • Binary Compatibility for MapReduce applications built on hadoop-1.x
  • Substantial amount of integration testing with the rest of the projects in the ecosystem

New in version 1.2.1 (August 5th, 2013)

  • CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
  • MetricsDynamicMBeanBase has concurrency issues in createMBeanInfo
  • BlockDecompressorStream#decompress throws EOFException instead of returning -1 at EOF
  • Fix hadoop.spec to add task-log4j.properties
  • TestBalancerWithNodeGroup times out
  • DataNode#checkDiskError should not be called on network errors
  • TestPipelinesFailover#testPipelineRecoveryStress fails sporadically
  • Diagnostic logging while loading name/edits files
  • Add extra info to JH files
  • Syslog missing from Map/Reduce tasks
  • JT can show the same job multiple times in Retired Jobs section
  • CombineInputFormat isn't thread safe, affecting HiveServer
  • Job failed because of JvmManager running into inconsistent state
  • Ampersand in JSPUtil.java is not escaped
  • JobTracker memory leak caused by CleanupQueue reopening FileSystem
  • Deadlock between RenewalTimerTask methods cancel() and run()
  • Save memory by setting capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress
  • Delegation Token renewal exception in jobtracker logs
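
The ConcurrentHashMap item in the list above refers to a general sizing technique, sketched below. This is not the actual TaskInProgress patch: the class name, the helper, and all three values are hypothetical. It only shows the standard three-argument java.util.concurrent.ConcurrentHashMap constructor, which lets a caller that expects few entries and few concurrent writers request a smaller table than the defaults.

```java
import java.util.concurrent.ConcurrentHashMap;

public class SizedMapSketch {
    // Hypothetical values chosen for illustration only.
    static final int INITIAL_CAPACITY = 4;
    static final float LOAD_FACTOR = 0.75f;
    static final int CONCURRENCY_LEVEL = 1;

    // Builds a map sized for a small number of entries and writers
    // instead of relying on the no-argument constructor's defaults.
    static ConcurrentHashMap<String, Integer> newSmallMap() {
        return new ConcurrentHashMap<>(INITIAL_CAPACITY, LOAD_FACTOR, CONCURRENCY_LEVEL);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> attempts = newSmallMap();
        attempts.put("attempt_1", 1); // made-up key
        System.out.println(attempts.get("attempt_1"));
    }
}
```

The map still grows on demand, so undersizing affects only memory use and resize cost, not correctness.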

New in version 2.0.5 Alpha (June 10th, 2013)

  • This release delivers a number of critical bug fixes for hadoop-2.x uncovered during integration testing of the previous release.