Hadoop Changelog

New in version 2.7.1 (July 7th, 2015)

  • Bug HADOOP-12103: Small refactoring of DelegationTokenAuthenticationFilter to allow code sharing
  • Bug HADOOP-12100: ImmutableFsPermission should not override applyUmask since that method doesn't modify the FsPermission
  • Bug HADOOP-12078: The default retry policy does not handle RetriableException correctly
  • Bug HADOOP-12058: Fix dead links to DistCp and Hadoop Archives pages.
  • Bug HADOOP-11934: Use of JavaKeyStoreProvider in LdapGroupsMapping causes infinite loop
  • Bug HADOOP-11973: Ensure ZkDelegationTokenSecretManager namespace znodes get created with ACLs
  • Bug HADOOP-11966: Variable cygwin is undefined in hadoop-config.sh when executed through hadoop-daemon.sh.
  • Bug HADOOP-11663: Remove description about Java 6 from docs
  • Sub-task HADOOP-7468: HADOOP-7466 hadoop-core JAR contains a log4j.properties file
  • Improvement HADOOP-9384: Update S3 native fs implementation to use AWS SDK to support authorization through roles
  • Bug HADOOP-9658: SnappyCodec#checkNativeCodeLoaded may unexpectedly fail when native code is not loaded
  • Bug HADOOP-11891: OsSecureRandom should lazily fill its reservoir
  • Bug HADOOP-11872: "hadoop dfs" command prints message about using "yarn jar" on Windows (branch-2 only)
  • Bug HADOOP-11730: Regression: s3n read failure recovery broken
  • Bug HADOOP-11802: DomainSocketWatcher thread terminates sometimes after there is an I/O error during requestShortCircuitShm
  • Improvement HADOOP-11851: s3n to swallow IOEs on inner stream close
  • Bug HADOOP-11868: Invalid user logins trigger large backtraces in server log
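HADOOP-11891 above refers to filling a random-byte reservoir lazily. The general idea can be sketched as follows. This is only an illustrative Python analogue, not the Hadoop OsSecureRandom code; the class name, the reservoir size, and the use of os.urandom are all assumptions made for the example.

```python
import os


class LazyRandomReservoir:
    """Buffer of OS random bytes that is filled on first use, not at construction."""

    def __init__(self, size=8192):
        if size <= 0:
            raise ValueError("size must be positive")
        self._size = size      # hypothetical reservoir size in bytes
        self._buf = b""        # stays empty until the first request
        self._pos = 0
        self.fill_count = 0    # how many times the OS source was read

    def _fill(self):
        self._buf = os.urandom(self._size)
        self._pos = 0
        self.fill_count += 1

    def next_bytes(self, n):
        """Return n random bytes, refilling the reservoir only when it runs dry."""
        out = bytearray()
        while len(out) < n:
            if self._pos >= len(self._buf):
                self._fill()
            take = min(n - len(out), len(self._buf) - self._pos)
            out += self._buf[self._pos:self._pos + take]
            self._pos += take
        return bytes(out)
```

Constructing the object costs nothing; the OS entropy source is only read once a caller actually asks for bytes.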

New in version 2.7.0 (April 23rd, 2015)

  • Common:
  • Authentication improvements when using an HTTP proxy server. This is useful when accessing WebHDFS via a proxy server.
  • A new Hadoop metrics sink that allows writing directly to Graphite.
  • Specification work related to the Hadoop Compatible Filesystem (HCFS) effort.
  • HDFS:
  • Support for POSIX-style filesystem extended attributes. See the user documentation for more details.
  • Using the OfflineImageViewer, clients can now browse an fsimage via the WebHDFS API.
  • The NFS gateway received a number of supportability improvements and bug fixes. The Hadoop portmapper is no longer required to run the gateway, and the gateway is now able to reject connections from unprivileged ports.
  • The SecondaryNameNode, JournalNode, and DataNode web UIs have been modernized with HTML5 and JavaScript.
  • YARN:
  • YARN’s REST APIs now support write/modify operations. Users can submit and kill applications through REST APIs.
  • The timeline store in YARN, used for storing generic and application-specific information for applications, supports authentication through Kerberos.
  • The Fair Scheduler supports dynamic hierarchical user queues: user queues are created dynamically at runtime under any specified parent queue.

New in version 2.6.0 (December 27th, 2014)

  • Common:
  • Authentication improvements when using an HTTP proxy server. This is useful when accessing WebHDFS via a proxy server.
  • A new Hadoop metrics sink that allows writing directly to Graphite.
  • Specification work related to the Hadoop Compatible Filesystem (HCFS) effort.
  • HDFS:
  • Support for POSIX-style filesystem extended attributes. See the user documentation for more details.
  • Using the OfflineImageViewer, clients can now browse an fsimage via the WebHDFS API.
  • The NFS gateway received a number of supportability improvements and bug fixes. The Hadoop portmapper is no longer required to run the gateway, and the gateway is now able to reject connections from unprivileged ports.
  • The SecondaryNameNode, JournalNode, and DataNode web UIs have been modernized with HTML5 and JavaScript.
  • YARN:
  • YARN's REST APIs now support write/modify operations. Users can submit and kill applications through REST APIs.
  • The timeline store in YARN, used for storing generic and application-specific information for applications, supports authentication through Kerberos.
  • The Fair Scheduler supports dynamic hierarchical user queues: user queues are created dynamically at runtime under any specified parent queue.

New in version 2.5.1 (September 15th, 2014)

  • Changes since Hadoop 2.5.0:
  • MAPREDUCE-6033. Major bug reported by Yu Gao and fixed by Yu Gao. Users are not allowed to view their own jobs, denied by JobACLsManager.
  • HADOOP-11065. Blocker bug reported by Karthik Kambatla and fixed by Karthik Kambatla. Rat check should exclude **/build/**.
  • HADOOP-11001. Blocker bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scripts). Fix test-patch to work with the git repo.
  • HADOOP-10957. Blocker bug reported by Colin Patrick McCabe and fixed by Colin Patrick McCabe. The globber will sometimes erroneously return a permission denied exception when there is a non-terminal wildcard.
  • HADOOP-10956. Blocker bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scripts). Fix create-release script to include docs and necessary txt files.

New in version 2.5.0 (September 12th, 2014)

  • Major features and improvements:
  • Authentication improvements when using an HTTP proxy server.
  • A new Hadoop Metrics sink that allows writing directly to Graphite.
  • Specification for Hadoop Compatible Filesystem effort.
  • Support for POSIX-style filesystem extended attributes.
  • OfflineImageViewer to browse an fsimage via the WebHDFS API.
  • Supportability improvements and bug fixes to the NFS gateway.
  • Modernized web UIs (HTML5 and JavaScript) for HDFS daemons.
  • YARN's REST APIs support submitting and killing applications.
  • Kerberos integration for YARN's timeline store.
  • FairScheduler allows creating user queues at runtime under any specified parent queue.
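As an illustration of killing an application through YARN's REST APIs, the sketch below builds (but does not send) such a request. The endpoint path and JSON body follow the ResourceManager REST API documentation as best recalled and should be checked against the docs for your release; the host, port, and application ID are hypothetical, and a secured cluster would additionally require authentication that is not shown here.

```python
import json
import urllib.request


def build_kill_request(rm_address, app_id):
    """Build (but do not send) a PUT request asking the ResourceManager to kill an app."""
    # Assumed endpoint: the application state resource of the ResourceManager.
    url = f"http://{rm_address}/ws/v1/cluster/apps/{app_id}/state"
    body = json.dumps({"state": "KILLED"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical ResourceManager address and application ID.
req = build_kill_request("rm.example.com:8088", "application_1400000000000_0001")
# urllib.request.urlopen(req) would send it; omitted because it needs a live cluster.
```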

New in version 2.4.1 (July 9th, 2014)

  • Highlights:
  • CVE-2014-0229: Add privilege checks to HDFS admin sub-commands refreshNamenodes, deleteBlockPool and shutdownDatanode.

New in version 2.4.0 (April 23rd, 2014)

  • Significant enhancements:
  • Support for Access Control Lists in HDFS
  • Native support for Rolling Upgrades in HDFS
  • Usage of protocol-buffers for HDFS FSImage for smooth operational upgrades
  • Complete HTTPS support in HDFS
  • Support for Automatic Failover of the YARN ResourceManager
  • Enhanced support for new applications on YARN with Application History Server and Application Timeline Server
  • Support for strong SLAs in YARN CapacityScheduler via Preemption

New in version 2.3.0 (February 28th, 2014)

  • Highlights:
  • Support for Heterogeneous Storage hierarchy in HDFS.
  • In-memory cache for HDFS data with centralized administration and management.
  • Simplified distribution of MapReduce binaries via HDFS in YARN Distributed Cache

New in version 2.2.0 (October 22nd, 2013)

  • This release has a number of significant highlights compared to Hadoop 1.x:
  • YARN - A general purpose resource management system for Hadoop that supports MapReduce and other data processing frameworks and services
  • High Availability for HDFS
  • HDFS Federation
  • HDFS Snapshots
  • NFSv3 access to data in HDFS
  • Support for running Hadoop on Microsoft Windows
  • Binary Compatibility for MapReduce applications built on hadoop-1.x
  • Substantial amount of integration testing with the rest of the projects in the ecosystem
  • A couple of important points to note while upgrading to hadoop-2.2.0:
  • HDFS - The HDFS community decided to push the symlinks feature out to a future 2.3.0 release; the feature is currently disabled.
  • YARN/MapReduce - Users need to change ShuffleHandler service name from mapreduce.shuffle to mapreduce_shuffle.
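The ShuffleHandler rename noted above usually means editing yarn-site.xml. A minimal sketch of that edit follows, assuming the service name appears in the yarn.nodemanager.aux-services property and in the matching per-service ".class" key; the helper name migrate_shuffle_name is hypothetical, and the property names should be verified against your own configuration.

```python
import xml.etree.ElementTree as ET

OLD, NEW = "mapreduce.shuffle", "mapreduce_shuffle"
AUX = "yarn.nodemanager.aux-services"  # assumed property holding the service list


def migrate_shuffle_name(xml_text):
    """Return yarn-site.xml text with the old shuffle service name replaced."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name, value = prop.find("name"), prop.find("value")
        if name is None or value is None or name.text is None:
            continue
        key = name.text.strip()
        if key == AUX and value.text:
            # The value is a comma-separated list of auxiliary service names.
            services = [s.strip() for s in value.text.split(",")]
            value.text = ",".join(NEW if s == OLD else s for s in services)
        elif key == f"{AUX}.{OLD}.class":
            # The per-service class key embeds the service name as well.
            name.text = f"{AUX}.{NEW}.class"
    return ET.tostring(root, encoding="unicode")
```

Note that ElementTree drops XML comments when re-serializing, so for a heavily commented file a manual edit may be preferable.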

New in version 2.1.0 Beta (September 2nd, 2013)

  • HDFS Snapshots
  • Support for running Hadoop on Microsoft Windows
  • YARN API stabilization
  • Binary Compatibility for MapReduce applications built on hadoop-1.x
  • Substantial amount of integration testing with the rest of the projects in the ecosystem

New in version 1.2.1 (August 5th, 2013)

  • CapacityScheduler incorrectly utilizes extra-resources of queue for high-memory jobs
  • MetricsDynamicMBeanBase has concurrency issues in createMBeanInfo
  • BlockDecompressorStream#decompress throws EOFException instead of returning -1 at EOF
  • Fix hadoop.spec to add task-log4j.properties
  • TestBalancerWithNodeGroup times out
  • DataNode#checkDiskError should not be called on network errors
  • TestPipelinesFailover#testPipelineRecoveryStress fails sporadically
  • Diagnostic logging while loading name/edits files
  • Add extra info to JH files
  • Syslog missing from Map/Reduce tasks
  • JobTracker can show the same job multiple times in the Retired Jobs section
  • CombineInputFormat isn't thread safe, affecting HiveServer
  • Job failed because of JvmManager running into inconsistent state
  • Ampersand in JSPUtil.java is not escaped
  • JobTracker memory leak caused by CleanupQueue reopening FileSystem
  • Deadlock between RenewalTimerTask methods cancel() and run()
  • Save memory by setting capacity, load factor and concurrency level for ConcurrentHashMap in TaskInProgress
  • Delegation Token renewal exception in jobtracker logs

New in version 2.0.5 Alpha (June 10th, 2013)

  • This release delivers a number of critical bug fixes for hadoop-2.x uncovered during integration testing of the previous release.

New in version 1.2.0 (June 10th, 2013)

  • DistCp v2 backported
  • Web services for JobTracker
  • WebHDFS enhancements
  • Extensions of task placement and replica placement policy interfaces
  • Offline Image Viewer backported
  • Namenode more robust in case of edit log corruption
  • Add NodeGroups level to NetworkTopology
  • Add "unset" to Configuration API

New in version 2.0.3 Alpha (February 18th, 2013)

  • QJM (Quorum Journal Manager) for HDFS NameNode HA
  • Multi-resource scheduling (CPU and memory) for YARN
  • YARN ResourceManager Restart
  • Significant stability at scale for YARN (over 30,000 nodes and 14 million applications so far, at time of release)

New in version 1.1.1 (December 4th, 2012)

  • Bug fixes and improvements

New in version 2.0.2 Alpha (October 18th, 2012)

  • This release delivers significant enhancements to HDFS HA. It also includes a significantly more stable version of YARN which, at the time of release, had already been deployed on a 2000-node cluster.

New in version 1.0.4 (October 18th, 2012)

  • Security issue CVE-2012-4449: Hadoop tokens use a 20-bit secret
  • HADOOP-7154 - set MALLOC_ARENA_MAX in hadoop-config.sh to resolve problems with glibc in RHEL-6
  • HDFS-3652 - FSEditLog failure removes the wrong edit stream when storage dirs have same name
  • MAPREDUCE-4399 - Fix (up to 3x) performance regression in shuffle
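To give a sense of scale for the 20-bit secret mentioned in CVE-2012-4449, the arithmetic below counts the possible secrets. The guessing rate is a purely hypothetical figure chosen for illustration; it says nothing about how an actual attack would proceed.

```python
SECRET_BITS = 20
keyspace = 2 ** SECRET_BITS           # number of distinct 20-bit secrets
print(keyspace)                       # 1048576

# Hypothetical attacker speed, assumed only for this illustration.
guesses_per_second = 10_000
worst_case_seconds = keyspace / guesses_per_second
print(round(worst_case_seconds, 1))   # 104.9
```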