DaCapo Benchmark Suite Changelog

What's new in DaCapo Benchmark Suite 9.12

Mar 24, 2012

Benchmark additions since 2006-10-MR2:
avrora: AVRORA is a set of simulation and analysis tools in a framework for AVR micro-controllers. The benchmark exhibits a great deal of fine-grained concurrency. The benchmark is courtesy of Ben Titzer (Sun Microsystems) and was developed at UCLA.
batik: Batik is an SVG toolkit produced by the Apache foundation. The benchmark renders a number of SVG files.
h2: h2 is an in-memory database benchmark, using the h2 database produced by h2database.com, and executing an implementation of the TPC-C workload produced by the Apache foundation for its derby project. h2 replaces derby, which in turn replaced hsqldb.
sunflow: Sunflow is a raytracing rendering system for photo-realistic images.
tomcat: Tomcat uses the Apache Tomcat servlet container to run some sample web applications.
tradebeans: Tradebeans runs the Apache daytrader workload "directly" (via EJB) within a Geronimo application server. Daytrader is derived from the IBM Trade6 benchmark.
tradesoap: Tradesoap is identical to the tradebeans workload, except that client/server communication is via SOAP (and the workloads are reduced in size to compensate for the substantially higher overhead).
Note that tradebeans and tradesoap were intentionally added as a pair to allow researchers to evaluate and analyze the overheads and behavior of communicating through a protocol such as SOAP. Tradesoap's "large" configuration uses exactly the same workload as tradebeans' "default" configuration, and tradesoap's "huge" uses exactly the same workload as tradebeans' "large", allowing researchers to directly compare the two systems.
Benchmark deletions:
antlr: Antlr is single-threaded and highly repetitive. The most recent version of jython uses antlr, so antlr remains represented within the DaCapo suite.
bloat: Bloat is not as widely used as our other workloads and the code exhibited some pathologies that were arguably not representative or desirable in a suite that was to be representative of modern Java applications.
chart: Chart was repetitive and used a framework that appears not to be as widely used as most of the other DaCapo benchmarks. The Batik workload has some similarities with chart (both render vector graphics), but is part of a larger, heavily used framework from Apache.
derby: Derby has been replaced by h2, which runs a much richer workload and uses a more widely used and higher performing database engine (derby was not in any previous release, but had been slated for inclusion in this release).
hsqldb: Hsqldb has been replaced by h2, which runs a much richer workload and uses a more widely used and higher performing database engine.
Benchmark updates:
All other benchmarks have been updated to reflect the latest release of the underlying application.
Other Notable Changes:
The packaging of the DaCapo suite has been completely re-worked and the source code is entirely re-organized.
The developers have changed the naming scheme for the releases. Rather than "dacapo-YYYY-MM", we've moved to "dacapo-Y.M-TAG", where TAG is a nickname for the release. Given the theme for this project, we're using musical names, and since this release is our second, we've given this one the nickname "bach". The release can therefore be referred to by its nickname, which rolls off the tongue a little more easily than our old names. Of course we've borrowed this scheme from other projects (such as Ubuntu) which follow a similar pattern.
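As an illustration of the "dacapo-Y.M-TAG" scheme, the following is a minimal sketch of splitting a release name into its parts. The class ReleaseName and its parse method are hypothetical helpers written for this note and are not part of the suite; the sketch also assumes the nickname consists only of letters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DaCapo: splits a "dacapo-Y.M-TAG" name.
public final class ReleaseName {
    // Y digits, a dot, M digits, a hyphen, then the nickname.
    // Assumption: the nickname contains letters only.
    private static final Pattern SCHEME =
            Pattern.compile("dacapo-(\\d+)\\.(\\d+)-([A-Za-z]+)");

    final int year;    // the Y component
    final int month;   // the M component
    final String tag;  // the release nickname

    private ReleaseName(int year, int month, String tag) {
        this.year = year;
        this.month = month;
        this.tag = tag;
    }

    // Parses a name in the new scheme; rejects anything else,
    // including names in the old "dacapo-YYYY-MM" scheme.
    static ReleaseName parse(String name) {
        Matcher m = SCHEME.matcher(name);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a dacapo-Y.M-TAG name: " + name);
        }
        return new ReleaseName(
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2)),
                m.group(3));
    }

    public static void main(String[] args) {
        ReleaseName r = parse("dacapo-9.12-bach");
        System.out.println(r.year + "." + r.month + " (" + r.tag + ")");
    }
}
```

Because the pattern requires the dot between Y and M, a name in the old scheme is rejected rather than silently misread.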
The command-line arguments have been rationalized and now follow POSIX conventions.
Threading has been rationalized. Benchmarks are now characterized in terms of their external and internal concurrency. (For example, a benchmark such as eclipse is single-threaded externally, but internally uses a thread pool.) All benchmarks which are externally multi-threaded now by default run a number of threads scaled to match the available processors, and the number of externally defined threads may also be configured via the "-t" and "-k" command line options, which specify, respectively, the absolute number of external threads and a multiplier against the number of available processors. Some benchmarks are both internally and externally multi-threaded, such as tradebeans and tradesoap, where the number of client threads may be specified externally, but the number of server threads is determined within the server, and cannot be directly controlled by the user.
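The relationship between "-t", "-k" and the processor count can be sketched as follows. This is only an illustration under stated assumptions, not the harness's actual code: the class ExternalThreads and its resolve method are hypothetical, the multiplier is assumed to be an integer, the default multiplier is assumed to be 1, and "-t" is assumed to take precedence when both options are given.

```java
// Hypothetical sketch, not the DaCapo harness: resolves the external thread count.
public final class ExternalThreads {
    private ExternalThreads() {}

    /**
     * @param absolute   value given with "-t", or null if the option is absent
     * @param multiplier value given with "-k", or null if the option is absent
     * @param processors number of available processors
     */
    static int resolve(Integer absolute, Integer multiplier, int processors) {
        if (processors < 1) {
            throw new IllegalArgumentException("processors must be >= 1");
        }
        if (absolute != null) {
            if (absolute < 1) {
                throw new IllegalArgumentException("-t must be >= 1");
            }
            // Assumption: an absolute count overrides any multiplier.
            return absolute;
        }
        // Assumption: with neither option, one thread per processor.
        int k = (multiplier == null) ? 1 : multiplier;
        if (k < 1) {
            throw new IllegalArgumentException("-k must be >= 1");
        }
        return Math.multiplyExact(k, processors);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("default: " + resolve(null, null, cpus));
        System.out.println("-k 2:    " + resolve(null, 2, cpus));
        System.out.println("-t 4:    " + resolve(4, null, cpus));
    }
}
```

This covers only the externally defined threads. For benchmarks such as tradebeans and tradesoap, the server-side thread count is determined within the server and is outside the scope of such a calculation.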
The developers have introduced a "huge" size for a number of benchmarks, which scales the workload to run for much longer and consume significant memory. We have also retired "large" sizes for some benchmarks where "large" was not distinctly different from "default". Thus there are now four sizes: "small", "default", "large", and "huge", and "large" and "huge" are only available for some benchmarks. If you attempt to run a benchmark at an unsupported size you will get an error message.
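The size check described above might look roughly like the sketch below. The class SizeCheck and its methods are hypothetical and written only for illustration; the set of sizes a given benchmark supports is passed in by the caller, because these notes do not list which benchmarks offer "large" or "huge".

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, not the DaCapo harness: validates a requested size.
public final class SizeCheck {
    private SizeCheck() {}

    // The four size names introduced in this release.
    enum Size { SMALL, DEFAULT, LARGE, HUGE }

    // Maps a size name such as "huge" to the enum; rejects unknown names.
    static Size parse(String name) {
        try {
            return Size.valueOf(name.toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("unknown size: " + name);
        }
    }

    // Fails with an error if the benchmark does not offer the requested size.
    static void requireSupported(String benchmark, Size requested, Set<Size> supported) {
        if (!supported.contains(requested)) {
            throw new IllegalArgumentException(
                    benchmark + " does not support size "
                    + requested.name().toLowerCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        // Illustrative only: "example" is a made-up benchmark offering three sizes.
        Set<Size> supported = EnumSet.of(Size.SMALL, Size.DEFAULT, Size.LARGE);
        requireSupported("example", parse("large"), supported);
        System.out.println("large is supported by the example benchmark");
    }
}
```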