DatasetExplorer is a free and open-source Java-based app for exploring and searching in large collections of annotated images.
DatasetExplorer is built on the Galatee library, which provides fast, convenient and easy-to-use GUI components for browsing and searching collections of annotated images: https://code.google.com/p/galatee/
As its name indicates, the DatasetExplorer application is mainly intended for exploring learning datasets in the context of automatic image annotation (http://njames.trevize.net/docs-repo/public/publications/jamesHudelotCIAM09.pdf).
DatasetExplorer is a cross-platform utility capable of running on any operating system that comes with Java support (e.g. Mac OS X, Windows, Linux).
Here are some key features of "DatasetExplorer":
a dataset can be represented by:
· a directory (possibly with sub-directories): for instance, Caltech 101 (http://www.vision.caltech.edu/Image_Datasets/Caltech101/) or the University of Washington Image Database (http://www.cs.washington.edu/research/imagedatabase/).
· a TAR archive: for instance, the ImageNet image database. The archive is not unpacked; the Galatee library uses Apache Commons VFS to read data directly from the tar file.
· a text file that contains file paths to images: the file can contain relative paths, in which case you have to specify a path prefix.
· a text file that contains URIs to images: the files are downloaded to a temporary directory. As with file paths, you can specify a URI prefix so that the file can contain relative paths, and you can add textual annotations in the text file using the same format.
· a location that contains an instance of an IIDF model.
· a PascalVOC dataset.
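The prefix mechanism described for text-file datasets can be illustrated with a minimal sketch. The class ImageListResolver and its method are hypothetical and not part of DatasetExplorer or Galatee. The sketch also assumes one image reference per line with no annotations, since the annotation format is not specified here.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not DatasetExplorer's actual code.
final class ImageListResolver {

    // Resolves each non-blank line of an image list against a URI prefix.
    // Lines that are already absolute URIs are returned unchanged.
    // Assumption: one image reference per line, no annotations.
    static List<URI> resolveAll(URI prefix, List<String> lines) {
        List<URI> result = new ArrayList<>();
        for (String line : lines) {
            String entry = line.trim();
            if (entry.isEmpty()) {
                continue; // skip blank lines
            }
            result.add(prefix.resolve(entry));
        }
        return result;
    }

    public static void main(String[] args) {
        // The trailing slash matters: without it, the last path segment
        // of the prefix would be replaced instead of extended.
        URI prefix = URI.create("http://example.org/images/");
        List<String> lines = List.of("cats/001.jpg", "", "http://other.example/dog.jpg");
        for (URI uri : resolveAll(prefix, lines)) {
            System.out.println(uri);
        }
    }
}
```

Entries containing characters that are illegal in a URI (such as spaces) would need to be encoded first; the sketch leaves that out.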
· visualization of a list of images (with associated metadata): each image is referenced by a URI object. The URI scheme can be file for a local image file, or http for an image file accessible via HTTP,
· textual search in the image list (based on the value of the image URI and on the image description) using a Lucene in-memory index,
· downloading, in-memory loading and resizing of an image are performed only when necessary (a kind of "load when you see", i.e. lazy loading),
· multi-threading for downloading, loading and resizing the images,
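The combination of on-demand loading and multi-threading can be sketched as follows. This is only an illustration under stated assumptions: LazyImageCache is a hypothetical class, the loader is a stand-in for the real download, decode and resize work, and Galatee's actual implementation may be organized quite differently.

```java
import java.net.URI;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch; not Galatee's actual classes.
final class LazyImageCache<T> {
    private final ConcurrentHashMap<URI, CompletableFuture<T>> cache = new ConcurrentHashMap<>();
    private final ExecutorService pool;
    private final Function<URI, T> loader;

    LazyImageCache(int threads, Function<URI, T> loader) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.loader = loader;
    }

    // Called when an item becomes visible. Nothing is loaded before this call,
    // and the load is started at most once per URI, on a worker thread.
    CompletableFuture<T> request(URI uri) {
        return cache.computeIfAbsent(uri,
                key -> CompletableFuture.supplyAsync(() -> loader.apply(key), pool));
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        // Stand-in loader: a real one would download, decode and resize the image.
        LazyImageCache<String> cache = new LazyImageCache<>(4, uri -> {
            loads.incrementAndGet();
            return "thumbnail of " + uri;
        });
        URI uri = URI.create("file:/tmp/example.jpg");
        System.out.println(cache.request(uri).join());
        System.out.println(cache.request(uri).join()); // second request reuses the cached result
        System.out.println("loads = " + loads.get());
        cache.shutdown();
    }
}
```

Returning a future lets a GUI thread show a placeholder and update the item once the image is ready, rather than blocking while it loads.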
configuration via the properties files DatasetExplorer.properties and Galatee.properties:
· customize the item visualization (image size, text area size),
· cache directory for downloaded files.
· the Java Plugin Framework is used to provide a plugin-based extension system: you can add your own plugins to DatasetExplorer (for instance the OpenCV Haar Classifier Cascade plugin)
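As an illustration of properties-based configuration such as that mentioned above, the sketch below reads display and cache settings with java.util.Properties. The key names (item.image.size, item.text.height, cache.dir) and the ViewerSettings class are invented for this example; the actual keys used in DatasetExplorer.properties and Galatee.properties are not listed in this description.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: the key names below are assumptions, not the real ones.
final class ViewerSettings {
    final int imageSize;      // thumbnail edge length in pixels
    final int textAreaHeight; // height of the text area under each item
    final String cacheDir;    // where downloaded files are kept

    ViewerSettings(Properties props) {
        // Each setting falls back to a default when the key is absent.
        imageSize = Integer.parseInt(props.getProperty("item.image.size", "128"));
        textAreaHeight = Integer.parseInt(props.getProperty("item.text.height", "48"));
        cacheDir = props.getProperty("cache.dir", System.getProperty("java.io.tmpdir"));
    }

    // Parses settings from text in the standard .properties syntax.
    static ViewerSettings parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new ViewerSettings(props);
    }

    public static void main(String[] args) {
        ViewerSettings s = parse("item.image.size=256\ncache.dir=/var/cache/datasetexplorer\n");
        System.out.println(s.imageSize + " " + s.textAreaHeight + " " + s.cacheDir);
    }
}
```

To read an actual file instead of a string, the same Properties.load call accepts a java.io.Reader opened on that file.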
Requirements:
· Java