biotoolbox is a collection of miscellaneous open-source Perl scripts that use BioPerl modules for bioinformatics analysis.
The tools cover processing of microarray data and next-generation sequencing data, data file format conversion, dataset querying, and general high-level analysis of datasets.
This toolbox of programs relies on storing genome annotation, microarray, and next-generation sequencing data in local BioPerl databases, allowing data retrieval relative to any annotated feature in the database.
While referencing genomic annotation and features from a database is convenient, it is not required. Simple BED-style input files are also supported for data collection.
Also included are programs for converting and importing data from UCSC gene tables and Ensembl, as well as a variety of other formats, into a GFF3 file that can be loaded into a BioPerl database.
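For readers unfamiliar with BED-style input, the following is a minimal sketch of what such a file carries. It is written in Python purely for illustration and is not biotoolbox code; the names BedInterval and parse_bed are hypothetical. It assumes only the standard BED convention: at least three whitespace-separated columns (chromosome, 0-based start, exclusive end), with an optional name in the fourth column.

```python
from typing import Iterable, Iterator, NamedTuple


class BedInterval(NamedTuple):
    """One BED record (hypothetical helper type, not part of biotoolbox)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: str   # empty string when the file has no 4th column


def parse_bed(lines: Iterable[str]) -> Iterator[BedInterval]:
    """Yield intervals from BED-formatted lines, skipping headers and blanks."""
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and UCSC track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError("BED line needs at least 3 columns: " + line)
        name = fields[3] if len(fields) > 3 else ""
        yield BedInterval(fields[0], int(fields[1]), int(fields[2]), name)


if __name__ == "__main__":
    example = ["track name=example", "chr1\t100\t200\tgeneA", "chr2\t5\t50"]
    for interval in parse_bed(example):
        # Interval length is end - start because the end coordinate is exclusive.
        print(interval.chrom, interval.start, interval.end, interval.end - interval.start)
```

Because BED starts are 0-based and ends are exclusive, the interval length is simply end minus start, which is worth keeping in mind when comparing against 1-based GFF3 coordinates.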
What's New in This Release:
· Major improvements to performance of some data collection scripts by adding multi-threaded options. These include get_datasets.pl, get_relative_data.pl, average_gene.pl, and bam2wig.pl. The number of CPU forks may be specified with the --cpu option (default 2). This option requires the installation of Parallel::ForkManager, available through CPAN. Run the check_dependencies.pl script to install it.
· All gzip-compressed reads and writes are now forked through an external gzip utility for a considerable boost in performance (2-5X). The gzip executable must be in your path for this to work (it usually is in most Unix-like environments).
· Added a --long option to average_gene.pl for collecting data from long features.
· Improved efficiency when collecting data from very large windows in both get_relative_data.pl and average_gene.pl.
· Summing the total number of read alignments in Bam files is also multi-threaded. Summing the total number of intervals in a BigBed file is also improved.
· F...
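The idea of routing gzip reads and writes through an external gzip process, mentioned in the release notes above, can be sketched as follows. This is an illustration in Python, not the Perl code used by biotoolbox; the function names write_gzip and read_gzip are hypothetical, and the sketch assumes a gzip executable is available in the path.

```python
import subprocess
from typing import Iterable, Iterator


def write_gzip(path: str, lines: Iterable[str]) -> None:
    """Compress text lines by piping them into an external gzip process."""
    with open(path, "wb") as out:
        # "gzip -c" reads stdin and writes the compressed stream to stdout.
        proc = subprocess.Popen(["gzip", "-c"], stdin=subprocess.PIPE, stdout=out)
        for line in lines:
            proc.stdin.write((line + "\n").encode("utf-8"))
        proc.stdin.close()
        if proc.wait() != 0:
            raise RuntimeError("gzip failed while writing " + path)


def read_gzip(path: str) -> Iterator[str]:
    """Yield decompressed text lines streamed from an external gzip process."""
    # "gzip -dc" decompresses the file to stdout without modifying it.
    proc = subprocess.Popen(["gzip", "-dc", path], stdout=subprocess.PIPE)
    for raw in proc.stdout:
        yield raw.decode("utf-8").rstrip("\n")
    proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError("gzip failed while reading " + path)


if __name__ == "__main__":
    import os
    import tempfile

    target = os.path.join(tempfile.mkdtemp(), "example.bed.gz")
    write_gzip(target, ["chr1\t100\t200", "chr2\t5\t50"])
    for text in read_gzip(target):
        print(text)
```

The design point is that compression runs in a separate process, so it can proceed on another CPU core while the main script parses or writes data. Whether this yields a speedup in a given setup depends on the language runtime and the data.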