PyML Changelog

New in PyML 0.7.13.2 (Sep 21, 2013)

  • Fixed ridgeRegression.RidgeRegression - the regression part wasn't working correctly.
  • Added ; as a possible delimiter for parsing csv files.
  • Added mean absolute deviation (MAD) to regression results statistics
  • Updated the PyVectorDataSet container so that the Standardizer object works correctly
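
As a rough illustration of the MAD statistic mentioned above, the sketch below assumes it means the average absolute difference between predicted and true values. The function name mean_absolute_deviation is hypothetical and is not taken from PyML; the changelog does not say how the library computes or exposes the value.

```python
import numpy as np


def mean_absolute_deviation(y_true, y_pred):
    """Average of |y_pred - y_true| over all examples (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))


# Absolute errors are 0.5, 0.0 and 1.0, so the mean is 0.5.
mad = mean_absolute_deviation([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

Unlike a squared-error statistic, this measure grows linearly with the size of each error, so a single large outlier dominates it less.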

New in PyML 0.7.13.1 (Sep 10, 2013)

  • When constructing PyVectorDataSet from a list/array, it verifies/converts the input to a numpy array
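
A minimal sketch of this kind of verify-or-convert step, written with plain NumPy. The helper name as_example_matrix and the two-dimensional check are assumptions made for illustration; the changelog does not describe how PyVectorDataSet performs the conversion.

```python
import numpy as np


def as_example_matrix(data):
    """Return `data` as a 2-D float array (rows = examples, columns = features).

    Hypothetical helper: lists are converted, while an existing float array
    is passed through by np.asarray without being copied.
    """
    array = np.asarray(data, dtype=float)
    if array.ndim != 2:
        raise ValueError("expected a 2-D collection of examples")
    return array


from_list = as_example_matrix([[1, 2], [3, 4]])   # list input is converted
existing = np.ones((2, 2))
passed_through = as_example_matrix(existing)      # array input is returned as-is
```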

New in PyML 0.7.13 (Sep 9, 2013)

  • Fixed roc.plotROCs, which plots multiple ROC curves given a list or dictionary of results objects
  • The confusion matrix has been transposed: rows now correspond to the true labels and columns to the predicted labels. This change is for consistency with how it's commonly used.
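
The orientation described above can be sketched with plain NumPy. The function confusion_matrix below is a stand-alone illustration, not PyML's implementation, and it assumes class labels are integers starting at 0.

```python
import numpy as np


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true, predicted] += 1
    return matrix


true_labels = [0, 0, 0, 1, 1]
predicted_labels = [0, 1, 1, 1, 1]
cm = confusion_matrix(true_labels, predicted_labels, num_classes=2)
# cm[0, 1] counts examples whose true class is 0 but were predicted as 1.
```

With this layout each row sums to the number of examples that truly belong to that class.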

New in PyML 0.7.12 (Aug 19, 2013)

  • Containers can now be indexed with either integers or lists/numpy arrays: data[i] will return example i, and data[I], where I is a list or array, will create a copy of the data that includes those examples.
  • ridgeRegression.RidgeRegression can now be used for classification and for regression (previously only classification). Training has been sped up as well.
  • Added vectorDatasets.PyVectorDataSet, a container that holds its data in a NumPy array. This is useful for writing classifiers in pure Python.
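
The indexing behaviour described above resembles how a NumPy array itself is indexed, which the sketch below shows on a plain array rather than on a PyML container. Whether PyML containers behave identically in every detail is an assumption.

```python
import numpy as np

# Four examples with three features each.
X = np.arange(12, dtype=float).reshape(4, 3)

example = X[2]        # integer index: the single example at position 2
subset = X[[0, 3]]    # list index: a new array holding examples 0 and 3

# Changing the subset leaves the original data untouched,
# because indexing with a list produces a copy.
subset[0, 0] = -1.0
```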

New in PyML 0.7.9 (Oct 8, 2011)

  • Fixed compilation issue under gcc 4.6.1 (tracker ID: 3361193)
  • Fixed issue with VectorDataSet (tracker ID: 3364921)
  • Updated tutorial

New in PyML 0.7.8 (Jun 24, 2011)

  • Added wrapper for liblinear linear SVMs. If you only need a linear SVM, these solvers offer a very significant speedup for large datasets. Usage: SVM(optimizer = 'liblinear', loss = 'l2') for l2 loss SVM, or SVM(optimizer = 'liblinear', loss = 'l1') for l1 loss SVM.
  • Added containers.setData.SetData - a dataset container where each example is a set of objects.
  • Chris Hiszpanski reported an issue and fix for ROC calculation that would fail for a corner case.
  • When creating a dataset with numeric labels, the 'numericLabels' keyword argument was not passed on to the Labels constructor when creating such a dataset from arrays/lists.
  • The "stratifiedCV" method of SVR was being called by model selection. That was addressed by defining it to be regular cross-validation (stratified CV doesn't make sense for regression). A better solution would be to have a separate base class for regression.
  • An empty SparseDataSet or VectorDataSet can now be created and then populated on the fly with features (using its addFeatures method).

New in PyML 0.7.7 (Apr 7, 2011)

  • The classify method of the KNN classifier was broken (cross-validation worked fine).
  • Fixed issue with the __repr__ of the Results objects that was giving an error when results of an unlabeled dataset were being displayed.

New in PyML 0.7.6 (Feb 15, 2011)

  • Added positional kmer dataset creation (an implementation of Sonnenburg et al.'s weighted degree kernel).
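
To give an idea of what positional k-mer features look like, here is a small pure-Python sketch of a weighted degree style kernel. The function names are hypothetical, and the per-length weights are supplied by the caller because the changelog does not say which weighting PyML uses.

```python
def positional_kmers(sequence, k):
    """Return the set of (position, k-mer) pairs found in `sequence`."""
    return {(i, sequence[i:i + k]) for i in range(len(sequence) - k + 1)}


def weighted_degree_kernel(seq_a, seq_b, weights):
    """Weighted count of positions where both sequences carry the same k-mer.

    `weights` maps each k-mer length k to its weight.
    """
    total = 0.0
    for k, weight in weights.items():
        shared = positional_kmers(seq_a, k) & positional_kmers(seq_b, k)
        total += weight * len(shared)
    return total


# 1-mers agree at 5 positions and 2-mers at 3 positions, giving 8.0.
value = weighted_degree_kernel("ACGTAC", "ACGAAC", {1: 1.0, 2: 1.0})
```

Because each feature records the position as well as the k-mer, a match only counts when it occurs at the same offset in both sequences.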

New in PyML 0.7.5 (Feb 11, 2011)

  • Reworked the SequenceData container
  • Fixed a bug in Labels.oneAgainstRest (thanks to Marcel Luethi for bug report)
  • Fixed a bug in feature selection - feature IDs are now correctly retained

New in PyML 0.7.4.1 (Jun 24, 2010)

  • Compilation error in Kernel.h fixed (shows in gcc 4.3)

New in PyML 0.7.4 (Jun 17, 2010)

  • added k-means clustering (PyML.clusterers.kmeans)
  • corrected handling of nan's
  • fixed an import statement in classifiers/modelSelect.py
  • Compilation error in Kernel.h fixed (shows in gcc 4.3)
  • updated code to new version of numpy

New in PyML 0.7.2 (May 16, 2009)

  • Fixed bug in Aggregate container (in the case of a weighted combination of datasets). Bug report by Eithon Cadag.
  • Stephen Picolo found an issue when using feature selection with the VectorDataSet container. Use SparseDataSet instead.

New in PyML 0.7.1 (Nov 21, 2008)

  • added k-means clustering (PyML.clusterers.kmeans)
  • corrected handling of nan's
  • fixed an import statement in classifiers/modelSelect.py
  • Compilation error in Kernel.h fixed (shows in gcc 4.3)
  • updated code to new version of numpy
  • demo2d.scatter: improved interface to make it more useful

New in PyML 0.7.0 (Jun 12, 2008)

  • a restructuring of the module structure; see the tutorial for details
  • small changes in demo2d
  • linear kernel wasn't normalizing properly
  • myio.myopen now handles bz2 files as well
  • myio.myopen does not open in universal newline support mode by default; it throws the pyml parser off
  • improved the way SequenceData reads fasta files: it can now extract labels using a user-supplied function that extracts the id and label out of the fasta header.
  • added a method for generating spectrum kernels (containers.sequenceData.spectrumData)
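
For readers unfamiliar with spectrum kernels, the idea can be sketched in a few lines of pure Python: each sequence is represented by its k-mer counts, and the kernel value is the dot product of those counts. The function names below are illustrative and do not reflect the interface of containers.sequenceData.spectrumData.

```python
from collections import Counter


def spectrum(sequence, k):
    """Count every k-mer that occurs in `sequence`."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(seq_a, seq_b, k):
    """Dot product of the k-mer count vectors of two sequences."""
    counts_a = spectrum(seq_a, k)
    counts_b = spectrum(seq_b, k)
    # A Counter returns 0 for k-mers it has not seen.
    return sum(count * counts_b[kmer] for kmer, count in counts_a.items())


value = spectrum_kernel("ACGACG", "CGACGA", 2)
```

Unlike positional k-mer features, the spectrum representation ignores where in the sequence each k-mer occurs.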