Sally 0.9.0

A tool for embedding strings
Sally is a simple, easy to use, small and open source tool for mapping a set of strings to a set of vectors.

This mapping is referred to as embedding, and it makes it possible to apply machine learning and data mining techniques to the analysis of string data.

Sally can be applied to several types of string data, such as text documents, DNA sequences or log files, and it handles common input formats such as directories, archives and text files of string data.

Sally implements a standard technique for mapping strings to a vector space, often referred to as the vector space model or bag-of-words model.

The strings are characterized by a set of features, where each feature is associated with one dimension of the vector space.

The following types of features are supported by Sally: bytes, words, n-grams of bytes and n-grams of words.
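
As a rough illustration of the embedding described above, the following Python sketch maps a set of strings to count vectors, with one dimension per observed feature. It is not Sally's implementation or interface: the function names (byte_ngrams, word_ngrams, embed) are hypothetical, and plain occurrence counts are assumed as feature values.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int) -> list:
    """Return all overlapping byte n-grams of `data` (n=1 gives single bytes)."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def word_ngrams(text: str, n: int) -> list:
    """Return all overlapping word n-grams, splitting on whitespace (n=1 gives words)."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def embed(strings, extract):
    """Map each string to a count vector over the features seen in any string.

    `extract` is a function that turns one string into a list of features.
    Returns the sorted feature list (one feature per dimension) and the vectors.
    """
    counts = [Counter(extract(s)) for s in strings]
    vocab = sorted(set().union(*counts))
    vectors = [[c[f] for f in vocab] for c in counts]
    return vocab, vectors


if __name__ == "__main__":
    vocab, vectors = embed([b"abab", b"abc"], lambda s: byte_ngrams(s, 2))
    print(vocab)
    print(vectors)
```

In this sketch each string becomes a dense list for readability; for real data with many distinct n-grams, a sparse representation (for example, keeping only the Counter per string) would be the more practical choice.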


Last updated on: July 4th, 2014, 2:32 GMT
File size: 582 KB
Developed by: Konrad Rieck
Operating system(s): Mac OS X





What's New in This Release:
  • first implementation of blended n-grams
  • updated docs and fixed printing of config
  • some badly needed code beautification
  • use CONFIG_TYPE_BOOL and introduce program switches to replace program arguments that expected 0 or 1 as a value