OBSearch is a distributed similarity search index. Similarity search is needed in many areas: music matching and binary program matching, for example, both depend on a similarity search engine.
Projects such as "Photosynth" that rely heavily on similarity search are now regularly in the news. OBSearch is a similarity search engine that can help you create new and interesting applications!
Examples of things you can do with OBSearch:
· Match programs and help to detect Open Source/Libre license violations.
· Find music that sounds like Sisters of Mercy.
· Match huge vectors of randomly generated integers just for fun.
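To make the last example concrete, here is a minimal sketch of what a similarity query over integer vectors means. It is plain Python and does not use the OBSearch API; the names l1_distance and knn, the L1 (Manhattan) distance, and the data sizes are all assumptions chosen for illustration. It scans every object, which is exactly the cost a similarity search index is meant to avoid.

```python
import heapq
import random


def l1_distance(a, b):
    """L1 (Manhattan) distance between two equal-length integer vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn(query, data, k):
    """Return the k (distance, index) pairs closest to query, nearest first.

    Brute force: every vector in data is compared against the query.
    """
    scored = ((l1_distance(query, v), i) for i, v in enumerate(data))
    return heapq.nsmallest(k, scored)


# Hypothetical data set: 500 random vectors of 8 integers each.
rng = random.Random(42)
data = [[rng.randint(0, 1000) for _ in range(8)] for _ in range(500)]

# Query with a vector that is already in the data set, so the best
# match is at distance 0.
query = data[7]
nearest = knn(query, data, 3)
for distance, index in nearest:
    print(index, distance)
```

An index replaces the full scan with a structure that discards most candidates without computing their distance to the query.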
Here are some key features of "OBSearch":
· Single-computer or distributed mode.
· Designed to handle heavy objects (trees, graphs) efficiently.
· The API is compact and easy to understand.
· Stability and scalability: OBSearch's secondary storage backend is Oracle's Berkeley DB. An extensive test suite makes sure that data integrity is preserved.
· Cutting edge: We strive to put together the latest algorithms the scientific community has to offer. For example, we use a "Mersenne Twister" pseudorandom number generator and the K-means++ algorithm. Even some unpublished work on pivot selection strategies is included!
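For readers unfamiliar with the two algorithms named in the last feature, the sketch below shows the seeding step of K-means++ driven by a Mersenne Twister generator. It is an illustration only, not OBSearch's code: the function names and sample points are assumptions, and it relies on the fact that CPython's random module is itself based on the Mersenne Twister. K-means++ picks the first center uniformly at random, then picks each further center with probability proportional to its squared distance from the nearest center chosen so far.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centers using K-means++ seeding.

    Assumes points holds at least k distinct points; otherwise all
    weights can reach zero and rng.choices raises ValueError.
    """
    # First center: uniform random choice.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        # Sample the next center with probability proportional to d2.
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        # Refresh the nearest-center distances with the new center.
        d2 = [min(d, squared_distance(p, points[idx]))
              for d, p in zip(d2, points)]
    return centers


# Hypothetical sample: three loose groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]

# random.Random is a Mersenne Twister generator; the seed is arbitrary.
rng = random.Random(2024)
seeds = kmeans_pp_seeds(points, 3, rng)
print(seeds)
```

Because an already chosen point has weight zero, it cannot be drawn again, so the seeds tend to spread across the data rather than cluster together.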