March 2nd, 2013
Performance improvements:
· Support for multi-threading added
· Using NIOFSDirectory on all platforms except Windows
· New in-memory backend, faster than Lucene (experimental)
Changes to Comparators:
· Geo-coordinate comparator added.
· Q-grams comparator added.
· Levenshtein implementation is now faster
· Weighted Levenshtein weight estimator is now given the position in the string (issue 81)
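To illustrate what a q-grams comparator measures, here is a minimal sketch. It is not the implementation shipped in this release: the class name QGramSketch, the choice of the overlap coefficient as the scoring formula, and the handling of strings shorter than q are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the comparator from this release.
final class QGramSketch {

    // Collects the distinct substrings of length q.
    // Assumption: a string no longer than q is treated as a single gram.
    static Set<String> qgrams(String s, int q) {
        Set<String> grams = new HashSet<>();
        if (s.length() <= q) {
            grams.add(s);
            return grams;
        }
        for (int i = 0; i + q <= s.length(); i++) {
            grams.add(s.substring(i, i + q));
        }
        return grams;
    }

    // Overlap coefficient: shared grams divided by the size of the smaller gram set.
    static double similarity(String a, String b, int q) {
        if (a.equals(b)) {
            return 1.0;
        }
        Set<String> ga = qgrams(a, q);
        Set<String> gb = qgrams(b, q);
        Set<String> common = new HashSet<>(ga);
        common.retainAll(gb);
        return (double) common.size() / Math.min(ga.size(), gb.size());
    }
}
```

With q = 2, "night" and "nacht" each yield four grams and share only "ht", so the sketch scores them 0.25.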
Changes to Cleaners:
· Added PhoneNumberCleaner
· Extended and generalized regexp cleaner
· Removed sub-cleaner concept, added support for multiple cleaners
Other improvements:
· Implemented user control over lookup props
· Upgraded to Lucene 4.0
· Added MatchListener.startProcessing() callback
· Removed some MatchListener callback methods (weren't thread-safe)
· InMemoryLinkDatabase now complete and tested
· LinkDatabaseMatchListener bug fixes
· Better validation of configurations
· JDBCEquivalenceClassDatabase added
· RDBMSLinkDatabase performance improvement
Changes to command-line client:
· Added data debug mode
· Fixed bug with reusing link file as test file
· Added pretty-printing of records
· Better interactive debugging behaviour
· Improvements to DebugCompare tool
· Added performance profiling to command-line client
Bugs fixed:
· Issue 83: Looking up a record by ID when the ID is a URI.
· Issue 90: Bug in command-line option parser.
· Bug in CSV data source fixed
September 17th, 2012
Changes:
· A change to the calculation of property probabilities when values do not match exactly. This means that you may need to adjust the probabilities and thresholds in your applications.
· Upgraded to Lucene 3.6.1.
· Improvements to NorwegianCompanyNameCleaner and NorwegianAddressCleaner.
New Features:
· A weighted Levenshtein comparator.
· A Metaphone comparator.
· A Jaccard index comparator.
· A prototype of a comparator using a Norwegian version of Metaphone.
· A generic value cleaner.
· Support for setting objects as parameters of other objects.
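As a rough illustration of the Jaccard index listed above, the following sketch compares two values as sets of whitespace-separated tokens. The class name JaccardSketch, the whitespace tokenization, and the exact-match comparison of tokens are assumptions for this example and may differ from the comparator in the release.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the comparator from this release.
final class JaccardSketch {

    // Splits on runs of whitespace and drops empty tokens.
    static Set<String> tokens(String s) {
        Set<String> out = new HashSet<>();
        for (String t : s.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Jaccard index: size of the intersection divided by size of the union.
    static double similarity(String a, String b) {
        Set<String> ta = tokens(a);
        Set<String> tb = tokens(b);
        if (ta.isEmpty() && tb.isEmpty()) {
            return 1.0; // assumption: two empty values count as identical
        }
        Set<String> intersection = new HashSet<>(ta);
        intersection.retainAll(tb);
        Set<String> union = new HashSet<>(ta);
        union.addAll(tb);
        return (double) intersection.size() / union.size();
    }
}
```

For "acme trading as" and "acme trading" the two sets share two tokens out of three distinct ones, so the score is 2/3.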
September 12th, 2011
· Refactored API to make it much more user-friendly.
· Documented the API to the same end.
· New comparators: NumericComparator, DiceCoefficientComparator, SoundexComparator
· A new record linkage mode which can be used to link records from different data sets.
· Numerous bug fixes and new test cases.
· Performance improvements in the Levenshtein comparator.
· Default cleaner now strips accents.
· Upgraded to Lucene 3.3.0.
· Version stamping in manifest file, API, and command-line client.
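For readers unfamiliar with the Levenshtein comparator mentioned in these notes, here is a minimal sketch of the textbook edit-distance computation using two rolling rows. It does not reproduce the optimizations made in any release; the class name LevenshteinSketch and the normalization by the longer string's length are assumptions for the example.

```java
// Hypothetical illustration only; not the comparator from these releases.
final class LevenshteinSketch {

    // Classic dynamic-programming edit distance, keeping only two rows in memory.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // cost of building b's prefix from an empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // cost of deleting a's prefix entirely
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumption: similarity is 1 minus the distance divided by the longer length.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) distance(a, b) / longest;
    }
}
```

Keeping two rows instead of the full matrix reduces memory use from the product of the two lengths to roughly the length of the second string.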