ClearTK 1.4.1

A toolkit for developing statistical natural language processing components in Java
ClearTK is built on the Apache UIMA framework for text analysis.

ClearTK is a project developed at the Center for Computational Language and Education Research (CLEAR) at the University of Colorado at Boulder.

In a nutshell, ClearTK provides a framework for developing statistical natural language processing (NLP) components in Java. It provides two libraries: ClearTK-framework and ClearTK-toolkit.

Main features:

  • A common interface and wrappers for popular machine learning libraries such as SVMlight, LIBSVM, OpenNLP MaxEnt, and Mallet.
  • A rich feature extraction library that can be used with any of the machine learning classifiers. Under the hood, ClearTK knows the input format of each native machine learning library and translates your features into the format that the model you're using expects.
  • Infrastructure for creating NLP components for specific tasks such as part-of-speech tagging, BIO-style chunking, named entity recognition, semantic role labeling, temporal relation tagging, etc.
  • Wrappers for common NLP tools such as the Snowball stemmer, the OpenNLP tools, the MaltParser dependency parser, and the Stanford CoreNLP tools.
  • Corpus readers for collections like the Penn Treebank, ACE 2005, CoNLL 2003, Genia, TimeBank, and TempEval.
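
To make the feature-translation idea above concrete, here is a minimal sketch of how named features might be turned into the sparse line format read by SVMlight and LIBSVM (a target label followed by index:value pairs with increasing indices). This is not ClearTK's API: the class SvmLightEncoder and its encode method are hypothetical names invented for illustration, and the sketch assumes binary classification with numeric feature values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; NOT part of ClearTK.
// Maps named features to the sparse "label index:value ..." line format
// used by SVMlight and LIBSVM.
class SvmLightEncoder {
    // Assigns each feature name a stable 1-based index on first sight.
    private final Map<String, Integer> featureIndex = new LinkedHashMap<>();

    // Encodes one training instance. A positive label becomes "+1",
    // anything else "-1" (assumption: binary classification).
    public String encode(int label, Map<String, Double> features) {
        // TreeMap keeps indices in increasing order, as the format requires.
        TreeMap<Integer, Double> sparse = new TreeMap<>();
        for (Map.Entry<String, Double> e : features.entrySet()) {
            Integer idx = featureIndex.get(e.getKey());
            if (idx == null) {
                idx = featureIndex.size() + 1;
                featureIndex.put(e.getKey(), idx);
            }
            sparse.put(idx, e.getValue());
        }
        StringBuilder line = new StringBuilder(label > 0 ? "+1" : "-1");
        for (Map.Entry<Integer, Double> e : sparse.entrySet()) {
            line.append(' ').append(e.getKey()).append(':').append(e.getValue());
        }
        return line.toString();
    }
}
```

With an encoder like this, the feature extraction code only ever deals with feature names and values; a different encoder could be swapped in for a library that expects another input format.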

Last updated on: June 3rd, 2013, 3:54 GMT
File size: 177.6 MB
Developed by: ClearTK Team
Operating system(s): Mac OS X