GATE (General Architecture for Text Engineering) is a framework, architecture and development environment for developing, evaluating and embedding Human Language Technology.
GATE is an infrastructure for developing and deploying software components that process human language. GATE helps scientists and developers in three ways:
by specifying an architecture, or organisational structure, for language processing;
by providing a framework, or class library, that implements the architecture and can be used to embed language processing capabilities in diverse applications;
by providing a development environment, built on top of the framework, made up of convenient graphical tools for developing components.
GATE as an architecture suggests that the elements of software systems that process natural language can usefully be broken down into various types of component, known as resources.
Components are reusable software chunks with well-defined interfaces, and are a popular architectural form, used in Sun's Java Beans and Microsoft's .Net, for example. GATE components are specialised types of Java Bean, and come in three flavours:
LanguageResources (LRs) represent entities such as lexicons, corpora or ontologies;
ProcessingResources (PRs) represent entities that are primarily algorithmic, such as parsers, generators or n-gram modellers;
VisualResources (VRs) represent visualisation and editing components that participate in GUIs.
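The division into three kinds of resource can be illustrated with a minimal, self-contained Java sketch. Everything in it is an assumption made for illustration: the interfaces below are simplified stand-ins that only borrow the names used above, and the classes Corpus, TokenCounter and CountView are hypothetical. None of this is the actual GATE class library. The sketch only shows how a data-holding component, an algorithmic component and a display component might be kept apart and wired together through bean-style setters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: simplified stand-ins, NOT the real GATE API.
class ResourceSketch {
    public static void main(String[] args) {
        // A language resource holds data.
        Corpus corpus = new Corpus("demo");
        corpus.add("GATE processes human language.");
        corpus.add("Components have well-defined interfaces.");

        // A processing resource runs an algorithm over that data.
        TokenCounter counter = new TokenCounter();
        counter.setCorpus(corpus);
        counter.execute();

        // A visual resource presents the result.
        CountView view = new CountView();
        view.setTarget(counter);
        System.out.println(view.render()); // prints "token-counter: 8 tokens"
    }
}

// Common supertype for all three flavours (assumed for this sketch).
interface Resource { String getName(); }

// Data-like entities: lexicons, corpora, ontologies.
interface LanguageResource extends Resource { }

// Primarily algorithmic entities: parsers, generators, n-gram modellers.
interface ProcessingResource extends Resource { void execute(); }

// Visualisation and editing components.
interface VisualResource extends Resource { String render(); }

class Corpus implements LanguageResource {
    private final String name;
    private final List<String> documents = new ArrayList<>();

    Corpus(String name) { this.name = name; }
    public String getName() { return name; }
    void add(String text) { documents.add(text); }
    List<String> getDocuments() { return documents; }
}

class TokenCounter implements ProcessingResource {
    private Corpus corpus;
    private int tokenCount;

    public String getName() { return "token-counter"; }
    void setCorpus(Corpus corpus) { this.corpus = corpus; } // bean-style setter
    int getTokenCount() { return tokenCount; }

    // Counts whitespace-separated tokens in every non-blank document.
    public void execute() {
        tokenCount = 0;
        for (String doc : corpus.getDocuments()) {
            String trimmed = doc.trim();
            if (!trimmed.isEmpty()) {
                tokenCount += trimmed.split("\\s+").length;
            }
        }
    }
}

class CountView implements VisualResource {
    private TokenCounter target;

    public String getName() { return "count-view"; }
    void setTarget(TokenCounter target) { this.target = target; }

    // A textual rendering stands in for a real GUI component.
    public String render() {
        return target.getName() + ": " + target.getTokenCount() + " tokens";
    }
}
```

In this sketch the processing component knows nothing about how its result is displayed, and the corpus knows nothing about either, which is the kind of separation the component breakdown is meant to encourage.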