LGPL licence.

GATE is used to perform tasks where meaningful text content needs to be detected and encoded in a structured form by adding annotations to text segments. GATE is used along with NLTK, R and RapidMiner.
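The idea of encoding detected content as annotations over text segments can be illustrated with a small stand-off annotation sketch. This is a hypothetical illustration in plain Java, not the GATE API: the names `AnnotationSketch`, `Annotation` and `annotateNumbers` are assumptions made up for this example. Each annotation stores only a type and a pair of character offsets, so the original text is never modified.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of stand-off annotation; this is NOT the GATE API.
class AnnotationSketch {

    // One annotation: a type label plus the character span it covers.
    static final class Annotation {
        final String type;
        final int start; // inclusive character offset
        final int end;   // exclusive character offset

        Annotation(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        // Recover the covered segment from the untouched original text.
        String coveredText(String text) {
            return text.substring(start, end);
        }
    }

    // Toy detector: marks every run of digits as a "Number" annotation.
    static List<Annotation> annotateNumbers(String text) {
        List<Annotation> result = new ArrayList<>();
        Matcher m = Pattern.compile("\\d+").matcher(text);
        while (m.find()) {
            result.add(new Annotation("Number", m.start(), m.end()));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Order 66 shipped 3 items.";
        for (Annotation a : annotateNumbers(text)) {
            System.out.println(a.type + " [" + a.start + "," + a.end + ") "
                    + a.coveredText(text));
        }
    }
}
```

Keeping annotations separate from the text means several independent layers (tokens, entities, sentences) can refer to the same document without interfering with each other.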

The system is used for information extraction, manual and automatic semantic annotation, coreference resolution, working with ontologies (e.g. WordNet), machine learning (Weka, RASP, MAXENT, SVM Light), and blog post flow analysis.

Clients: Twitter.


Digital platforms: Cross-platform software

Versions: Cloud/On-Premise 

Use cases

Some recent projects we worked on:


  • COMRADES:

Collective Platform for Community Resilience and Social Innovation during Crises.

  • KNOWMAK:

Knowledge in the Making in the European Society.

  • SoBigData:

European Research Infrastructure for Social Media Mining and Big Data.

  • RISIS:

Research Infrastructure for Research and Innovation Policy Studies.

  • WeVerify:

Wider and Enhanced Verification for You.

  • European Language Grid


The GATE architecture consists of interconnected components: “pieces” of software with clearly defined interfaces that can be deployed in different contexts.

GATE provides out-of-the-box solutions for tokenization, sentence splitting (the splitter), part-of-speech tagging, named entity extraction, and machine learning.
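To show how two of these stages chain together, here is a deliberately naive sketch of a tokenizer feeding a sentence splitter. It is a toy written in plain Java under stated assumptions: the class `PipelineSketch` and its methods are hypothetical, the rules are simplistic (abbreviations such as "e.g." would be split incorrectly), and GATE's own components are considerably more sophisticated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy pipeline; this does NOT use or reproduce GATE's components.
class PipelineSketch {

    // Tokenizer stage: each word and each single punctuation mark is a token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("\\w+|[^\\w\\s]").matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Splitter stage: close a sentence after each ".", "!" or "?" token.
    static List<List<String>> splitSentences(List<String> tokens) {
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            current.add(token);
            if (token.equals(".") || token.equals("!") || token.equals("?")) {
                sentences.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            sentences.add(current); // trailing text without a terminator
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("GATE reads text. It adds annotations!");
        for (List<String> sentence : splitSentences(tokens)) {
            System.out.println(sentence);
        }
    }
}
```

The splitter consumes the tokenizer's output rather than the raw text, which mirrors the general idea of later stages building on the results of earlier ones.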

The components are divided into three categories by their functions:

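  • Language Resources (LR): data such as documents, corpora, lexicons and ontologies.

  • Processing Resources (PR): programs that process documents, such as tokenizers, taggers and parsers.

  • Visual Resources (VR): graphical components for viewing and editing the other resources.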
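The division into three component categories can be sketched with a few plain Java types. This is only an illustration of the separation of roles, assuming hypothetical names (`ComponentSketch`, `Document`, `WordCounter`, `PlainTextView`); it does not reproduce GATE's actual interfaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the three component roles; NOT GATE's real interfaces.
class ComponentSketch {

    // Language resource: holds data (here, text plus simple annotation labels).
    static final class Document {
        final String text;
        final List<String> annotations = new ArrayList<>();

        Document(String text) {
            this.text = text;
        }
    }

    // Processing resource: an algorithm that reads and enriches a document.
    interface ProcessingResource {
        void execute(Document doc);
    }

    // Visual resource: presents a document to the user.
    interface VisualResource {
        String render(Document doc);
    }

    // Example processing resource: records the whitespace-separated word count.
    static final class WordCounter implements ProcessingResource {
        public void execute(Document doc) {
            String trimmed = doc.text.trim();
            int words = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
            doc.annotations.add("wordCount=" + words);
        }
    }

    // Example visual resource: shows the text followed by its annotations.
    static final class PlainTextView implements VisualResource {
        public String render(Document doc) {
            return doc.text + " " + doc.annotations;
        }
    }

    public static void main(String[] args) {
        Document doc = new Document("three short words");
        new WordCounter().execute(doc);
        System.out.println(new PlainTextView().render(doc));
    }
}
```

Because each role sits behind its own small interface, a processing or visual component can be swapped without touching the data it operates on.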