Released under the MIT license.

spaCy is a library for advanced Natural Language Processing in Python and Cython. It’s built on the very latest research, and was designed from day one to be used in real products.

spaCy comes with pretrained pipelines and currently supports tokenization and training for 60+ languages. It features state-of-the-art speed and neural network models for tagging, parsing, named entity recognition, text classification and more. It also offers multi-task learning with pretrained transformers like BERT, a production-ready training system, and easy model packaging, deployment and workflow management.
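As a rough illustration of the language support described above, the sketch below tokenizes text in two languages. It assumes spaCy v3 is installed and uses blank pipelines (tokenizer only), so no trained pipeline needs to be downloaded; the sample sentences are made up for the example.

```python
import spacy

# Blank pipelines contain only the language-specific tokenizer,
# so this sketch runs without downloading a trained pipeline.
nlp_en = spacy.blank("en")
nlp_de = spacy.blank("de")

# Illustrative sample sentences (not from the spaCy documentation).
doc_en = nlp_en("Hello, world!")
doc_de = nlp_de("Das ist ein Satz.")

tokens_en = [token.text for token in doc_en]
tokens_de = [token.text for token in doc_de]

print(tokens_en)
print(tokens_de)
```

Trained pipelines add components such as a tagger, parser and named entity recognizer on top of this tokenizer; they are installed separately and loaded with `spacy.load`.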


Digital platforms: Linux, Windows, macOS (OS X)

Versions: Cloud/On-Premise 

Use cases

ADAM: Question Answering System

A question answering system that extracts answers from Wikipedia to questions posed in natural language.


Features

Support for 60+ languages.

Trained pipelines for different languages and tasks.

Multi-task learning with pretrained transformers like BERT.

Support for pretrained word vectors and embeddings.

State-of-the-art speed.

Production-ready training system.

Linguistically motivated tokenization.

Components for named entity recognition, part-of-speech tagging, dependency parsing, sentence segmentation, text classification, lemmatization, morphological analysis, entity linking and more.

Easily extensible with custom components and attributes.

Support for custom models in PyTorch, TensorFlow and other frameworks.

Built-in visualizers for syntax and NER.

Easy model packaging, deployment and workflow management.

Robust, rigorously evaluated accuracy.
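
To make the points about pipeline components and extensibility more concrete, here is a minimal sketch, assuming spaCy v3. It combines the built-in rule-based `sentencizer` with a custom component and a custom document attribute. The names `word_counter` and `n_words` are hypothetical and chosen only for this example.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc

# Hypothetical custom attribute: the number of alphabetic tokens in a document.
if not Doc.has_extension("n_words"):
    Doc.set_extension("n_words", default=0)


@Language.component("word_counter")  # hypothetical component name
def word_counter(doc):
    # Count alphabetic tokens and store the result on the custom attribute.
    doc._.n_words = sum(1 for token in doc if token.is_alpha)
    return doc


nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")   # built-in rule-based sentence segmentation
nlp.add_pipe("word_counter")  # the custom component registered above

doc = nlp("spaCy is fast. It is also extensible.")
sentences = [sent.text for sent in doc.sents]

print(nlp.pipe_names)
print(sentences)
print(doc._.n_words)
```

Components run in the order they are added, so the custom component sees the document after sentence boundaries have been set.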