flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Top Related Projects

transformers

150,567

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

spaCy

32,582

💫 Industrial-strength Natural Language Processing (NLP) in Python

stanza

7,609

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

allennlp

11,862

An open-source NLP research library, built on PyTorch.

Quick Overview

Flair is a powerful NLP library built on PyTorch. It offers state-of-the-art performance for various natural language processing tasks, including named entity recognition, part-of-speech tagging, and text classification. Flair is designed to be intuitive and easy to use, making it accessible for both researchers and practitioners.

Pros

Simple and intuitive API for various NLP tasks
State-of-the-art performance on many benchmarks
Supports multiple languages and pre-trained models
Easy integration with other popular NLP libraries

Cons

Can be slower than some other NLP libraries
Requires more memory for some tasks due to its use of contextual string embeddings
Limited documentation for advanced use cases
Smaller community compared to more established NLP libraries

Code Examples

Named Entity Recognition:

from flair.data import Sentence
from flair.models import SequenceTagger

# Load pre-trained model
tagger = SequenceTagger.load('ner')

# Create a sentence
sentence = Sentence('George Washington went to Washington.')

# Predict NER tags
tagger.predict(sentence)

# Print the tagged sentence
print(sentence.to_tagged_string())

Text Classification:

from flair.data import Sentence
from flair.models import TextClassifier

# Load pre-trained sentiment model
classifier = TextClassifier.load('en-sentiment')

# Create a sentence
sentence = Sentence('I love this movie!')

# Predict sentiment
classifier.predict(sentence)

# Print the predicted label and score
print(f'Sentiment: {sentence.labels[0].value} ({sentence.labels[0].score:.4f})')

Part-of-Speech Tagging:

from flair.data import Sentence
from flair.models import SequenceTagger

# Load pre-trained POS tagger
tagger = SequenceTagger.load('pos')

# Create a sentence
sentence = Sentence('The quick brown fox jumps over the lazy dog.')

# Predict POS tags
tagger.predict(sentence)

# Print the tagged sentence
print(sentence.to_tagged_string())

Getting Started

To get started with Flair, follow these steps:

Install Flair using pip:

pip install flair

Import the necessary modules and create a Sentence object:

from flair.data import Sentence
from flair.models import SequenceTagger

sentence = Sentence('Your text here')

Load a pre-trained model and make predictions:

tagger = SequenceTagger.load('ner')
tagger.predict(sentence)

Access the predictions:

for entity in sentence.get_spans('ner'):
    print(f'{entity.text} ({entity.tag})')
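
The loop above can be wrapped in a small helper that returns plain tuples, which makes it easy to drop low-confidence predictions. This is a minimal sketch, not part of Flair: `extract_entities` and `min_score` are hypothetical names, and it assumes each span exposes `.score` alongside the `.text` and `.tag` attributes used above. A stand-in object with invented scores replaces the predicted sentence so the snippet runs without downloading a model.

```python
from types import SimpleNamespace


def extract_entities(sentence, label_type="ner", min_score=0.0):
    """Return (text, tag, score) tuples for spans at or above min_score."""
    return [
        (span.text, span.tag, span.score)
        for span in sentence.get_spans(label_type)
        if span.score >= min_score
    ]


# Stand-in for a predicted Sentence; the scores are invented for illustration.
fake_sentence = SimpleNamespace(
    get_spans=lambda label_type: [
        SimpleNamespace(text="George Washington", tag="PER", score=0.98),
        SimpleNamespace(text="Washington", tag="LOC", score=0.42),
    ]
)

confident = extract_entities(fake_sentence, min_score=0.5)
print(confident)
```

With a real model, the same function would be called on the `sentence` object after `tagger.predict(sentence)`.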

Competitor Comparisons

transformers

150,567

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Pros of Transformers

Extensive model support: Offers a wide range of pre-trained models and architectures
Active community and frequent updates: Regular releases with new features and improvements
Seamless integration with PyTorch and TensorFlow: Supports multiple deep learning frameworks

Cons of Transformers

Steeper learning curve: More complex API and configuration options
Higher computational requirements: Many models are large and resource-intensive
Less focus on traditional NLP tasks: Primarily centered around transformer-based models

Code Comparison

Flair:

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load('ner')
sentence = Sentence('John Doe works at Microsoft.')
tagger.predict(sentence)

Transformers:

from transformers import pipeline

nlp = pipeline("ner")
text = "John Doe works at Microsoft."
result = nlp(text)

Both libraries offer powerful NLP capabilities, but Flair focuses on simplicity and traditional NLP tasks, while Transformers provides a broader range of state-of-the-art models and architectures. Choose based on your specific needs and expertise level.
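
The Transformers snippet stops at `result`, which for a token-classification pipeline is typically a list of dictionaries with keys such as `entity`, `score`, `word`, `start` and `end`. The sketch below reduces such a list to compact tuples. `summarize_token_predictions` is a hypothetical helper, and the sample data is hand-written with invented scores; the actual label names depend on the model the pipeline loads.

```python
def summarize_token_predictions(results, min_score=0.0):
    """Reduce pipeline-style dicts to (word, entity, score) tuples."""
    return [
        (item["word"], item["entity"], round(float(item["score"]), 4))
        for item in results
        if item["score"] >= min_score
    ]


# Hand-written stand-in for `result`; all values are invented for illustration.
sample_result = [
    {"entity": "B-PER", "score": 0.9991, "index": 1, "word": "John", "start": 0, "end": 4},
    {"entity": "I-PER", "score": 0.9985, "index": 2, "word": "Doe", "start": 5, "end": 8},
    {"entity": "B-ORG", "score": 0.9978, "index": 5, "word": "Microsoft", "start": 18, "end": 27},
]

summary = summarize_token_predictions(sample_result)
print(summary)
```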

spaCy

32,582

💫 Industrial-strength Natural Language Processing (NLP) in Python

Pros of spaCy

Faster processing speed and better performance for large-scale text processing
More comprehensive out-of-the-box features, including pre-trained models for various languages
Extensive documentation and community support

Cons of spaCy

Less flexibility in model customization compared to Flair
Steeper learning curve for beginners due to its more complex architecture
Larger memory footprint, which may be a concern for resource-constrained environments

Code Comparison

spaCy:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

for ent in doc.ents:
    print(ent.text, ent.label_)

Flair:

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("ner")
sentence = Sentence("Apple is looking at buying U.K. startup for $1 billion")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity.text, entity.tag)

Both libraries offer powerful NLP capabilities, with spaCy focusing on speed and out-of-the-box features, while Flair provides more flexibility in model customization and easier integration with PyTorch.

nltk

14,217

NLTK Source

Pros of NLTK

Comprehensive library with a wide range of NLP tools and resources
Extensive documentation and educational materials
Large and active community support

Cons of NLTK

Slower performance compared to more modern libraries
Less focus on deep learning and state-of-the-art models

Code Comparison

NLTK:

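# word_tokenize and pos_tag rely on NLTK data packages, which must be
# fetched once with nltk.download(); the package names vary by NLTK version.
import nltk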
from nltk import word_tokenize, pos_tag

text = "NLTK is a leading platform for building Python programs to work with human language data."
tokens = word_tokenize(text)
pos_tags = pos_tag(tokens)

Flair:

from flair.data import Sentence
from flair.models import SequenceTagger

sentence = Sentence("Flair is a powerful NLP library.")
tagger = SequenceTagger.load('pos')
tagger.predict(sentence)

Summary

NLTK is a well-established, comprehensive NLP library with extensive documentation and community support. It offers a wide range of traditional NLP tools and resources. However, it may have slower performance and less focus on modern deep learning techniques compared to Flair.

Flair, on the other hand, is a more recent library that emphasizes state-of-the-art models and deep learning approaches. It offers simpler APIs and potentially better performance for certain tasks, but may have a smaller community and fewer traditional NLP tools compared to NLTK.

stanza

7,609

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Pros of Stanza

Supports a wider range of languages (over 60) compared to Flair
Offers more comprehensive linguistic annotations, including dependency parsing and named entity recognition
Provides pre-trained models for many languages, reducing the need for custom training

Cons of Stanza

Generally slower processing speed than Flair
Requires more memory and computational resources
Less flexible for custom task-specific models compared to Flair's modular approach

Code Comparison

Stanza:

import stanza
nlp = stanza.Pipeline('en')
doc = nlp("Hello world!")
print([(word.text, word.upos) for sent in doc.sentences for word in sent.words])

Flair:

from flair.data import Sentence
from flair.models import SequenceTagger
tagger = SequenceTagger.load('pos')
sentence = Sentence("Hello world!")
tagger.predict(sentence)
print([(token.text, token.tag) for token in sentence])

Both libraries offer straightforward APIs for common NLP tasks, but Stanza provides more comprehensive linguistic annotations out-of-the-box, while Flair focuses on flexibility and ease of use for specific tasks like part-of-speech tagging and named entity recognition.

gensim

16,122

Topic Modelling for Humans

Pros of Gensim

More extensive topic modeling capabilities, including LDA and LSI
Better suited for large-scale text processing and document similarity tasks
Offers efficient streaming corpus processing for handling large datasets

Cons of Gensim

Less focused on deep learning-based NLP tasks
Fewer built-in options for named entity recognition and part-of-speech tagging
Limited support for sequence labeling tasks

Code Comparison

Gensim (topic modeling):

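from gensim import corpora, models

# texts is a list of tokenized documents (tiny illustrative corpus)
texts = [["human", "computer", "interaction"], ["graph", "trees", "minors"]]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
lda_model = models.LdaMulticore(corpus=corpus, id2word=dictionary, num_topics=10)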

Flair (named entity recognition):

from flair.data import Sentence
from flair.models import SequenceTagger
tagger = SequenceTagger.load('ner')
sentence = Sentence('John Doe works at Microsoft.')
tagger.predict(sentence)

Both libraries offer powerful NLP capabilities, but they focus on different aspects. Gensim excels in topic modeling and document similarity, while Flair specializes in deep learning-based NLP tasks like named entity recognition and sequence labeling.
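
For readers unfamiliar with the bag-of-words corpus that the Gensim snippet builds, the following standard-library sketch shows the idea: each document becomes a sorted list of (token_id, count) pairs. It is a conceptual illustration only. `build_vocab` and `to_bow` are hypothetical names, and Gensim's own id assignment may differ from the first-seen order used here.

```python
from collections import Counter


def build_vocab(texts):
    """Assign an integer id to each distinct token, in first-seen order."""
    vocab = {}
    for doc in texts:
        for token in doc:
            vocab.setdefault(token, len(vocab))
    return vocab


def to_bow(doc, vocab):
    """Return sorted (token_id, count) pairs for tokens present in vocab."""
    counts = Counter(vocab[token] for token in doc if token in vocab)
    return sorted(counts.items())


texts = [["topic", "model", "topic"], ["entity", "model"]]
vocab = build_vocab(texts)
corpus = [to_bow(doc, vocab) for doc in texts]
print(corpus)
```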

allennlp

11,862

An open-source NLP research library, built on PyTorch.

Pros of AllenNLP

More comprehensive and feature-rich, offering a wider range of NLP tasks and models
Better documentation and tutorials, making it easier for beginners to get started
Stronger integration with PyTorch and support for more advanced deep learning techniques

Cons of AllenNLP

Steeper learning curve due to its complexity and extensive feature set
Heavier and more resource-intensive, which may not be ideal for smaller projects
Less focus on simplicity and ease of use for quick prototyping

Code Comparison

AllenNLP:

from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz")
result = predictor.predict(sentence="The cat sat on the mat.")

Flair:

from flair.models import SequenceTagger
from flair.data import Sentence

tagger = SequenceTagger.load('ner')
sentence = Sentence('The cat sat on the mat.')
tagger.predict(sentence)

Both repositories are popular NLP frameworks, but they cater to different needs. AllenNLP is more comprehensive and suitable for complex research projects, while Flair focuses on simplicity and ease of use for quick prototyping and smaller-scale applications.

README

A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends.

Flair is:

A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), sentiment analysis, part-of-speech tagging (PoS), special support for biomedical texts, sense disambiguation and classification, with support for a rapidly growing number of languages.
A text embedding library. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including our proposed Flair embeddings and various transformers.
A PyTorch NLP framework. Our framework builds directly on PyTorch, making it easy to train your own models and experiment with new approaches using Flair embeddings and classes.
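
One common way to combine embeddings is to concatenate the vectors each embedding produces for a token. The sketch below illustrates that idea with plain Python lists. It is not Flair's implementation: `stack_embeddings` is a hypothetical name and the vectors are made-up numbers.

```python
def stack_embeddings(*vectors):
    """Concatenate several per-token vectors into one longer vector."""
    stacked = []
    for vector in vectors:
        stacked.extend(vector)
    return stacked


# Made-up vectors standing in for two different embeddings of the same token.
word_vector = [0.1, 0.2, 0.3]
char_vector = [0.5, 0.6]

combined = stack_embeddings(word_vector, char_vector)
print(len(combined), combined)
```

The combined vector's length is the sum of the input lengths, which is why stacking several large embeddings increases memory use.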

Now at version 0.15.1!

State-of-the-Art Models

Flair ships with state-of-the-art models for a range of NLP tasks. For instance, check out our latest NER models:

Language	Dataset	Flair	Best published	Model card & demo
English	Conll-03 (4-class)	94.09	94.3 (Yamada et al., 2020)	Flair English 4-class NER demo
English	Ontonotes (18-class)	90.93	91.3 (Yu et al., 2020)	Flair English 18-class NER demo
German	Conll-03 (4-class)	92.31	90.3 (Yu et al., 2020)	Flair German 4-class NER demo
Dutch	Conll-03 (4-class)	95.25	93.7 (Yu et al., 2020)	Flair Dutch 4-class NER demo
Spanish	Conll-03 (4-class)	90.54	90.3 (Yu et al., 2020)	Flair Spanish 4-class NER demo
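
To read the table at a glance, the gap between each Flair score and the best published score can be computed directly. The snippet below copies the numbers from the table and is only a reading aid; the variable names are illustrative.

```python
# (language, dataset, flair_score, best_published_score), copied from the table above
results = [
    ("English", "Conll-03 (4-class)", 94.09, 94.3),
    ("English", "Ontonotes (18-class)", 90.93, 91.3),
    ("German", "Conll-03 (4-class)", 92.31, 90.3),
    ("Dutch", "Conll-03 (4-class)", 95.25, 93.7),
    ("Spanish", "Conll-03 (4-class)", 90.54, 90.3),
]

# Positive values mean the Flair model scores above the best published result.
deltas = {
    (language, dataset): round(flair - best, 2)
    for language, dataset, flair, best in results
}

for (language, dataset), delta in deltas.items():
    print(f"{language:8} {dataset:22} {delta:+.2f}")
```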

Many Flair sequence tagging models (named entity recognition, part-of-speech tagging etc.) are also hosted on the 🤗 Hugging Face model hub! You can browse models, check detailed information on how they were trained, and even try each model out online!

Quick Start

Requirements and Installation

In your favorite virtual environment, simply do:

pip install flair

Flair requires Python 3.9+.
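
If you are unsure whether your environment meets this requirement, a quick check with the standard library is enough. This is a minimal sketch; `python_is_supported` and `MIN_PYTHON` are illustrative names, not part of Flair.

```python
import sys

MIN_PYTHON = (3, 9)  # minimum version stated above


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


print(python_is_supported())
```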

Example 1: Tag Entities in Text

Let's run named entity recognition (NER) over an example sentence. All you need to do is make a Sentence, load a pre-trained model and use it to predict tags for the sentence:

from flair.data import Sentence
from flair.nn import Classifier

# make a sentence
sentence = Sentence('I love Berlin .')

# load the NER tagger
tagger = Classifier.load('ner')

# run NER over sentence
tagger.predict(sentence)

# print the sentence with all annotations
print(sentence)

This should print:

Sentence: "I love Berlin ." → ["Berlin"/LOC]

This means that "Berlin" was tagged as a location entity in this sentence.
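
The bracketed part of the printed line pairs each entity text with its label. If you only have this printed form (for example in a log), the pairs can be recovered with a regular expression. This is a sketch under the assumption that annotations always appear as "text"/LABEL inside the final square brackets and are separated by commas; `parse_annotations` is a hypothetical helper. When you have the sentence object itself, reading its spans directly is the more robust route.

```python
import re

# Matches "text"/LABEL pairs such as "Berlin"/LOC
ANNOTATION = re.compile(r'"([^"]+)"/([A-Za-z_-]+)')


def parse_annotations(printed):
    """Extract (text, label) pairs from the last bracketed group of a printed sentence."""
    bracket = printed.rfind("[")
    if bracket == -1:
        return []
    return ANNOTATION.findall(printed[bracket:])


line = 'Sentence: "I love Berlin ." → ["Berlin"/LOC]'
print(parse_annotations(line))
```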

To learn more about NER tagging in Flair, check out our NER tutorial!

Example 2: Detect Sentiment

Let's run sentiment analysis over an example sentence to determine whether it is POSITIVE or NEGATIVE. Same code as above, just a different model:

from flair.data import Sentence
from flair.nn import Classifier

# make a sentence
sentence = Sentence('I love Berlin .')

# load the sentiment classifier
tagger = Classifier.load('sentiment')

# run sentiment analysis over sentence
tagger.predict(sentence)

# print the sentence with all annotations
print(sentence)

This should print:

Sentence[4]: "I love Berlin ." → POSITIVE (0.9983)

This means that the sentence "I love Berlin" was tagged as having POSITIVE sentiment.
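
When scoring many sentences, it can help to fold the label and its confidence into a single signed number so results can be averaged or sorted. The sketch below assumes only the two labels named above, POSITIVE and NEGATIVE; `signed_sentiment` is a hypothetical helper, and the second prediction is invented for illustration.

```python
def signed_sentiment(label, score):
    """Map a (label, confidence) pair to a value between -1 and 1."""
    if label == "POSITIVE":
        return score
    if label == "NEGATIVE":
        return -score
    raise ValueError(f"unexpected label: {label!r}")


# The first pair is the prediction shown above; the second is invented.
predictions = [("POSITIVE", 0.9983), ("NEGATIVE", 0.75)]

average = sum(signed_sentiment(label, score) for label, score in predictions) / len(predictions)
print(round(average, 4))
```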

To learn more about sentiment analysis in Flair, check out our sentiment analysis tutorial!

Tutorials

On our new Flair documentation page you will find many tutorials to get you started!

In particular:

Tutorial 1: Basic tagging → how to tag your text
Tutorial 2: Training models → how to train your own state-of-the-art NLP models
Tutorial 3: Embeddings → how to produce embeddings for words and documents
Tutorial 4: Biomedical text → how to analyse biomedical text data

There is also a dedicated landing page for our biomedical NER and datasets with installation instructions and tutorials.

Citing Flair

Please cite the following paper when using Flair embeddings:

@inproceedings{akbik2018coling,
  title={Contextual String Embeddings for Sequence Labeling},
  author={Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
  booktitle = {{COLING} 2018, 27th International Conference on Computational Linguistics},
  pages     = {1638--1649},
  year      = {2018}
}

If you use the Flair framework for your experiments, please cite this paper:

@inproceedings{akbik2019flair,
  title={{FLAIR}: An easy-to-use framework for state-of-the-art {NLP}},
  author={Akbik, Alan and Bergmann, Tanja and Blythe, Duncan and Rasul, Kashif and Schweter, Stefan and Vollgraf, Roland},
  booktitle={{NAACL} 2019, 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)},
  pages={54--59},
  year={2019}
}

If you use our new "FLERT" models or approach, please cite this paper:

@misc{schweter2020flert,
    title={{FLERT}: Document-Level Features for Named Entity Recognition},
    author={Stefan Schweter and Alan Akbik},
    year={2020},
    eprint={2011.06993},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

If you use our TARS approach for few-shot and zero-shot learning, please cite this paper:

@inproceedings{halder2020coling,
  title={Task Aware Representation of Sentences for Generic Text Classification},
  author={Halder, Kishaloy and Akbik, Alan and Krapac, Josip and Vollgraf, Roland},
  booktitle = {{COLING} 2020, 28th International Conference on Computational Linguistics},
  year      = {2020}
}

Contact

Please email your questions or comments to Alan Akbik.

Contributing

Thanks for your interest in contributing! There are many ways to get involved; start with our contributor guidelines and then check these open issues for specific tasks.

License

The MIT License (MIT)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
