flairNLP/flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Quick Overview

Flair is a powerful NLP library built on PyTorch. It offers state-of-the-art performance for various natural language processing tasks, including named entity recognition, part-of-speech tagging, and text classification. Flair is designed to be intuitive and easy to use, making it accessible for both researchers and practitioners.

Pros

  • Simple and intuitive API for various NLP tasks
  • State-of-the-art performance on many benchmarks
  • Supports multiple languages and pre-trained models
  • Easy integration with other popular NLP libraries

Cons

  • Can be slower than some other NLP libraries
  • Requires more memory for some tasks because it uses contextual string embeddings
  • Limited documentation for advanced use cases
  • Smaller community than more established NLP libraries

Code Examples

  1. Named Entity Recognition:
from flair.data import Sentence
from flair.models import SequenceTagger

# Load pre-trained model
tagger = SequenceTagger.load('ner')

# Create a sentence
sentence = Sentence('George Washington went to Washington.')

# Predict NER tags
tagger.predict(sentence)

# Print the tagged sentence
print(sentence.to_tagged_string())
  2. Text Classification:
from flair.data import Sentence
from flair.models import TextClassifier

# Load pre-trained sentiment model
classifier = TextClassifier.load('en-sentiment')

# Create a sentence
sentence = Sentence('I love this movie!')

# Predict sentiment
classifier.predict(sentence)

# Print the predicted label and score
print(f'Sentiment: {sentence.labels[0].value} ({sentence.labels[0].score:.4f})')
  3. Part-of-Speech Tagging:
from flair.data import Sentence
from flair.models import SequenceTagger

# Load pre-trained POS tagger
tagger = SequenceTagger.load('pos')

# Create a sentence
sentence = Sentence('The quick brown fox jumps over the lazy dog.')

# Predict POS tags
tagger.predict(sentence)

# Print the tagged sentence
print(sentence.to_tagged_string())
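
Once a sentence has been tagged, the per-token labels can be aggregated with ordinary Python. The following is a minimal sketch rather than part of Flair: `count_tags` is a hypothetical helper, and it assumes that iterating over the sentence yields tokens and that each token offers `get_label(label_type)` returning an object with a `value` attribute.

```python
from collections import Counter


def count_tags(sentence, label_type="pos"):
    """Count how often each predicted tag occurs in a tagged sentence.

    Hypothetical helper: `sentence` is assumed to be iterable over tokens,
    and each token is assumed to expose get_label(label_type).value.
    """
    return Counter(token.get_label(label_type).value for token in sentence)
```

After `tagger.predict(sentence)`, something like `count_tags(sentence).most_common(3)` would then list the three most frequent tags, assuming the tagger writes its predictions under the `'pos'` label type.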

Getting Started

To get started with Flair, follow these steps:

  1. Install Flair using pip:
pip install flair
  2. Import the necessary modules and create a Sentence object:
from flair.data import Sentence
from flair.models import SequenceTagger

sentence = Sentence('Your text here')
  3. Load a pre-trained model and make predictions:
tagger = SequenceTagger.load('ner')
tagger.predict(sentence)
  4. Access the predictions:
for entity in sentence.get_spans('ner'):
    print(f'{entity.text} ({entity.tag})')
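
For downstream processing it is often handier to collect the predictions into plain dictionaries than to print them. This is a sketch under assumptions: `spans_to_records` is a hypothetical helper, and it relies on each span offering `text` and `get_label(label_type)` with `value` and `score` attributes.

```python
def spans_to_records(sentence, label_type="ner"):
    """Return one dict per predicted span with its text, tag and confidence.

    Hypothetical helper: assumes sentence.get_spans(label_type) yields spans
    that expose .text and .get_label(label_type) with .value and .score.
    """
    records = []
    for span in sentence.get_spans(label_type):
        label = span.get_label(label_type)
        records.append(
            {"text": span.text, "tag": label.value, "score": label.score}
        )
    return records
```

The resulting list can be passed to `json.dumps` or loaded into a data frame without any further dependency on the tagging library.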

Competitor Comparisons

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Pros of Transformers

  • Extensive model support: Offers a wide range of pre-trained models and architectures
  • Active community and frequent updates: Regular releases with new features and improvements
  • Seamless integration with PyTorch and TensorFlow: Supports multiple deep learning frameworks

Cons of Transformers

  • Steeper learning curve: More complex API and configuration options
  • Higher computational requirements: Many models are large and resource-intensive
  • Less focus on traditional NLP tasks: Primarily centered around transformer-based models

Code Comparison

Flair:

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load('ner')
sentence = Sentence('John Doe works at Microsoft.')
tagger.predict(sentence)

Transformers:

from transformers import pipeline

nlp = pipeline("ner")
text = "John Doe works at Microsoft."
result = nlp(text)

Both libraries offer powerful NLP capabilities, but Flair focuses on simplicity and traditional NLP tasks, while Transformers provides a broader range of state-of-the-art models and architectures. Choose based on your specific needs and expertise level.
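
To compare the two snippets side by side, their results can be reduced to a common shape. The sketch below is illustrative only: both helper names are hypothetical, and the dictionary keys `word`, `entity` and `entity_group` are assumptions about what the Transformers NER pipeline returns, which should be checked against the installed version.

```python
def normalize_hf_entities(results):
    """Map NER pipeline dicts to (text, label) pairs.

    Assumes each dict has a "word" key and either "entity_group"
    (aggregated output) or "entity" (per-token output).
    """
    pairs = []
    for item in results:
        label = item.get("entity_group") or item.get("entity")
        pairs.append((item["word"], label))
    return pairs


def normalize_flair_spans(spans):
    """Map span objects exposing .text and .tag to (text, label) pairs."""
    return [(span.text, span.tag) for span in spans]
```

With both outputs in the same form, `set(a) ^ set(b)` gives a quick view of where the two predictions differ. Label names may still differ between models (for example `PER` versus `B-PER`), so some mapping is usually needed first.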

💫 Industrial-strength Natural Language Processing (NLP) in Python

Pros of spaCy

  • Faster processing speed and better performance for large-scale text processing
  • More comprehensive out-of-the-box features, including pre-trained models for various languages
  • Extensive documentation and community support

Cons of spaCy

  • Less flexibility in model customization compared to Flair
  • Steeper learning curve for beginners due to its more complex architecture
  • Larger memory footprint, which may be a concern for resource-constrained environments

Code Comparison

spaCy:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

for ent in doc.ents:
    print(ent.text, ent.label_)

Flair:

from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("ner")
sentence = Sentence("Apple is looking at buying U.K. startup for $1 billion")
tagger.predict(sentence)

for entity in sentence.get_spans("ner"):
    print(entity.text, entity.tag)

Both libraries offer powerful NLP capabilities, with spaCy focusing on speed and out-of-the-box features, while Flair provides more flexibility in model customization and easier integration with PyTorch.

NLTK Source

Pros of NLTK

  • Comprehensive library with a wide range of NLP tools and resources
  • Extensive documentation and educational materials
  • Large and active community support

Cons of NLTK

  • Slower performance compared to more modern libraries
  • Less focus on deep learning and state-of-the-art models

Code Comparison

NLTK:

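import nltk
from nltk import word_tokenize, pos_tag

# Note: word_tokenize and pos_tag need NLTK data packages,
# which can be fetched once with nltk.download()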

text = "NLTK is a leading platform for building Python programs to work with human language data."
tokens = word_tokenize(text)
pos_tags = pos_tag(tokens)

Flair:

from flair.data import Sentence
from flair.models import SequenceTagger

sentence = Sentence("Flair is a powerful NLP library.")
tagger = SequenceTagger.load('pos')
tagger.predict(sentence)

Summary

NLTK is a well-established, comprehensive NLP library with extensive documentation and community support. It offers a wide range of traditional NLP tools and resources. However, it may have slower performance and less focus on modern deep learning techniques compared to Flair.

Flair, on the other hand, is a more recent library that emphasizes state-of-the-art models and deep learning approaches. It offers simpler APIs and potentially better performance for certain tasks, but may have a smaller community and fewer traditional NLP tools compared to NLTK.

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Pros of Stanza

  • Supports a wider range of languages (over 60) compared to Flair
  • Offers more comprehensive linguistic annotations, including dependency parsing and named entity recognition
  • Provides pre-trained models for many languages, reducing the need for custom training

Cons of Stanza

  • Generally slower processing speed than Flair
  • Requires more memory and computational resources
  • Less flexible for custom task-specific models compared to Flair's modular approach

Code Comparison

Stanza:

import stanza
nlp = stanza.Pipeline('en')
doc = nlp("Hello world!")
print([(word.text, word.upos) for sent in doc.sentences for word in sent.words])

Flair:

from flair.data import Sentence
from flair.models import SequenceTagger
tagger = SequenceTagger.load('pos')
sentence = Sentence("Hello world!")
tagger.predict(sentence)
print([(token.text, token.tag) for token in sentence])

Both libraries offer straightforward APIs for common NLP tasks, but Stanza provides more comprehensive linguistic annotations out-of-the-box, while Flair focuses on flexibility and ease of use for specific tasks like part-of-speech tagging and named entity recognition.

Topic Modelling for Humans

Pros of Gensim

  • More extensive topic modeling capabilities, including LDA and LSI
  • Better suited for large-scale text processing and document similarity tasks
  • Offers efficient streaming corpus processing for handling large datasets

Cons of Gensim

  • Less focused on deep learning-based NLP tasks
  • Fewer built-in options for named entity recognition and part-of-speech tagging
  • Limited support for sequence labeling tasks

Code Comparison

Gensim (topic modeling):

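from gensim import corpora, models

# texts: a list of tokenized documents (toy placeholder data)
texts = [["human", "computer", "interface"], ["graph", "trees", "minors"]]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
# id2word lets the model report topics as words instead of integer ids
lda_model = models.LdaMulticore(corpus=corpus, id2word=dictionary, num_topics=10)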

Flair (named entity recognition):

from flair.data import Sentence
from flair.models import SequenceTagger
tagger = SequenceTagger.load('ner')
sentence = Sentence('John Doe works at Microsoft.')
tagger.predict(sentence)

Both libraries offer powerful NLP capabilities, but they focus on different aspects. Gensim excels in topic modeling and document similarity, while Flair specializes in deep learning-based NLP tasks like named entity recognition and sequence labeling.

An open-source NLP research library, built on PyTorch.

Pros of AllenNLP

  • More comprehensive and feature-rich, offering a wider range of NLP tasks and models
  • Better documentation and tutorials, making it easier for beginners to get started
  • Stronger integration with PyTorch and support for more advanced deep learning techniques

Cons of AllenNLP

  • Steeper learning curve due to its complexity and extensive feature set
  • Heavier and more resource-intensive, which may not be ideal for smaller projects
  • Less focus on simplicity and ease of use for quick prototyping

Code Comparison

AllenNLP:

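from allennlp.predictors.predictor import Predictor
import allennlp_models.tagging  # from the separate allennlp-models package; registers the SRL predictor
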
predictor = Predictor.from_path("https://storage.googleapis.com/allennlp-public-models/bert-base-srl-2020.03.24.tar.gz")
result = predictor.predict(sentence="The cat sat on the mat.")

Flair:

from flair.models import SequenceTagger
from flair.data import Sentence

tagger = SequenceTagger.load('ner')
sentence = Sentence('The cat sat on the mat.')
tagger.predict(sentence)

Both repositories are popular NLP frameworks, but they cater to different needs. AllenNLP is more comprehensive and suitable for complex research projects, while Flair focuses on simplicity and ease of use for quick prototyping and smaller-scale applications.

README

A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends.


Flair is:

  • A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), sentiment analysis, part-of-speech tagging (PoS), special support for biomedical texts, sense disambiguation and classification, with support for a rapidly growing number of languages.

  • A text embedding library. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including our proposed Flair embeddings and various transformers.

  • A PyTorch NLP framework. Our framework builds directly on PyTorch, making it easy to train your own models and experiment with new approaches using Flair embeddings and classes.
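
One common way to combine several embeddings of the same token is to concatenate their vectors. The sketch below illustrates only that idea with toy Python lists; it does not use Flair's embedding classes, and `stack_embeddings` is a hypothetical name.

```python
def stack_embeddings(*vectors):
    """Concatenate several vectors for the same token into one vector."""
    stacked = []
    for vec in vectors:
        stacked.extend(vec)
    return stacked


word_vec = [0.1, 0.2, 0.3]   # toy "word embedding" with 3 dimensions
context_vec = [0.5, -0.4]    # toy "contextual embedding" with 2 dimensions

combined = stack_embeddings(word_vec, context_vec)
print(len(combined))  # 5: the dimensions of the parts add up
```

The practical consequence is that the combined vector grows with every embedding added, which is one reason memory use rises when many embeddings are combined.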

Now at version 0.14.0!

State-of-the-Art Models

Flair ships with state-of-the-art models for a range of NLP tasks. For instance, check out our latest NER models:

| Language | Dataset | Flair | Best published | Model card & demo |
| --- | --- | --- | --- | --- |
| English | Conll-03 (4-class) | 94.09 | 94.3 (Yamada et al., 2020) | Flair English 4-class NER demo |
| English | Ontonotes (18-class) | 90.93 | 91.3 (Yu et al., 2020) | Flair English 18-class NER demo |
| German | Conll-03 (4-class) | 92.31 | 90.3 (Yu et al., 2020) | Flair German 4-class NER demo |
| Dutch | Conll-03 (4-class) | 95.25 | 93.7 (Yu et al., 2020) | Flair Dutch 4-class NER demo |
| Spanish | Conll-03 (4-class) | 90.54 | 90.3 (Yu et al., 2020) | Flair Spanish 4-class NER demo |
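
To make the comparison in the table easier to read, the margin between the Flair score and the best published score can be computed directly. This is a small arithmetic sketch; the numbers are copied from the table, and the variable names are chosen only for illustration.

```python
# (language, dataset) -> (Flair score, best published score), from the table
results = {
    ("English", "Conll-03"): (94.09, 94.3),
    ("English", "Ontonotes"): (90.93, 91.3),
    ("German", "Conll-03"): (92.31, 90.3),
    ("Dutch", "Conll-03"): (95.25, 93.7),
    ("Spanish", "Conll-03"): (90.54, 90.3),
}

# Positive margin: the Flair model scores above the best published result
margins = {
    key: round(flair - best, 2) for key, (flair, best) in results.items()
}

for (language, dataset), margin in margins.items():
    print(f"{language} {dataset}: {margin:+.2f}")
```

On these figures the German, Dutch and Spanish models are ahead of the listed published results, while the two English models are slightly behind.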

Many Flair sequence tagging models (named entity recognition, part-of-speech tagging etc.) are also hosted on the 🤗 Hugging Face model hub! You can browse models, check detailed information on how they were trained, and even try each model out online!

Quick Start

Requirements and Installation

In your favorite virtual environment, simply do:

pip install flair

Flair requires Python 3.8+.
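
If the installation runs inside a setup script, the version requirement can be checked up front. A minimal sketch, assuming only the 3.8 minimum stated above; `python_is_supported` is a hypothetical helper name.

```python
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the requirements above


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    print("This interpreter is older than Python 3.8; installation may fail.")
```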

Example 1: Tag Entities in Text

Let's run named entity recognition (NER) over an example sentence. All you need to do is make a Sentence, load a pre-trained model and use it to predict tags for the sentence:

from flair.data import Sentence
from flair.nn import Classifier

# make a sentence
sentence = Sentence('I love Berlin .')

# load the NER tagger
tagger = Classifier.load('ner')

# run NER over sentence
tagger.predict(sentence)

# print the sentence with all annotations
print(sentence)

This should print:

Sentence[4]: "I love Berlin ." → ["Berlin"/LOC]

This means that "Berlin" was tagged as a location entity in this sentence.

  • to learn more about NER tagging in Flair, check out our NER tutorial!

Example 2: Detect Sentiment

Let's run sentiment analysis over an example sentence to determine whether it is POSITIVE or NEGATIVE. Same code as above, just a different model:

from flair.data import Sentence
from flair.nn import Classifier

# make a sentence
sentence = Sentence('I love Berlin .')

# load the sentiment model
tagger = Classifier.load('sentiment')

# run sentiment analysis over sentence
tagger.predict(sentence)

# print the sentence with all annotations
print(sentence)

This should print:

Sentence[4]: "I love Berlin ." → POSITIVE (0.9983)

This means that the sentence "I love Berlin" was tagged as having POSITIVE sentiment.
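
When many sentences are scored, it can help to fold the label and its confidence into a single signed number. The following is a sketch under assumptions: `signed_sentiment` is a hypothetical helper, and it assumes the label values are exactly `POSITIVE` and `NEGATIVE` as in the output above.

```python
def signed_sentiment(value, score):
    """Map a sentiment label and confidence to a number in [-1, 1].

    POSITIVE keeps the score, NEGATIVE negates it; anything else is
    rejected so that unexpected labels do not pass silently.
    """
    if value == "POSITIVE":
        return score
    if value == "NEGATIVE":
        return -score
    raise ValueError(f"unexpected sentiment label: {value!r}")


print(signed_sentiment("POSITIVE", 0.9983))  # 0.9983
```

For a predicted sentence, the two arguments would come from its first label, for example `sentence.labels[0].value` and `sentence.labels[0].score`.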

Tutorials

On our new 🔥 Flair documentation page you will find many tutorials to get you started!

There is also a dedicated landing page for our biomedical NER and datasets with installation instructions and tutorials.

More Documentation

Another great place to start is the book Natural Language Processing with Flair and its accompanying code repository, though it was written for an older version of Flair and some examples may no longer work.

There are also good third-party articles and posts that illustrate how to use Flair.

Citing Flair

Please cite the following paper when using Flair embeddings:

@inproceedings{akbik2018coling,
  title={Contextual String Embeddings for Sequence Labeling},
  author={Akbik, Alan and Blythe, Duncan and Vollgraf, Roland},
  booktitle = {{COLING} 2018, 27th International Conference on Computational Linguistics},
  pages     = {1638--1649},
  year      = {2018}
}

If you use the Flair framework for your experiments, please cite this paper:

@inproceedings{akbik2019flair,
  title={{FLAIR}: An easy-to-use framework for state-of-the-art {NLP}},
  author={Akbik, Alan and Bergmann, Tanja and Blythe, Duncan and Rasul, Kashif and Schweter, Stefan and Vollgraf, Roland},
  booktitle={{NAACL} 2019, 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)},
  pages={54--59},
  year={2019}
}

If you use our new "FLERT" models or approach, please cite this paper:

@misc{schweter2020flert,
    title={{FLERT}: Document-Level Features for Named Entity Recognition},
    author={Stefan Schweter and Alan Akbik},
    year={2020},
    eprint={2011.06993},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

If you use our TARS approach for few-shot and zero-shot learning, please cite this paper:

@inproceedings{halder2020coling,
  title={Task Aware Representation of Sentences for Generic Text Classification},
  author={Halder, Kishaloy and Akbik, Alan and Krapac, Josip and Vollgraf, Roland},
  booktitle = {{COLING} 2020, 28th International Conference on Computational Linguistics},
  year      = {2020}
}

Contact

Please email your questions or comments to Alan Akbik.

Contributing

Thanks for your interest in contributing! There are many ways to get involved; start with our contributor guidelines and then check these open issues for specific tasks.

License

The MIT License (MIT)

Flair is licensed under the following MIT license:

Copyright © 2018 Zalando SE, https://tech.zalando.com

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.