TextBlob

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.

9,410

1,168


Top Related Projects

spaCy

31,840

💫 Industrial-strength Natural Language Processing (NLP) in Python

transformers

146,142

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

fastText

26,297

Library for fast text representation and classification.

stanza

7,500

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Quick Overview

TextBlob is a Python library for processing textual data. It provides a simple API for common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, and more.

Pros

Easy to use with a simple and intuitive API
Provides a wide range of NLP functionalities out of the box
Built on top of NLTK and pattern, leveraging their powerful features
Suitable for both beginners and experienced developers

Cons

May not be as performant or accurate as more specialized NLP libraries
Limited customization options for advanced use cases
Translation relied on an external service (the Google Translate API); it was deprecated in 0.16.0 and removed in 0.18.0
Releases are infrequent, so fixes and new features arrive slowly

Code Examples

Sentiment Analysis:

from textblob import TextBlob

text = "I love this amazing library!"
blob = TextBlob(text)
print(blob.sentiment)  # Prints Sentiment(polarity=..., subjectivity=...); polarity is in [-1.0, 1.0], subjectivity in [0.0, 1.0]

Part-of-Speech Tagging:

from textblob import TextBlob

text = "TextBlob is a powerful NLP library for Python."
blob = TextBlob(text)
print(blob.tags)  # Example output (exact tags depend on the tagger): [('TextBlob', 'NNP'), ('is', 'VBZ'), ('a', 'DT'), ('powerful', 'JJ'), ('NLP', 'NNP'), ('library', 'NN'), ('for', 'IN'), ('Python', 'NNP')]
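The tags are Penn Treebank codes, so a common next step is filtering the `(word, tag)` pairs by tag prefix. A minimal sketch, assuming hand-written sample pairs in the shape `blob.tags` returns; `words_with_tag_prefix` is a hypothetical helper, not part of TextBlob:

```python
def words_with_tag_prefix(tags, prefix):
    """Return the words whose POS tag starts with `prefix` (e.g. "NN" for nouns)."""
    return [word for word, tag in tags if tag.startswith(prefix)]


# Hand-written sample pairs; a real tagger may assign different tags.
sample_tags = [
    ("TextBlob", "NNP"), ("is", "VBZ"), ("a", "DT"), ("powerful", "JJ"),
    ("NLP", "NNP"), ("library", "NN"), ("for", "IN"), ("Python", "NNP"),
]

nouns = words_with_tag_prefix(sample_tags, "NN")
print(nouns)  # ['TextBlob', 'NLP', 'library', 'Python']
```

Matching on the prefix "NN" catches common nouns (NN, NNS) and proper nouns (NNP, NNPS) in one pass; "VB" and "JJ" work the same way for verbs and adjectives.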

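Language Translation (TextBlob releases before 0.18.0 only):

from textblob import TextBlob

text = "Hello, how are you?"
blob = TextBlob(text)
# translate() called the Google Translate web API. It was deprecated in 0.16.0
# and removed in 0.18.0, so this call fails on current releases.
translated = blob.translate(to='es')
print(translated)  # Prints the Spanish translation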

Getting Started

To get started with TextBlob, follow these steps:

Install TextBlob:

pip install textblob

Download the required NLTK corpora (run from a shell):

python -m textblob.download_corpora

Use TextBlob in your Python script:

from textblob import TextBlob

text = "TextBlob is awesome!"
blob = TextBlob(text)
print(blob.sentiment)

This prints a Sentiment namedtuple with two fields: polarity, from -1.0 (negative) to 1.0 (positive), and subjectivity, from 0.0 (objective) to 1.0 (subjective).
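A common follow-up is bucketing the polarity score into coarse labels. The sketch below is one way to do that; `label_polarity` and its `threshold` default are assumptions made for illustration and are not part of TextBlob:

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, 1.0] to a coarse label.

    `threshold` is an arbitrary cutoff chosen for illustration; scores
    within +/- threshold of zero are treated as neutral.
    """
    if not -1.0 <= polarity <= 1.0:
        raise ValueError("polarity must be within [-1.0, 1.0]")
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


# Hand-picked scores, not real model output.
for score in (0.8, 0.05, -0.341):
    print(score, label_polarity(score))
```

With TextBlob installed, the helper would be called as `label_polarity(TextBlob(text).sentiment.polarity)`. The right threshold depends on the data, so it is worth tuning against labeled examples.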

Competitor Comparisons

nltk

14,217

NLTK Source

Pros of NLTK

More comprehensive and feature-rich, offering a wider range of NLP tools and algorithms
Extensive documentation and academic resources available
Larger community and longer development history

Cons of NLTK

Steeper learning curve and more complex API
No faster than TextBlob for shared tasks, since TextBlob delegates much of its work to NLTK
Requires separate downloads for certain features and corpora

Code Comparison

NLTK:

import nltk
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag

text = "NLTK is a powerful NLP library."
tokens = word_tokenize(text)
pos_tags = pos_tag(tokens)

TextBlob:

from textblob import TextBlob

text = "TextBlob is a simpler NLP library."
blob = TextBlob(text)
tokens = blob.words
pos_tags = blob.tags

TextBlob provides a more straightforward API for common NLP tasks, while NLTK offers more granular control and a wider range of functionalities. TextBlob is built on top of NLTK, making it easier to use for basic tasks but less flexible for advanced applications. NLTK is better suited for research and complex NLP projects, while TextBlob is ideal for quick prototyping and simpler use cases.

spaCy

31,840

💫 Industrial-strength Natural Language Processing (NLP) in Python

Pros of spaCy

More comprehensive and advanced NLP capabilities, including named entity recognition and dependency parsing
Faster processing speed, especially for large volumes of text
Actively maintained with regular updates and a larger community

Cons of spaCy

Steeper learning curve and more complex setup
Larger memory footprint and longer initial loading time
Requires more computational resources

Code Comparison

TextBlob:

from textblob import TextBlob

text = "The quick brown fox jumps over the lazy dog."
blob = TextBlob(text)
print(blob.noun_phrases)
print(blob.sentiment)

spaCy:

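import spacy

# Requires the model: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("The quick brown fox jumps over the lazy dog.")
print([chunk.text for chunk in doc.noun_chunks])
# spaCy ships no built-in sentiment model; doc.sentiment stays 0.0
# unless a custom pipeline component sets it.
print(doc.sentiment)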

Both libraries extract noun phrases, but spaCy provides more detailed linguistic information and advanced NLP features, while only TextBlob has sentiment analysis built in. TextBlob is simpler to use and more lightweight, making it suitable for quick text processing tasks. spaCy is better suited for complex NLP projects requiring in-depth analysis and scalability.

transformers

146,142

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Pros of transformers

Offers state-of-the-art pre-trained models for various NLP tasks
Supports a wide range of deep learning architectures (BERT, GPT, etc.)
Provides easy-to-use APIs for fine-tuning and transfer learning

Cons of transformers

Steeper learning curve and more complex implementation
Requires more computational resources for training and inference
May be overkill for simple NLP tasks that TextBlob can handle efficiently

Code Comparison

TextBlob:

from textblob import TextBlob

text = "I love this product!"
blob = TextBlob(text)
sentiment = blob.sentiment.polarity

transformers:

from transformers import pipeline

sentiment_analyzer = pipeline("sentiment-analysis")
result = sentiment_analyzer("I love this product!")[0]
sentiment = result["score"] if result["label"] == "POSITIVE" else -result["score"]

Summary

TextBlob is simpler and more lightweight, suitable for basic NLP tasks. transformers offers more advanced capabilities and state-of-the-art models but requires more resources and expertise. Choose based on your project's complexity and requirements.
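The last line of the transformers snippet folds a label and a confidence into one signed number so it can sit next to TextBlob's polarity. A minimal sketch of that conversion as a reusable function, assuming the `"POSITIVE"`/`"NEGATIVE"` labels used in the snippet (other models may use different label names); `signed_score` is a hypothetical helper:

```python
def signed_score(result):
    """Convert {"label": ..., "score": ...} into a value in [-1.0, 1.0].

    Assumes a binary classifier whose positive label is "POSITIVE".
    """
    score = float(result["score"])
    return score if result["label"] == "POSITIVE" else -score


# Hand-written dictionaries standing in for pipeline output (not real results).
examples = [
    {"label": "POSITIVE", "score": 0.99},
    {"label": "NEGATIVE", "score": 0.75},
]
print([signed_score(r) for r in examples])  # [0.99, -0.75]
```

The two scales are not equivalent: a classifier score is a confidence in the predicted label, while TextBlob's polarity is a lexicon-based intensity, so the signed values are only roughly comparable.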

gensim

16,122

Topic Modelling for Humans

Pros of Gensim

More advanced and comprehensive topic modeling and document similarity features
Better performance and scalability for large datasets
Extensive documentation and active community support

Cons of Gensim

Steeper learning curve due to more complex API
Requires more computational resources for advanced operations
Less suitable for simple NLP tasks like sentiment analysis or part-of-speech tagging

Code Comparison

TextBlob example:

from textblob import TextBlob

text = "TextBlob is simple and easy to use."
blob = TextBlob(text)
print(blob.sentiment)

Gensim example:

from gensim import corpora, models

texts = [["text", "blob", "simple"], ["gensim", "advanced", "modeling"]]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(text) for text in texts]
lda_model = models.LdaModel(corpus, num_topics=2, id2word=dictionary)

TextBlob is more straightforward for basic NLP tasks, while Gensim offers more advanced features for topic modeling and document similarity. TextBlob is easier to use for beginners, but Gensim provides more powerful tools for complex NLP applications and large-scale text processing.
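The Gensim snippet turns each tokenized document into a bag-of-words vector before training the topic model. To show what that representation looks like, here is a plain-Python sketch of the idea; `build_vocabulary` and `to_bow` are hypothetical helpers, and Gensim's own `Dictionary` may assign token ids differently:

```python
from collections import Counter


def build_vocabulary(texts):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tokens in texts:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def to_bow(tokens, vocab):
    """Return sorted (token_id, count) pairs, skipping out-of-vocabulary tokens."""
    counts = Counter(vocab[t] for t in tokens if t in vocab)
    return sorted(counts.items())


texts = [["text", "blob", "simple"], ["gensim", "advanced", "modeling"]]
vocab = build_vocabulary(texts)
print(to_bow(["blob", "blob", "gensim", "unknown"], vocab))  # [(1, 2), (3, 1)]
```

Each document becomes a sparse list of (id, count) pairs, which is the kind of input the `LdaModel` call in the snippet receives as `corpus`.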

fastText

26,297

Library for fast text representation and classification.

Pros of fastText

Highly efficient for large-scale text classification and word representation learning
Supports multiple languages and can handle out-of-vocabulary words
Offers pre-trained models and embeddings for various languages

Cons of fastText

Steeper learning curve and more complex setup compared to TextBlob
Requires more computational resources for training and using large models
Less suitable for quick, simple NLP tasks that don't require advanced features

Code Comparison

TextBlob:

from textblob import TextBlob

text = "I love this product!"
blob = TextBlob(text)
sentiment = blob.sentiment.polarity

fastText:

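import fasttext

# "sentiment_model.bin" is a placeholder for a classifier you have trained yourself
model = fasttext.load_model("sentiment_model.bin")
text = "I love this product!"
# predict() returns a (labels, probabilities) pair; labels carry the "__label__" prefix
prediction = model.predict(text)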

TextBlob provides a more straightforward API for basic NLP tasks, while fastText offers more advanced features and customization options. TextBlob is better suited for quick sentiment analysis and simple language processing, whereas fastText excels in large-scale text classification and word embedding tasks.

Both libraries have their strengths, and the choice between them depends on the specific requirements of your project, such as scale, complexity, and performance needs.
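A fastText prediction is a `(labels, probabilities)` pair rather than a single number, so it usually needs a little unpacking before it can be compared with TextBlob's polarity. A minimal sketch, assuming the default `__label__` prefix; `top_label` is a hypothetical helper and the sample prediction is hand-written, not real model output:

```python
def top_label(prediction, prefix="__label__"):
    """Return (label, probability) for the first entry of a fastText-style prediction."""
    labels, probabilities = prediction
    label = labels[0]
    if label.startswith(prefix):
        label = label[len(prefix):]
    return label, float(probabilities[0])


# Stand-in for model.predict(text); a real model returns a NumPy array of probabilities.
fake_prediction = (("__label__positive",), [0.97])
print(top_label(fake_prediction))  # ('positive', 0.97)
```

The label names come from the training data, so "positive" here is only an example of what a sentiment classifier might emit.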

stanza

7,500

Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages

Pros of Stanza

More advanced and accurate NLP capabilities, including dependency parsing and named entity recognition
Supports a wider range of languages (over 60)
Utilizes state-of-the-art deep learning models for improved performance

Cons of Stanza

Steeper learning curve and more complex API
Requires more computational resources and memory
Slower processing speed compared to TextBlob

Code Comparison

TextBlob:

from textblob import TextBlob

text = "I love this product. It's amazing!"
blob = TextBlob(text)
sentiment = blob.sentiment.polarity

Stanza:

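import stanza

stanza.download('en')  # One-time model download
nlp = stanza.Pipeline('en')
doc = nlp("I love this product. It's amazing!")
# Sentiment is reported per sentence as 0 (negative), 1 (neutral), or 2 (positive)
sentiment = doc.sentences[0].sentiment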

Both libraries offer sentiment analysis, but Stanza provides more detailed linguistic analysis and supports a wider range of NLP tasks. TextBlob is simpler to use and more lightweight, making it suitable for basic NLP tasks and rapid prototyping. Stanza, on the other hand, offers more advanced features and better accuracy, but requires more setup and computational resources.
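Stanza scores each sentence separately with an integer class, whereas TextBlob returns one float for the whole text. The sketch below maps those classes to names for every sentence, not just the first one. It assumes the 0/1/2 encoding (negative, neutral, positive); `describe_sentiments` is a hypothetical helper and the sample values are hand-written:

```python
SENTIMENT_NAMES = {0: "negative", 1: "neutral", 2: "positive"}


def describe_sentiments(values):
    """Map per-sentence sentiment classes (0, 1, 2) to readable names."""
    return [SENTIMENT_NAMES[value] for value in values]


# Stand-in for [s.sentiment for s in doc.sentences]; not real model output.
print(describe_sentiments([2, 2, 0]))  # ['positive', 'positive', 'negative']
```

With a Stanza document in hand, the call would be `describe_sentiments(s.sentiment for s in doc.sentences)`.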


README

TextBlob: Simplified Text Processing


Homepage: `https://textblob.readthedocs.io/ <https://textblob.readthedocs.io/>`_

TextBlob is a Python library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, and more.

.. code-block:: python

    from textblob import TextBlob

    text = """
    The titular threat of The Blob has always struck me as the ultimate movie
    monster: an insatiably hungry, amoeba-like mass able to penetrate
    virtually any safeguard, capable of--as a doomed doctor chillingly
    describes it--"assimilating flesh on contact.
    Snide comparisons to gelatin be damned, it's a concept with the most
    devastating of potential consequences, not unlike the grey goo scenario
    proposed by technological theorists fearful of
    artificial intelligence run rampant.
    """

    blob = TextBlob(text)
    blob.tags  # [('The', 'DT'), ('titular', 'JJ'),
    #  ('threat', 'NN'), ('of', 'IN'), ...]

    blob.noun_phrases  # WordList(['titular threat', 'blob',
    #            'ultimate movie monster',
    #            'amoeba-like mass', ...])

    for sentence in blob.sentences:
        print(sentence.sentiment.polarity)
    # 0.060
    # -0.341

TextBlob stands on the giant shoulders of NLTK_ and pattern_, and plays nicely with both.

Features

Noun phrase extraction
Part-of-speech tagging
Sentiment analysis
Classification (Naive Bayes, Decision Tree)
Tokenization (splitting text into words and sentences)
Word and phrase frequencies
Parsing
n-grams
Word inflection (pluralization and singularization) and lemmatization
Spelling correction
Add new models or languages through extensions
WordNet integration
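Two of the features above, n-grams and word frequencies, are simple enough to sketch in plain Python. The code below illustrates the concepts only and is not TextBlob's implementation; `ngrams` is a hypothetical helper, and whitespace splitting stands in for real tokenization:

```python
from collections import Counter


def ngrams(tokens, n=3):
    """Return every run of `n` consecutive tokens as a list."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


tokens = "now is better than never".split()
print(ngrams(tokens, n=3))
# [['now', 'is', 'better'], ['is', 'better', 'than'], ['better', 'than', 'never']]

# Word frequencies are a token count.
frequencies = Counter(tokens)
print(frequencies["better"])  # 1
```

TextBlob exposes the same ideas as `blob.ngrams(n=3)` and `blob.word_counts`, with its own tokenization and case handling applied first.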

Get it now

$ pip install -U textblob
$ python -m textblob.download_corpora

Examples

See more examples at the `Quickstart guide`_.

.. _Quickstart guide: https://textblob.readthedocs.io/en/latest/quickstart.html#quickstart

Documentation

Full documentation is available at https://textblob.readthedocs.io/.

Project Links

Docs: https://textblob.readthedocs.io/
Changelog: https://textblob.readthedocs.io/en/latest/changelog.html
PyPI: https://pypi.python.org/pypi/TextBlob
Issues: https://github.com/sloria/TextBlob/issues

License

MIT licensed. See the bundled `LICENSE <https://github.com/sloria/TextBlob/blob/master/LICENSE>`_ file for more details.

.. _pattern: https://github.com/clips/pattern/
.. _NLTK: http://nltk.org/
