bfelbo / DeepMoji

State-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc.

Top Related Projects

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.

Models and examples built with TensorFlow

💫 Industrial-strength Natural Language Processing (NLP) in Python

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Quick Overview

DeepMoji is a deep learning model for understanding emoji usage and emotional content in text. It was trained on a large dataset of tweets and can predict emoji usage, analyze sentiment, and detect sarcasm in text. The project includes pre-trained models and tools for using DeepMoji in various natural language processing tasks.

Pros

  • Powerful emotion and sentiment analysis capabilities
  • Pre-trained models available for immediate use
  • Can be fine-tuned for specific tasks or domains
  • Includes visualization tools for emoji predictions

Cons

  • Primarily focused on English language text
  • May not perform as well on formal or non-social media text
  • Requires deep learning expertise for advanced usage or modifications
  • Limited documentation for some advanced features

Code Examples

  1. Predicting emojis for a given text:
import json
from deepmoji.sentence_tokenizer import SentenceTokenizer
from deepmoji.model_def import deepmoji_emojis
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

maxlen = 30  # maximum number of tokens per sentence
with open(VOCAB_PATH, 'r') as f:
    vocabulary = json.load(f)  # the tokenizer takes the vocabulary dict, not its path

model = deepmoji_emojis(maxlen, PRETRAINED_PATH)
st = SentenceTokenizer(vocabulary, maxlen)

text = u"I love machine learning!"
tokenized, _, _ = st.tokenize_sentences([text])
prob = model.predict(tokenized)

# Print the indices and probabilities of the five most likely emoji classes
print("Top 5 emoji classes for '{}':".format(text))
for i in prob[0].argsort()[-5:][::-1]:
    print("{} - {:.4f}".format(i, prob[0][i]))
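  2. Extracting features from text:
import json
from deepmoji.sentence_tokenizer import SentenceTokenizer
from deepmoji.model_def import deepmoji_feature_encoding
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

maxlen = 30  # maximum number of tokens per sentence
with open(VOCAB_PATH, 'r') as f:
    vocabulary = json.load(f)
st = SentenceTokenizer(vocabulary, maxlen)
model = deepmoji_feature_encoding(maxlen, PRETRAINED_PATH)

text = u"This is amazing!"
tokenized, _, _ = st.tokenize_sentences([text])
features = model.predict(tokenized)  # one feature vector per input sentence

print("Feature vector for '{}':".format(text))
print(features[0][:10])  # Print first 10 features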
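  3. Fine-tuning the model on a benchmark dataset (SE0714):
import json
from deepmoji.finetuning import load_benchmark
from deepmoji.class_avg_finetuning import class_avg_finetune
from deepmoji.model_def import deepmoji_transfer
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

DATASET_PATH = 'data/SE0714/raw.pickle'  # assumed location, relative to the repository root
nb_classes = 3  # assumed number of classes in SE0714
with open(VOCAB_PATH, 'r') as f:
    vocab = json.load(f)
data = load_benchmark(DATASET_PATH, vocab)

# Class-average F1 trains one binary model per class, so the model is defined with 2 outputs
model = deepmoji_transfer(2, data['maxlen'], PRETRAINED_PATH)
model, f1 = class_avg_finetune(model, data['texts'], data['labels'], nb_classes, data['batch_size'], method='last')

print("Fine-tuned model F1 score: {:.4f}".format(f1))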
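
The emoji prediction example above ranks classes by probability. As a minimal sketch of that ranking step on its own, assuming the model output for one sentence is a flat sequence of probabilities, the helper below returns the indices of the k largest entries. `top_k_indices` is a hypothetical name, and the probabilities are made-up illustration values, not model output.

```python
def top_k_indices(probabilities, k):
    """Return the indices of the k largest values, highest first."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort indices by their probability in descending order; ties keep index order.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i],
                    reverse=True)
    return ranked[:k]


# Made-up probabilities for a six-class toy problem (not real DeepMoji output).
toy_probabilities = [0.05, 0.40, 0.10, 0.25, 0.15, 0.05]
print(top_k_indices(toy_probabilities, 3))
```

On a NumPy array, `prob[0].argsort()[-k:][::-1]` produces the same ranking, apart from how ties are ordered.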

Getting Started

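  1. Install DeepMoji by running the following inside the root directory of the repository, after installing Theano or Tensorflow as the Keras backend:

    pip install -e .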
    
  2. Download the pre-trained weights (~85MB) into the model/ directory with the included script:

    python scripts/download_weights.py
    
  3. Use DeepMoji in your project:

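    import json
    from deepmoji.sentence_tokenizer import SentenceTokenizer
    from deepmoji.model_def import deepmoji_emojis
    from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH
    
    maxlen = 30  # maximum number of tokens per sentence
    with open(VOCAB_PATH, 'r') as f:
        vocabulary = json.load(f)  # the tokenizer takes the vocabulary dict, not its path
    
    model = deepmoji_emojis(maxlen, PRETRAINED_PATH)
    st = SentenceTokenizer(vocabulary, maxlen)
    
    # Your code here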
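
Loading the model fails if the weights have not been downloaded yet, so a small pre-flight check can make that failure easier to read. This is only a sketch: `missing_files` is a hypothetical helper, not part of DeepMoji, and the two relative paths are placeholders for the `PRETRAINED_PATH` and `VOCAB_PATH` values from `deepmoji.global_variables`.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [path for path in paths if not os.path.isfile(path)]


# Illustrative call with placeholder paths; in practice pass
# [PRETRAINED_PATH, VOCAB_PATH] from deepmoji.global_variables.
missing = missing_files(["model/deepmoji_weights.hdf5", "model/vocabulary.json"])
if missing:
    print("Missing model files: {}".format(", ".join(missing)))
```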
    

Competitor Comparisons

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Pros of transformers

  • Broader scope: Supports a wide range of NLP tasks and models
  • Active development: Regularly updated with new models and features
  • Extensive documentation and community support

Cons of transformers

  • Higher complexity: Steeper learning curve for beginners
  • Resource-intensive: Requires more computational power for many models

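Code Comparison

DeepMoji:

import json
from deepmoji.sentence_tokenizer import SentenceTokenizer
from deepmoji.model_def import deepmoji_emojis
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

maxlen = 30  # maximum number of tokens per sentence
with open(VOCAB_PATH, 'r') as f:
    vocabulary = json.load(f)  # the tokenizer takes the vocabulary dict, not its path
model = deepmoji_emojis(maxlen, PRETRAINED_PATH)
tokenizer = SentenceTokenizer(vocabulary, maxlen)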

transformers:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

Summary

The transformers library offers a more comprehensive NLP toolkit with broader applications, while DeepMoji focuses on emoji prediction and emotion-related text analysis. transformers benefits from active development and extensive documentation but may be more complex and resource-intensive. DeepMoji provides a simpler, more focused solution for emotion-related tasks.

A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

Pros of ParlAI

  • Broader scope: ParlAI is a comprehensive platform for dialogue AI research, supporting various tasks and models
  • Active development: Regularly updated with new features and improvements
  • Extensive documentation and examples: Provides thorough guides and tutorials for users

Cons of ParlAI

  • Steeper learning curve: More complex to set up and use due to its extensive features
  • Resource-intensive: Requires more computational resources for training and running models

Code Comparison

DeepMoji:

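import json
from deepmoji.sentence_tokenizer import SentenceTokenizer
from deepmoji.model_def import deepmoji_emojis
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

maxlen = 30  # maximum number of tokens per sentence
with open(VOCAB_PATH, 'r') as f:
    vocabulary = json.load(f)  # the tokenizer takes the vocabulary dict, not its path
model = deepmoji_emojis(maxlen, PRETRAINED_PATH)
tokenizer = SentenceTokenizer(vocabulary, maxlen)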

ParlAI:

from parlai.core.params import ParlaiParser
from parlai.core.agents import create_agent
from parlai.core.worlds import create_task

parser = ParlaiParser(True, True)
opt = parser.parse_args()
agent = create_agent(opt)
world = create_task(opt, agent)

VADER Sentiment Analysis. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.

Pros of VADER Sentiment

  • Lightweight and easy to use, requiring no training or external data
  • Designed specifically for social media text, handling slang and emoticons well
  • Fast execution, suitable for real-time analysis

Cons of VADER Sentiment

  • Limited to English language text
  • May struggle with complex or nuanced sentiments
  • Less accurate on formal or domain-specific text compared to deep learning models

Code Comparison

VADER Sentiment:

from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
sentiment = analyzer.polarity_scores("Hello world!")
print(sentiment)

DeepMoji:

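import json
from deepmoji.sentence_tokenizer import SentenceTokenizer
from deepmoji.model_def import deepmoji_emojis
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

maxlen = 30  # maximum number of tokens per sentence
with open(VOCAB_PATH, 'r') as f:
    vocabulary = json.load(f)  # the tokenizer takes the vocabulary dict, not its path
model = deepmoji_emojis(maxlen, PRETRAINED_PATH)
tokenizer = SentenceTokenizer(vocabulary, maxlen)
tokens, _, _ = tokenizer.tokenize_sentences([u"Hello world!"])
prob = model.predict(tokens)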

DeepMoji offers more sophisticated sentiment analysis using deep learning, potentially providing more accurate results for complex texts. However, it requires more setup, computational resources, and may be slower for real-time applications compared to VADER Sentiment's simplicity and speed.

Models and examples built with TensorFlow

Pros of TensorFlow Models

  • Extensive collection of pre-trained models and implementations
  • Regularly updated with new models and features
  • Backed by Google, ensuring long-term support and development

Cons of TensorFlow Models

  • Larger and more complex repository, potentially overwhelming for beginners
  • May require more computational resources due to its comprehensive nature

Code Comparison

DeepMoji:

from deepmoji.sentence_tokenizer import SentenceTokenizer
from deepmoji.model_def import deepmoji_emojis
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

maxlen = 30
model = deepmoji_emojis(maxlen, PRETRAINED_PATH)

TensorFlow Models:

import tensorflow as tf
import tensorflow_hub as hub

model = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
embeddings = model(["Hello, world!"])

Summary

DeepMoji focuses specifically on emoji prediction and sentiment analysis, while TensorFlow Models offers a broader range of pre-trained models and implementations. DeepMoji may be more suitable for projects specifically related to emoji prediction, while TensorFlow Models provides a more comprehensive toolkit for various machine learning tasks. The code examples demonstrate the different approaches: DeepMoji uses a custom model definition, while TensorFlow Models leverages TensorFlow Hub for easy model loading and usage.

💫 Industrial-strength Natural Language Processing (NLP) in Python

Pros of spaCy

  • Comprehensive NLP library with a wide range of functionalities
  • Highly optimized for performance and efficiency
  • Active development and large community support

Cons of spaCy

  • Steeper learning curve due to its extensive features
  • Larger memory footprint compared to more specialized libraries

Code Comparison

spaCy:

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence.")
for token in doc:
    print(token.text, token.pos_, token.dep_)

DeepMoji:

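import json
from deepmoji.sentence_tokenizer import SentenceTokenizer
from deepmoji.model_def import deepmoji_emojis
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

maxlen = 30  # maximum number of tokens per sentence
with open(VOCAB_PATH, 'r') as f:
    vocabulary = json.load(f)  # the tokenizer takes the vocabulary dict, not its path
model = deepmoji_emojis(maxlen, PRETRAINED_PATH)
tokenizer = SentenceTokenizer(vocabulary, maxlen)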

Key Differences

  • spaCy is a general-purpose NLP library, while DeepMoji focuses on emoji prediction and sentiment analysis
  • spaCy offers more comprehensive language processing capabilities, including part-of-speech tagging and dependency parsing
  • DeepMoji is more specialized for understanding emotional content in text

Use Cases

  • Choose spaCy for broad NLP tasks and when performance is crucial
  • Opt for DeepMoji when working specifically with emoji prediction or sentiment analysis in social media contexts

A very simple framework for state-of-the-art Natural Language Processing (NLP)

Pros of Flair

  • Broader NLP capabilities: Flair offers a wide range of NLP tasks beyond sentiment analysis, including named entity recognition, part-of-speech tagging, and text classification.
  • Active development: Flair is regularly updated with new features and improvements, ensuring it stays current with the latest NLP advancements.
  • Extensive language support: Flair provides pre-trained models for numerous languages, making it more versatile for multilingual applications.

Cons of Flair

  • Higher complexity: Flair's broader scope may require a steeper learning curve compared to DeepMoji's focused approach to emoji prediction and sentiment analysis.
  • Resource intensity: Flair's more comprehensive models can be more resource-intensive, potentially requiring more computational power for training and inference.

Code Comparison

DeepMoji:

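import json
from deepmoji.sentence_tokenizer import SentenceTokenizer
from deepmoji.model_def import deepmoji_emojis
from deepmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH

maxlen = 30  # maximum number of tokens per sentence
with open(VOCAB_PATH, 'r') as f:
    vocabulary = json.load(f)  # the tokenizer takes the vocabulary dict, not its path
model = deepmoji_emojis(maxlen, PRETRAINED_PATH)
tokenizer = SentenceTokenizer(vocabulary, maxlen)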

Flair:

from flair.models import TextClassifier
from flair.data import Sentence

classifier = TextClassifier.load('en-sentiment')
sentence = Sentence('I love this movie!')
classifier.predict(sentence)

README

------ Update September 2023 ------

The online demo is no longer available as it's not possible for us to renew the certificate. The code in this repo still works, but you might have to make some changes for it to work in Python 3 (see the open PRs). You can also check out the PyTorch version of this algorithm called torchMoji made by HuggingFace.

DeepMoji

DeepMoji is a model trained on 1.2 billion tweets with emojis to understand how language is used to express emotions. Through transfer learning the model can obtain state-of-the-art performance on many emotion-related text modeling tasks.

See the paper or blog post for more details.

Overview

  • deepmoji/ contains all the underlying code needed to convert a dataset to our vocabulary and use our model.
  • examples/ contains short code snippets showing how to convert a dataset to our vocabulary, load up the model and run it on that dataset.
  • scripts/ contains code for processing and analysing datasets to reproduce results in the paper.
  • model/ contains the pretrained model and vocabulary.
  • data/ contains raw and processed datasets that we include in this repository for testing.
  • tests/ contains unit tests for the codebase.

To start out with, have a look inside the examples/ directory. See score_texts_emojis.py for how to use DeepMoji to extract emoji predictions, encode_texts.py for how to convert text into 2304-dimensional emotional feature vectors or finetune_youtube_last.py for how to use the model for transfer learning on a new dataset.
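
One common way to use such feature vectors is to compare two texts by the cosine similarity of their encodings. The following is a sketch of that comparison only and is not taken from the repository: `cosine_similarity` is a hypothetical helper, and the short toy vectors stand in for the 2304-dimensional output of encode_texts.py.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 4-dimensional stand-ins for real feature vectors.
features_a = [1.0, 0.0, 2.0, 0.0]
features_b = [2.0, 0.0, 4.0, 0.0]
print(cosine_similarity(features_a, features_b))
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, regardless of their length.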

Please consider citing our paper if you use our model or code (see below for citation).

Frameworks

This code is based on Keras, which requires either Theano or TensorFlow as the backend. If you would rather use PyTorch, there's an implementation available here, which has kindly been provided by Thomas Wolf.

Installation

We assume that you're using Python 2.7 with pip installed. As a backend you need to install either Theano (version 0.9+) or Tensorflow (version 1.3+). Once that's done you need to run the following inside the root directory to install the remaining dependencies:

pip install -e .

This will install the following dependencies:

Ensure that Keras uses your chosen backend. You can find the instructions here, under the Switching from one backend to another section.
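
For the multi-backend Keras 2.x releases that this code targets, the backend is read from the `backend` field of `~/.keras/keras.json` or from the `KERAS_BACKEND` environment variable, which overrides the config file. As a sketch, the variable can also be set from Python, as long as this happens before `keras` is first imported. `select_keras_backend` is a hypothetical helper name, not part of Keras or DeepMoji.

```python
import os

# The two backends named in this README.
SUPPORTED_BACKENDS = ("theano", "tensorflow")


def select_keras_backend(name):
    """Set KERAS_BACKEND so that a later `import keras` picks up the chosen backend."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError("unsupported backend: {!r}".format(name))
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend("tensorflow")
# `import keras` must come after this point for the setting to take effect.
```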

Run the included script, which downloads the pretrained DeepMoji weights (~85MB) from here and places them in the model/ directory:

python scripts/download_weights.py

Testing

To run the tests, install nose. After installing, navigate to the tests/ directory and run:

nosetests -v

By default, this will also run finetuning tests. These tests train the model for one epoch and then check the resulting accuracy, which may take several minutes to finish. If you'd prefer to exclude those, run the following instead:

nosetests -v -a '!slow'

Disclaimer

This code has been tested to work with Python 2.7 on an Ubuntu 16.04 machine. It has not been optimized for efficiency, but should be fast enough for most purposes. We do not give any guarantees that there are no bugs, so use the code at your own risk!

Contributions

We welcome pull requests if you feel like something could be improved. You can also greatly help us by telling us how you felt when writing your most recent tweets. Just click here to contribute.

License

This code and the pretrained model are licensed under the MIT license.

Benchmark datasets

The benchmark datasets are uploaded to this repository for convenience only. They were not released by us and we do not claim any rights to them. Use the datasets at your own responsibility and make sure you comply with the licenses that they were released under. If you use any of the benchmark datasets, please consider citing the original authors.

Twitter dataset

We sadly cannot release our large Twitter dataset of tweets with emojis due to licensing restrictions.

Citation

@inproceedings{felbo2017,
  title={Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm},
  author={Felbo, Bjarke and Mislove, Alan and S{\o}gaard, Anders and Rahwan, Iyad and Lehmann, Sune},
  booktitle={Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2017}
}