essentia

C++ library for audio and music analysis, description and synthesis, including Python bindings

3,106


Top Related Projects

librosa

7,777

Python library for audio and music analysis

aubio

3,478

a library for audio and music analysis

annoy

13,889

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

magenta

19,599

Magenta: Music and Art Generation with Machine Intelligence

pytorch/audio

2,701

Data manipulation and transformation for audio signal processing, powered by PyTorch

Quick Overview

Essentia is an open-source C++ library for audio analysis and music information retrieval. It provides a comprehensive set of algorithms for extracting features from audio signals, including spectral, temporal, and high-level descriptors. Essentia is designed to be efficient, modular, and easy to use, making it suitable for both research and commercial applications.

Pros

Extensive collection of audio analysis algorithms and music information retrieval tools
High performance and efficiency due to C++ implementation
Cross-platform compatibility (Linux, macOS, Windows)
Python bindings for easier integration and prototyping

Cons

Steep learning curve for beginners due to its comprehensive nature
Limited documentation for some advanced features
Requires C++ knowledge for optimal usage and customization
Dependency management can be challenging for some users

Code Examples

Loading an audio file and computing its spectrum:

import essentia.standard as es

# Load audio file
audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()

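# Compute the magnitude spectrum of one frame
# (Spectrum works on a short, even-length frame rather than the whole file)
frame = audio[:2048]
w = es.Windowing(type='hann')
spectrum = es.Spectrum()
spec = spectrum(w(frame))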
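
To make the windowing and spectrum steps concrete, the following is a minimal pure-Python sketch of the same idea: apply a Hann window to one frame, then take the magnitudes of its DFT bins. The helper names `hann_window` and `magnitude_spectrum` are hypothetical, and the window definition and scaling are assumptions that may differ from what Essentia computes internally.

```python
import cmath
import math

def hann_window(n):
    """Return an n-point periodic Hann window (one common definition)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def magnitude_spectrum(frame):
    """Magnitudes of the first n/2 + 1 DFT bins of a real, even-length frame."""
    n = len(frame)
    if n % 2:
        raise ValueError("frame length must be even")
    bins = []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT; fine for a short illustrative frame.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        bins.append(abs(acc))
    return bins

# Hypothetical test signal: a sine completing exactly 8 cycles in 64 samples.
n = 64
frame = [math.sin(2 * math.pi * 8 * i / n) for i in range(n)]
windowed = [x * w for x, w in zip(frame, hann_window(n))]
spec = magnitude_spectrum(windowed)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
```

A frame of 64 samples yields 33 bins, and the strongest bin is the one matching the sine's frequency.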

Extracting the beat positions from an audio file:

import essentia.standard as es

# Load audio file
audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()

# Extract beat positions
rhythm_extractor = es.RhythmExtractor2013()
bpm, beats, beats_confidence, _, beats_intervals = rhythm_extractor(audio)

print(f"BPM: {bpm}")
print(f"Beat positions: {beats}")
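
Beat positions and tempo are closely related, which a short sketch can illustrate. The function below estimates BPM from a list of beat times in seconds using the median inter-beat interval. It is an illustrative assumption, not the method RhythmExtractor2013 uses, and `bpm_from_beats` is a hypothetical helper.

```python
from statistics import median

def bpm_from_beats(beat_times):
    """Estimate tempo from beat times (seconds) via the median inter-beat interval."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beats")
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return 60.0 / median(intervals)

# Hypothetical beat positions, one beat every half second.
beats = [0.5, 1.0, 1.5, 2.0, 2.5]
estimated_bpm = bpm_from_beats(beats)
```

The median is used here because it is less sensitive to a few missed or spurious beats than the mean.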

Computing MFCCs (Mel-Frequency Cepstral Coefficients):

import essentia.standard as es

# Load audio file
audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()

# Compute MFCCs
window = es.Windowing(type='hann')
spectrum = es.Spectrum()
mfcc = es.MFCC()

mfccs = []
for frame in es.FrameGenerator(audio, frameSize=1024, hopSize=512):
    spec = spectrum(window(frame))
    mfcc_bands, mfcc_coeffs = mfcc(spec)
    mfccs.append(mfcc_coeffs)
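
The loop above produces one coefficient vector per frame, so the frame and hop sizes decide how many vectors result. The sketch below shows a simplified framing scheme in plain Python. It starts at sample 0 and drops a trailing partial frame, which is an assumption made for clarity: Essentia's FrameGenerator has its own padding options that this sketch does not reproduce. `frame_signal` is a hypothetical helper.

```python
def frame_signal(samples, frame_size=1024, hop_size=512):
    """Yield overlapping frames starting at sample 0; drop a trailing partial frame."""
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        yield samples[start:start + frame_size]

signal = list(range(4096))  # placeholder samples standing in for audio
frames = list(frame_signal(signal))
```

With a hop size of half the frame size, consecutive frames overlap by 50%.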

Getting Started

To get started with Essentia, follow these steps:

Install Essentia using pip:
```
pip install essentia
```
Import the library in your Python script:
```
import essentia.standard as es
```

Load an audio file and perform basic analysis:

# Load audio file
audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()

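# Compute basic features
duration = len(audio) / 44100
loudness = es.Loudness()(audio)

# Pitch is estimated per frame, so analyze one 2048-sample frame
frame = audio[:2048]
pitch, confidence = es.PitchYinFFT(frameSize=2048)(es.Spectrum()(es.Windowing(type='hann')(frame)))

print(f"Duration: {duration:.2f} seconds")
print(f"Loudness: {loudness:.2f}")  # Stevens' power-law loudness, not dB
print(f"Pitch of first frame: {pitch:.2f} Hz (confidence: {confidence:.2f})")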
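
If a level in decibels is needed, one common measure is the RMS level relative to full scale. The sketch below is a plain-Python assumption, not an Essentia call, and `rms_dbfs` is a hypothetical helper. It expects float samples in the range [-1, 1].

```python
import math

def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0) for float samples in [-1, 1]."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square)

# Hypothetical input: ten cycles of a full-scale sine.
sine = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
level = rms_dbfs(sine)
```

A full-scale sine has an RMS of 1/sqrt(2), so its level is roughly 3 dB below full scale.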

For more advanced usage and detailed documentation, refer to the official Essentia website and GitHub repository.

Competitor Comparisons

librosa

7,777

Python library for audio and music analysis

Pros of librosa

Easier to install and use, with fewer dependencies
More Pythonic API and better integration with NumPy and SciPy
Extensive documentation and tutorials for beginners

Cons of librosa

Slower performance for some operations compared to Essentia
Limited support for real-time processing
Fewer advanced audio analysis features

Code Comparison

librosa:

import librosa

y, sr = librosa.load('audio.wav')
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
mfcc = librosa.feature.mfcc(y=y, sr=sr)

Essentia:

import essentia.standard as es

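audio = es.MonoLoader(filename='audio.wav')()
# RhythmExtractor2013 returns five values; keep the tempo and beat positions
bpm, beats, _, _, _ = es.RhythmExtractor2013()(audio)
# MFCC takes the magnitude spectrum of one frame, not raw audio
frame = audio[:2048]
_, mfcc = es.MFCC()(es.Spectrum()(es.Windowing(type='hann')(frame)))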

Both libraries offer similar functionality for basic audio analysis tasks, but Essentia provides more low-level control and advanced features. librosa is generally more user-friendly and better suited for quick prototyping and research, while Essentia is more powerful for complex audio processing tasks and real-time applications.
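
One practical difference when comparing the two outputs: librosa's beat tracker returns frame indices by default, while Essentia's rhythm extractor reports beat positions in seconds. The sketch below converts frame indices to seconds, assuming librosa's default sample rate of 22050 Hz and hop length of 512. `frames_to_seconds` is a hypothetical helper; librosa ships its own `librosa.frames_to_time` for this.

```python
def frames_to_seconds(frame_indices, sr=22050, hop_length=512):
    """Convert analysis-frame indices to times in seconds."""
    return [i * hop_length / sr for i in frame_indices]

# Hypothetical beat frames as a beat tracker might return them.
beat_frames = [43, 86, 129]
beat_times = frames_to_seconds(beat_frames)
```

Once both sets of beats are in seconds they can be compared directly.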

aubio

3,478

a library for audio and music analysis

Pros of aubio

Lightweight and efficient, with a focus on real-time processing
Supports multiple programming languages through bindings
Extensive command-line tools for quick audio analysis

Cons of aubio

Smaller feature set compared to Essentia
Less active development and community support
Limited documentation and examples for advanced use cases

Code Comparison

Essentia:

import essentia.standard as es

audio = es.MonoLoader(filename='audio.wav')()
# BeatTrackerMultiFeature returns the beat positions and a confidence value
beats, confidence = es.BeatTrackerMultiFeature()(audio)

aubio:

import aubio

source = aubio.source('audio.wav')
tempo = aubio.tempo("default", 1024, 512, source.samplerate)

beats = []
while True:
    samples, read = source()
    is_beat = tempo(samples)
    if is_beat:
        beats.append(tempo.get_last_s())
    if read < source.hop_size:
        break

Both libraries offer beat tracking functionality, but Essentia provides a more straightforward API for this task. aubio requires more manual setup and iteration, which can offer greater flexibility but may be less convenient for simple use cases.
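
The block-by-block pattern in the aubio example can be sketched without any audio library. The generator below hands out fixed-size blocks together with the number of valid samples, and the consumer stops at the first short read. `read_blocks` is a hypothetical stand-in for a streaming reader, and zero-padding the final block is an assumption of this sketch.

```python
def read_blocks(samples, hop_size=512):
    """Yield (block, read) pairs of hop_size samples; hypothetical streaming reader."""
    for start in range(0, len(samples), hop_size):
        block = samples[start:start + hop_size]
        read = len(block)
        # Zero-pad the last block so every block has hop_size samples.
        yield block + [0.0] * (hop_size - read), read

total = 0
blocks = 0
for block, read in read_blocks([0.1] * 1300):
    total += read
    blocks += 1
    if read < 512:  # a short read marks the end of the stream
        break
```

Processing in small blocks keeps memory use constant, which is what makes this style suitable for real-time input.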

annoy

13,889

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Pros of Annoy

Specialized for approximate nearest neighbor search, making it highly efficient for this specific task
Lightweight and easy to integrate into existing projects
Supports multiple distance metrics (Euclidean, Manhattan, Cosine, etc.)

Cons of Annoy

Limited to a single specific task, unlike Essentia's broader audio analysis capabilities
Less suitable for complex audio processing tasks or feature extraction
Smaller community and fewer contributors compared to Essentia

Code Comparison

Annoy (C++ with Python bindings):

import random
from annoy import AnnoyIndex

f = 40  # vector dimensionality (example value)
t = AnnoyIndex(f, 'angular')
for i in range(1000):
    v = [random.gauss(0, 1) for z in range(f)]
    t.add_item(i, v)
t.build(10)

Essentia (C++ with Python bindings):

import essentia.standard as es

audio = es.MonoLoader(filename='audio.wav')()
w = es.Windowing(type='hann')
spectrum = es.Spectrum()
mfcc = es.MFCC()

for frame in es.FrameGenerator(audio, frameSize=1024, hopSize=512):
    mfcc_bands, mfcc_coeffs = mfcc(spectrum(w(frame)))

This comparison highlights the specialized nature of Annoy for nearest neighbor search, while Essentia offers a broader range of audio analysis tools. The code examples demonstrate Annoy's focus on indexing and searching, versus Essentia's audio processing capabilities.
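
The two tools can complement each other: feature vectors extracted from audio can be indexed for similarity search. As a point of reference, the sketch below performs the exact, exhaustive search that an approximate index trades accuracy against for speed. It uses cosine distance on small made-up vectors. `cosine_distance`, `nearest`, and the `tracks` data are hypothetical and do not use the Annoy API.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return 1.0 - dot / norm

def nearest(query, items, k=1):
    """Return the ids of the k items closest to query, by exhaustive search."""
    ranked = sorted(items, key=lambda item_id: cosine_distance(query, items[item_id]))
    return ranked[:k]

# Hypothetical per-track feature vectors, e.g. averaged MFCCs.
tracks = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
}
closest = nearest([1.0, 0.05, 0.0], tracks, k=2)
```

Exhaustive search scales linearly with the number of items per query, which is why large collections tend to use an approximate index instead.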

magenta

19,599

Magenta: Music and Art Generation with Machine Intelligence

Pros of Magenta

Focuses on machine learning for music and art generation
Integrates well with TensorFlow and other Google AI tools
Offers pre-trained models for quick experimentation

Cons of Magenta

Narrower scope, primarily for creative AI applications
Steeper learning curve for those not familiar with machine learning
Less comprehensive audio analysis capabilities

Code Comparison

Magenta (Python):

import magenta

midi_file = 'song.mid'  # example path
sequence = magenta.music.midi_io.midi_file_to_sequence_proto(midi_file)
notes = sequence.notes  # a NoteSequence exposes its notes as a repeated field

Essentia (C++):

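#include <essentia/algorithmfactory.h>
#include <essentia/essentiamath.h>
using namespace essentia::standard;

// Inside main(), after essentia::init(); audiofile is a std::string path
AlgorithmFactory& factory = AlgorithmFactory::instance();
Algorithm* loader = factory.create("MonoLoader", "filename", audiofile);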

Key Differences

Magenta is tailored for AI-driven music creation and artistic applications, while Essentia provides a broader set of audio analysis tools. Magenta leverages machine learning techniques, particularly with TensorFlow, whereas Essentia focuses on signal processing and feature extraction from audio.

Essentia offers more low-level control and a wider range of audio analysis algorithms, making it suitable for various audio processing tasks. Magenta, on the other hand, excels in generative tasks and creative applications of AI in music and art.

pytorch/audio

2,701

Data manipulation and transformation for audio signal processing, powered by PyTorch

Pros of pytorch/audio

Seamless integration with PyTorch ecosystem for deep learning tasks
Extensive GPU acceleration support for faster processing
Comprehensive documentation and active community support

Cons of pytorch/audio

Steeper learning curve for users not familiar with PyTorch
More focused on deep learning applications, less versatile for general audio processing
Larger memory footprint due to PyTorch dependencies

Code Comparison

essentia:

#include <essentia/algorithmfactory.h>
#include <essentia/essentiamath.h>
using namespace essentia::standard;

// Inside main(), after essentia::init()
AlgorithmFactory& factory = AlgorithmFactory::instance();
Algorithm* mfcc = factory.create("MFCC");

pytorch/audio:

import torchaudio

waveform, sample_rate = torchaudio.load("audio.wav")
mfcc = torchaudio.transforms.MFCC(sample_rate=sample_rate)
mfcc_features = mfcc(waveform)

Both libraries offer MFCC extraction. The essentia example above uses the C++ API with its factory pattern (Python bindings are also available), while pytorch/audio is used from Python and works directly with PyTorch tensors. essentia provides more low-level control, while pytorch/audio offers a more streamlined approach for deep learning applications.


README

Essentia

Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPLv3 license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows setting up research experiments very rapidly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included in-the-box and signal processing algorithms, is easily expandable and allows for both research experiments and development of large-scale industrial applications.

Documentation online: http://essentia.upf.edu

Installation

The library is cross-platform and currently supports Linux, macOS, Windows, iOS and Android systems. Read installation instructions:

Install from master for the latest updates.

To use in Python (Linux x86_64, i686): pip install essentia or pip install essentia-tensorflow.

Docker images: https://hub.docker.com/r/mtgupf/essentia/

You can download and use prebuilt static binaries for a number of Essentia's command-line music extractors instead of installing the complete library (see doc/sphinxdoc/extractors_out_of_box.rst).

Quick start

Quick start using Python:

Command-line tools to compute common music descriptors: see doc/sphinxdoc/extractors_out_of_box.rst

Asking for help

Read frequently asked questions.

Create an issue on GitHub or open a new discussion if your question was not answered before.

Versions

Official releases: https://github.com/MTG/essentia/releases

GitHub branches:

master: latest updates; if you run into a problem, try this branch first.

If you use example extractors (located in src/examples), or your own code employing Essentia algorithms to compute descriptors, you should be aware of possible incompatibilities when using different versions of Essentia.

How to contribute

We are more than happy to collaborate and receive your contributions to Essentia. The best way to submit code is to open a pull request against our GitHub repository, following our contribution policy. By submitting your code you certify that it complies with the Developer's Certificate of Origin. For more details see: http://essentia.upf.edu/documentation/contribute.html

You are also more than welcome to suggest any improvements, including proposals for new algorithms, etc.
