MTG / essentia

C++ library for audio and music analysis, description and synthesis, including Python bindings

Top Related Projects

  • librosa (7,088): Python library for audio and music analysis
  • aubio (3,291): a library for audio and music analysis
  • Annoy (13,167): Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
  • Magenta (19,130): Music and Art Generation with Machine Intelligence
  • pytorch/audio (2,506): Data manipulation and transformation for audio signal processing, powered by PyTorch

Quick Overview

Essentia is an open-source C++ library for audio analysis and music information retrieval. It provides a comprehensive set of algorithms for extracting features from audio signals, including spectral, temporal, and high-level descriptors. Essentia is designed to be efficient, modular, and easy to use, making it suitable for both research and commercial applications.

Pros

  • Extensive collection of audio analysis algorithms and music information retrieval tools
  • High performance and efficiency due to C++ implementation
  • Cross-platform compatibility (Linux, macOS, Windows)
  • Python bindings for easier integration and prototyping

Cons

  • Steep learning curve for beginners due to its comprehensive nature
  • Limited documentation for some advanced features
  • Requires C++ knowledge for optimal usage and customization
  • Dependency management can be challenging for some users

Code Examples

  1. Loading an audio file and computing the spectrum of one frame:

    import essentia.standard as es

    # Load the audio file as mono, resampled to 44.1 kHz
    audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()

    # Spectrum works on a short, even-length frame rather than the whole signal
    frame = audio[:2048]
    w = es.Windowing(type='hann')
    spectrum = es.Spectrum()
    spec = spectrum(w(frame))

  2. Extracting the beat positions from an audio file:

    import essentia.standard as es

    # Load audio file
    audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()

    # Extract beat positions
    rhythm_extractor = es.RhythmExtractor2013()
    bpm, beats, beats_confidence, _, beats_intervals = rhythm_extractor(audio)

    print(f"BPM: {bpm}")
    print(f"Beat positions: {beats}")

  3. Computing MFCCs (Mel-Frequency Cepstral Coefficients):

    import essentia.standard as es

    # Load audio file
    audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()

    # Compute MFCCs frame by frame
    window = es.Windowing(type='hann')
    spectrum = es.Spectrum()
    # A 1024-sample frame yields 513 spectrum bins, so MFCC is told to expect that size
    mfcc = es.MFCC(inputSize=513)

    mfccs = []
    for frame in es.FrameGenerator(audio, frameSize=1024, hopSize=512):
        spec = spectrum(window(frame))
        mfcc_bands, mfcc_coeffs = mfcc(spec)
        mfccs.append(mfcc_coeffs)
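
The frameSize=1024 used above determines how many spectrum bins each frame produces and which frequency each bin represents. The following is a minimal sketch in plain Python, assuming a real-input FFT that returns frameSize/2 + 1 magnitude bins; the helper names are hypothetical and are not part of Essentia.

```python
def spectrum_size(frame_size):
    """Number of magnitude bins for a real FFT of an even-length frame."""
    if frame_size % 2 != 0:
        raise ValueError("frame_size must be even")
    return frame_size // 2 + 1


def bin_frequency(bin_index, frame_size, sample_rate=44100):
    """Frequency in Hz that a given FFT bin corresponds to."""
    return bin_index * sample_rate / frame_size


n_bins = spectrum_size(1024)
print(n_bins)                            # 513
print(bin_frequency(n_bins - 1, 1024))   # 22050.0
```

The last bin sits at half the sample rate, and doubling the frame size halves the spacing between bins at the cost of coarser time resolution.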

Getting Started

To get started with Essentia, follow these steps:

  1. Install Essentia using pip:

    pip install essentia
    
  2. Import the library in your Python script:

    import essentia.standard as es
    
  3. Load an audio file and perform basic analysis:

    # Load audio file
    audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()
    
    # Compute basic features
    duration = len(audio) / 44100
    loudness = es.Loudness()(audio)  # Stevens' power-law loudness, not decibels
    # Pitch is estimated per frame; only the first 2048 samples are analysed here
    frame = audio[:2048]
    pitch, confidence = es.PitchYinFFT()(es.Spectrum()(es.Windowing(type='hann')(frame)))
    
    print(f"Duration: {duration:.2f} seconds")
    print(f"Loudness: {loudness:.2f}")
    print(f"Pitch: {pitch:.2f} Hz (confidence: {confidence:.2f})")
    

For more advanced usage and detailed documentation, refer to the official Essentia website and GitHub repository.
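
To make the duration and loudness values from step 3 less opaque, here is a rough plain-Python sketch of the arithmetic behind them. It assumes the description in Essentia's documentation of Loudness as Stevens' power law (signal energy raised to the power 0.67); the function names are hypothetical, and the library's own implementation should be treated as the reference.

```python
def duration_seconds(num_samples, sample_rate=44100):
    """Length of a signal in seconds."""
    return num_samples / sample_rate


def stevens_loudness(samples, exponent=0.67):
    """Signal energy (sum of squares) raised to a fixed power."""
    energy = sum(x * x for x in samples)
    return energy ** exponent


print(duration_seconds(88200))       # 2.0
print(stevens_loudness([1.0]))       # 1.0
```

Because the result is a power of the raw energy, it grows with the length of the signal and is not expressed in decibels.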

Competitor Comparisons

7,088

Python library for audio and music analysis

Pros of librosa

  • Easier to install and use, with fewer dependencies
  • More Pythonic API and better integration with NumPy and SciPy
  • Extensive documentation and tutorials for beginners

Cons of librosa

  • Slower performance for some operations compared to Essentia
  • Limited support for real-time processing
  • Fewer advanced audio analysis features

Code Comparison

librosa:

import librosa

y, sr = librosa.load('audio.wav')
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
mfcc = librosa.feature.mfcc(y=y, sr=sr)

Essentia:

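import essentia.standard as es

audio = es.MonoLoader(filename='audio.wav')()
# RhythmExtractor2013 returns five values; only BPM and beat times are kept
bpm, beats, _, _, _ = es.RhythmExtractor2013()(audio)

# MFCC takes a magnitude spectrum, not raw audio (first 2048-sample frame shown)
spec = es.Spectrum()(es.Windowing(type='hann')(audio[:2048]))
mel_bands, mfcc = es.MFCC()(spec)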

Both libraries offer similar functionality for basic audio analysis tasks, but Essentia provides more low-level control and advanced features. librosa is generally more user-friendly and better suited for quick prototyping and research, while Essentia is more powerful for complex audio processing tasks and real-time applications.
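
One practical difference when comparing the two outputs: librosa's beat_track returns beat positions as frame indices by default, while Essentia's RhythmExtractor2013 returns beat times in seconds. The sketch below shows the conversion in plain Python, assuming librosa's default sample rate of 22050 Hz and hop length of 512. The helper names are hypothetical; librosa itself provides librosa.frames_to_time for this.

```python
def frames_to_seconds(frames, sample_rate=22050, hop_length=512):
    """Convert frame indices to times in seconds."""
    return [f * hop_length / sample_rate for f in frames]


def mean_bpm(beat_times):
    """Average tempo implied by consecutive beat times (in seconds)."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beats")
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    return 60.0 * len(intervals) / sum(intervals)


print(mean_bpm([0.0, 0.5, 1.0, 1.5]))   # 120.0
```

Once both beat lists are in seconds they can be compared directly, for example by matching beats within a small tolerance window.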

3,291

a library for audio and music analysis

Pros of aubio

  • Lightweight and efficient, with a focus on real-time processing
  • Supports multiple programming languages through bindings
  • Extensive command-line tools for quick audio analysis

Cons of aubio

  • Smaller feature set compared to Essentia
  • Less active development and community support
  • Limited documentation and examples for advanced use cases

Code Comparison

Essentia:

import essentia.standard as es

audio = es.MonoLoader(filename='audio.wav')()
# BeatTrackerMultiFeature returns the beat times and a confidence value
beats, confidence = es.BeatTrackerMultiFeature()(audio)

aubio:

import aubio

source = aubio.source('audio.wav')
tempo = aubio.tempo("default", 1024, 512, source.samplerate)

beats = []
while True:
    samples, read = source()
    is_beat = tempo(samples)
    if is_beat:
        beats.append(tempo.get_last_s())
    if read < source.hop_size:
        break

Both libraries offer beat tracking functionality, but Essentia provides a more straightforward API for this task. aubio requires more manual setup and iteration, which can offer greater flexibility but may be less convenient for simple use cases.
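
The hop-by-hop loop in the aubio example is a general streaming pattern: read a fixed-size block, analyse it, and stop when the source runs short. The following plain-Python sketch loosely imitates that pattern on an in-memory list. The function names are hypothetical, and the end-of-stream behaviour is a simplification rather than a description of aubio's internals.

```python
def iter_hops(samples, hop_size):
    """Yield (block, n_read) pairs, zero-padding the final short block."""
    if hop_size <= 0:
        raise ValueError("hop_size must be positive")
    for start in range(0, len(samples), hop_size):
        block = list(samples[start:start + hop_size])
        n_read = len(block)
        block.extend([0.0] * (hop_size - n_read))
        yield block, n_read


def block_peaks(samples, hop_size):
    """Toy per-block analysis: peak absolute amplitude of each block."""
    return [max(abs(x) for x in block) for block, _ in iter_hops(samples, hop_size)]


signal = [0.0, 0.5, -1.0, 0.25, 0.1]
print([n for _, n in iter_hops(signal, 2)])   # [2, 2, 1]
print(block_peaks(signal, 2))                 # [0.5, 1.0, 0.1]
```

Processing one block at a time keeps memory use constant, which is what makes this style suitable for real-time input.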

13,167

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Pros of Annoy

  • Specialized for approximate nearest neighbor search, making it highly efficient for this specific task
  • Lightweight and easy to integrate into existing projects
  • Supports multiple distance metrics (Euclidean, Manhattan, Cosine, etc.)

Cons of Annoy

  • Limited to a single specific task, unlike Essentia's broader audio analysis capabilities
  • Less suitable for complex audio processing tasks or feature extraction
  • Smaller community and fewer contributors compared to Essentia

Code Comparison

Annoy (C++ with Python bindings):

import random

from annoy import AnnoyIndex

f = 40  # vector dimensionality (example value)
t = AnnoyIndex(f, 'angular')
for i in range(1000):
    v = [random.gauss(0, 1) for _ in range(f)]
    t.add_item(i, v)
t.build(10)  # build 10 trees

Essentia (C++ with Python bindings):

import essentia.standard as es

audio = es.MonoLoader(filename='audio.wav')()
w = es.Windowing(type='hann')
spectrum = es.Spectrum()
mfcc = es.MFCC()

for frame in es.FrameGenerator(audio, frameSize=1024, hopSize=512):
    mfcc_bands, mfcc_coeffs = mfcc(spectrum(w(frame)))

This comparison highlights the specialized nature of Annoy for nearest neighbor search, while Essentia offers a broader range of audio analysis tools. The code examples demonstrate Annoy's focus on indexing and searching, versus Essentia's audio processing capabilities.
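
The two tools are often complementary: feature vectors such as per-track MFCC summaries can be indexed for similarity search. As an illustration of what an index like Annoy approximates, here is an exact brute-force nearest-neighbour search in plain Python using cosine distance. The function names and toy vectors are hypothetical; this is not Annoy's algorithm, which trades exactness for speed on large collections.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, items, k=1):
    """Indices of the k items closest to query (exact, linear scan)."""
    ranked = sorted(range(len(items)),
                    key=lambda i: cosine_distance(query, items[i]))
    return ranked[:k]


tracks = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(nearest([1.0, 0.05], tracks, k=2))   # [0, 2]
```

A linear scan like this costs one distance computation per stored vector for every query, which is the cost approximate indexes are designed to avoid.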

19,130

Magenta: Music and Art Generation with Machine Intelligence

Pros of Magenta

  • Focuses on machine learning for music and art generation
  • Integrates well with TensorFlow and other Google AI tools
  • Offers pre-trained models for quick experimentation

Cons of Magenta

  • Narrower scope, primarily for creative AI applications
  • Steeper learning curve for those not familiar with machine learning
  • Less comprehensive audio analysis capabilities

Code Comparison

Magenta (Python):

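from magenta.music import midi_io

# midi_file is a path to a MIDI file; module layout follows older Magenta releases
sequence = midi_io.midi_file_to_sequence_proto(midi_file)
notes = sequence.notes  # the notes are stored on the NoteSequence itself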

Essentia (C++):

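#include <essentia/algorithmfactory.h>
#include <essentia/essentiamath.h>

using namespace essentia::standard;

// Inside main(), after essentia::init(); audiofile is a std::string path
AlgorithmFactory& factory = AlgorithmFactory::instance();
Algorithm* loader = factory.create("MonoLoader", "filename", audiofile);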

Key Differences

Magenta is tailored for AI-driven music creation and artistic applications, while Essentia provides a broader set of audio analysis tools. Magenta leverages machine learning techniques, particularly with TensorFlow, whereas Essentia focuses on signal processing and feature extraction from audio.

Essentia offers more low-level control and a wider range of audio analysis algorithms, making it suitable for various audio processing tasks. Magenta, on the other hand, excels in generative tasks and creative applications of AI in music and art.

2,506

Data manipulation and transformation for audio signal processing, powered by PyTorch

Pros of pytorch/audio

  • Seamless integration with PyTorch ecosystem for deep learning tasks
  • Extensive GPU acceleration support for faster processing
  • Comprehensive documentation and active community support

Cons of pytorch/audio

  • Steeper learning curve for users not familiar with PyTorch
  • More focused on deep learning applications, less versatile for general audio processing
  • Larger memory footprint due to PyTorch dependencies

Code Comparison

essentia:

#include <essentia/algorithmfactory.h>
#include <essentia/essentiamath.h>

using namespace essentia::standard;

// Inside main(), after essentia::init()
AlgorithmFactory& factory = AlgorithmFactory::instance();
Algorithm* mfcc = factory.create("MFCC");

pytorch/audio:

import torchaudio

waveform, sample_rate = torchaudio.load("audio.wav")
mfcc = torchaudio.transforms.MFCC(sample_rate=sample_rate)
mfcc_features = mfcc(waveform)

Both libraries offer MFCC extraction, but essentia uses C++ with a factory pattern, while pytorch/audio leverages Python and integrates seamlessly with PyTorch tensors. essentia provides more low-level control, while pytorch/audio offers a more streamlined approach for deep learning applications.

README

Essentia

Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPLv3 license. It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows setting up research experiments very rapidly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included out of the box and the signal processing algorithms, is easily expandable and allows for both research experiments and development of large-scale industrial applications.
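
As a rough illustration of what "statistical characterization of data" can mean in practice, frame-wise descriptors (for example one MFCC vector per frame) are commonly summarized per dimension before further use. The sketch below does this in plain Python with the standard library; the function name and the toy data are hypothetical and do not reproduce any particular Essentia algorithm.

```python
from statistics import fmean, pvariance


def summarize_frames(frames):
    """Per-dimension mean and population variance of equal-length vectors."""
    if not frames:
        raise ValueError("frames must not be empty")
    columns = list(zip(*frames))
    return {
        "mean": [fmean(c) for c in columns],
        "var": [pvariance(c) for c in columns],
    }


frames = [[1.0, 10.0], [3.0, 10.0]]
summary = summarize_frames(frames)
print(summary["mean"])   # [2.0, 10.0]
print(summary["var"])    # [1.0, 0.0]
```

Summaries like these turn a variable-length sequence of frames into a fixed-length vector, which is convenient for comparing tracks of different durations.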

Documentation online: http://essentia.upf.edu

Installation

The library is cross-platform and currently supports Linux, macOS, Windows, iOS and Android systems. Read the installation instructions in the online documentation.

Install from master for the latest updates.

To use in Python (Linux x86_64, i686): pip install essentia or pip install essentia-tensorflow.
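
After installing, it can be useful to confirm that the package is visible to the interpreter before running any analysis code. The sketch below uses only the standard library; has_module is a hypothetical helper name, and the check assumes the top-level module is called essentia, as in the import statements used throughout this page.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if has_module("essentia"):
    print("essentia is available")
else:
    print("essentia not found; try: pip install essentia")
```

The check only locates the module; it does not verify that the compiled extension loads correctly on the current platform.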

Docker images: https://hub.docker.com/r/mtgupf/essentia/

You can download and use prebuilt static binaries for a number of Essentia's command-line music extractors instead of installing the complete library.

Quick start

Quick start using Python:

Command-line tools to compute common music descriptors:

Asking for help

Read frequently asked questions.

Create an issue on GitHub or open a new discussion if your question has not been answered before.

Versions

Official releases: https://github.com/MTG/essentia/releases

GitHub branches:

  • master: latest updates; if you run into a problem, try it first.

If you use example extractors (located in src/examples), or your own code employing Essentia algorithms to compute descriptors, you should be aware of possible incompatibilities when using different versions of Essentia.
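
One way to guard against such incompatibilities is to store the library version next to every set of computed descriptors, so results from different versions are never mixed unknowingly. The following is a minimal sketch using the standard library; the function names and the extractor_version key are hypothetical, and it assumes the package was installed under the distribution name essentia.

```python
from importlib import metadata


def package_version(name):
    """Installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def tag_descriptors(descriptors, package="essentia"):
    """Return a copy of the descriptors with the package version attached."""
    tagged = dict(descriptors)
    tagged["extractor_version"] = package_version(package)
    return tagged


print(tag_descriptors({"bpm": 120.0}))
```

When loading stored results later, comparing the recorded version against the currently installed one gives an early warning that descriptors may need to be recomputed.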

How to contribute

We are more than happy to collaborate and receive your contributions to Essentia. The best way to submit your code is to create a pull request to our GitHub repository following our contribution policy. By submitting your code you certify that it complies with the Developer's Certificate of Origin. For more details see: http://essentia.upf.edu/documentation/contribute.html

You are also more than welcome to suggest any improvements, including proposals for new algorithms, etc.