Top Related Projects
librosa: Python library for audio and music analysis
aubio: a library for audio and music analysis
Essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings
Basic-pitch: a lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Spleeter: Deezer source separation library including pretrained models
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
Quick Overview
Madmom is a Python library for music information retrieval (MIR) tasks. It provides state-of-the-art algorithms for various music analysis tasks, including beat tracking, onset detection, and chord recognition. The library is designed to be efficient and easy to use, making it suitable for both research and practical applications.
Pros
- Comprehensive set of MIR algorithms and features
- Efficient implementation, suitable for real-time processing
- Well-documented with examples and tutorials
- Actively maintained and regularly updated
Cons
- Steep learning curve for beginners in MIR
- Limited support for non-Western music styles
- Dependency on external libraries may complicate installation
- Some advanced features require deep understanding of MIR concepts
Code Examples
- Beat tracking:
from madmom.features.beats import RNNBeatProcessor
from madmom.features.beats import DBNBeatTrackingProcessor
# Process the audio file
proc = RNNBeatProcessor()
act = proc('path/to/audio/file.wav')
# Track the beats
tracker = DBNBeatTrackingProcessor(fps=100)
beats = tracker(act)
print(beats)
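The beat times returned above can be turned into a rough global tempo estimate by taking the median inter-beat interval. This is a minimal plain-Python sketch for illustration only; madmom's own `TempoEstimationProcessor` does this far more robustly:

```python
def tempo_from_beats(beat_times):
    """Estimate tempo (BPM) from beat times given in seconds.

    Uses the median inter-beat interval, which tolerates a few
    spurious or missing beats better than the mean.
    """
    if len(beat_times) < 2:
        raise ValueError("need at least two beats")
    intervals = sorted(b - a for a, b in zip(beat_times, beat_times[1:]))
    median = intervals[len(intervals) // 2]
    return 60.0 / median

# Beats spaced 0.5 s apart correspond to 120 BPM
print(tempo_from_beats([0.0, 0.5, 1.0, 1.5, 2.0]))  # 120.0
```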
- Chord recognition:
from madmom.features.chords import DeepChromaChordRecognitionProcessor
# Recognize chords
proc = DeepChromaChordRecognitionProcessor()
chords = proc('path/to/audio/file.wav')
for chord in chords:
    print(f"Start: {chord[0]:.2f}, End: {chord[1]:.2f}, Chord: {chord[2]}")
- Onset detection:
from madmom.features.onsets import CNNOnsetProcessor
from madmom.features.onsets import OnsetPeakPickingProcessor
# Detect onsets
proc = CNNOnsetProcessor()
act = proc('path/to/audio/file.wav')
# Pick onset peaks
picker = OnsetPeakPickingProcessor(threshold=0.5)
onsets = picker(act)
print(onsets)
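Conceptually, the peak-picking stage selects local maxima of the activation function that exceed a threshold. The following pure-Python sketch shows the core idea only; the real `OnsetPeakPickingProcessor` additionally applies smoothing, pre/post averaging and minimum-distance constraints:

```python
def pick_peaks(activation, threshold=0.5):
    """Return indices of local maxima in `activation` above `threshold`."""
    peaks = []
    for i in range(1, len(activation) - 1):
        if (activation[i] > threshold
                and activation[i] >= activation[i - 1]
                and activation[i] > activation[i + 1]):
            peaks.append(i)
    return peaks

act = [0.1, 0.2, 0.9, 0.3, 0.1, 0.6, 0.7, 0.2]
print(pick_peaks(act))  # [2, 6]
```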
Getting Started
To get started with Madmom, follow these steps:
1. Install Madmom using pip:
pip install madmom
2. Import the required modules:
from madmom.features.beats import RNNBeatProcessor, DBNBeatTrackingProcessor
from madmom.features.chords import DeepChromaChordRecognitionProcessor
from madmom.features.onsets import CNNOnsetProcessor, OnsetPeakPickingProcessor
3. Process an audio file:
# Example: Beat tracking
proc = RNNBeatProcessor()
act = proc('path/to/audio/file.wav')
tracker = DBNBeatTrackingProcessor(fps=100)
beats = tracker(act)
print(beats)
For more detailed information and examples, refer to the official documentation at https://madmom.readthedocs.io/.
Competitor Comparisons
librosa: Python library for audio and music analysis
Pros of librosa
- More comprehensive and general-purpose audio processing library
- Extensive documentation and tutorials available
- Larger community and more frequent updates
Cons of librosa
- Slower performance for some tasks compared to madmom
- Less specialized for music information retrieval (MIR) tasks
Code Comparison
librosa:
import librosa
y, sr = librosa.load('audio.wav')
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
madmom:
from madmom.features.beats import RNNBeatProcessor
from madmom.features.tempo import TempoEstimationProcessor
# the RNN outputs a beat activation function; tempo is estimated from it
act = RNNBeatProcessor()('audio.wav')
tempo = TempoEstimationProcessor(fps=100)(act)
Both libraries offer functionality for beat tracking and tempo estimation, but madmom uses a more specialized approach with separate processors for beats and tempo. librosa provides a more straightforward, all-in-one function for beat tracking, which includes tempo estimation.
librosa is generally easier to use for beginners and offers a wider range of audio processing functions. madmom, on the other hand, is more focused on MIR tasks and may provide better performance for specific music analysis applications.
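One detail worth noting in the comparison: librosa's `beat_track` returns beat positions as frame indices, which must be converted to seconds (`librosa.frames_to_time` does this; the underlying arithmetic is simply frame times hop length divided by sample rate, sketched here for illustration):

```python
def frames_to_time(frames, sr=22050, hop_length=512):
    """Convert STFT frame indices to times in seconds.

    Defaults match librosa's standard sample rate (22050 Hz)
    and hop length (512 samples).
    """
    return [f * hop_length / sr for f in frames]

# Frame 43 at librosa's defaults is roughly one second into the file
times = frames_to_time([0, 43, 86])
print([round(t, 3) for t in times])  # [0.0, 0.998, 1.997]
```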
aubio: a library for audio and music analysis
Pros of aubio
- Lightweight and efficient C library with Python bindings
- Broader range of audio analysis tasks (pitch detection, onset detection, tempo estimation, etc.)
- Extensive command-line tools for quick audio analysis
Cons of aubio
- Less focus on machine learning-based approaches
- Smaller community and fewer recent updates
- Limited support for deep learning models
Code comparison
aubio
import aubio
# Create a pitch detection object
pitch_o = aubio.pitch("yin", 2048, 512, 44100)
# Process one hop-sized frame of float32 samples; returns the pitch in Hz
pitch = pitch_o(audio_samples)[0]
madmom
from madmom.features.beats import RNNBeatProcessor
# Create a beat detection processor
proc = RNNBeatProcessor()
# Process audio file and get beat positions
beats = proc(audio_file)
Both libraries offer audio analysis capabilities, but they differ in their approaches and focus areas. aubio provides a lightweight C library with Python bindings, suitable for various audio tasks, while madmom emphasizes machine learning-based techniques, particularly for music information retrieval tasks. aubio offers more general-purpose audio analysis tools, whereas madmom specializes in music-specific analysis using state-of-the-art algorithms.
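Since aubio's pitch detector reports frequencies in Hz, a common follow-up step is mapping them to MIDI note numbers using the standard equal-temperament formula (shown here as a self-contained illustration, not part of either library):

```python
import math

def hz_to_midi(frequency):
    """Convert a frequency in Hz to a (fractional) MIDI note number.

    Uses the standard reference of A4 = 440 Hz = MIDI note 69.
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return 69 + 12 * math.log2(frequency / 440.0)

print(round(hz_to_midi(440.0)))   # 69 (A4)
print(round(hz_to_midi(261.63)))  # 60 (middle C)
```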
Essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings
Pros of Essentia
- Broader range of audio analysis algorithms, including music and speech processing
- Extensive documentation and tutorials for easier adoption
- Supports multiple programming languages (C++, Python, JavaScript)
Cons of Essentia
- Steeper learning curve due to its extensive feature set
- Potentially slower execution for some tasks compared to Madmom's optimized algorithms
- Larger codebase and dependencies, which may increase complexity
Code Comparison
Essentia example (Python):
import essentia.standard as es
audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()
beat_tracker = es.BeatTrackerMultiFeature()
ticks, confidence = beat_tracker(audio)
Madmom example (Python):
from madmom.features.beats import RNNBeatProcessor
from madmom.features.tempo import TempoEstimationProcessor
act = RNNBeatProcessor()('audio.wav')
tempo = TempoEstimationProcessor(fps=100)(act)
Both libraries offer powerful audio analysis capabilities, but Essentia provides a more comprehensive toolkit for various audio processing tasks, while Madmom focuses on music-specific analysis with highly optimized algorithms.
Basic-pitch: a lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Pros of Basic-pitch
- More focused on pitch detection and transcription tasks
- Utilizes modern deep learning techniques for improved accuracy
- Actively maintained by Spotify, with recent updates and contributions
Cons of Basic-pitch
- Limited scope compared to Madmom's broader feature set
- Less extensive documentation and examples
- Smaller community and fewer third-party integrations
Code Comparison
Basic-pitch:
from basic_pitch.inference import predict
model_output, midi_data, note_events = predict('audio.wav')
Madmom:
from madmom.features.beats import RNNBeatProcessor
proc = RNNBeatProcessor()
beats = proc('audio.wav')
Basic-pitch focuses on pitch-related tasks, while Madmom offers a wider range of music information retrieval functionalities. Basic-pitch leverages deep learning for improved accuracy in pitch detection, but Madmom provides a more comprehensive set of tools for various music analysis tasks.
Basic-pitch benefits from active development by Spotify, ensuring up-to-date techniques and compatibility. However, Madmom has a larger community and more extensive documentation, making it easier for users to get started and find support.
The code examples demonstrate the simplicity of Basic-pitch for pitch-related tasks, while Madmom's example shows its versatility in handling different music analysis tasks like beat detection.
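As a small illustration of working with Basic-pitch's output, the note events can be summarized into per-pitch durations. This sketch assumes each event is a `(start_time, end_time, midi_pitch, ...)` tuple, which is how `note_events` are commonly structured; check the basic-pitch documentation for the exact format:

```python
def note_durations(note_events):
    """Sum the total sounding duration per MIDI pitch.

    Assumes each event starts with (start_time, end_time, midi_pitch);
    any trailing fields (e.g. amplitude) are ignored.
    """
    totals = {}
    for start, end, pitch, *_ in note_events:
        totals[pitch] = totals.get(pitch, 0.0) + (end - start)
    return totals

events = [(0.0, 0.5, 60, 0.8), (0.5, 1.0, 64, 0.7), (1.0, 1.5, 60, 0.9)]
print(note_durations(events))  # {60: 1.0, 64: 0.5}
```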
Spleeter: Deezer source separation library including pretrained models
Pros of Spleeter
- Specialized in audio source separation, particularly for isolating vocals and instruments
- Offers pre-trained models for quick and easy use
- Supports both CPU and GPU processing for faster performance
Cons of Spleeter
- Limited to source separation tasks, less versatile than Madmom
- Requires more computational resources, especially for high-quality separations
- Less suitable for real-time applications due to processing requirements
Code Comparison
Spleeter (Python):
from spleeter.separator import Separator
separator = Separator('spleeter:2stems')
separator.separate_to_file('audio_example.mp3', 'output/')
Madmom (Python):
from madmom.features.beats import RNNBeatProcessor
from madmom.features.tempo import TempoEstimationProcessor
act = RNNBeatProcessor()('audio_example.wav')
tempo = TempoEstimationProcessor(fps=100)(act)
Key Differences
- Spleeter focuses on source separation, while Madmom offers a broader range of music information retrieval tasks
- Madmom provides more low-level audio processing capabilities and is better suited for music analysis tasks
- Spleeter uses deep learning models, whereas Madmom employs various signal processing and machine learning techniques
- Madmom is more lightweight and suitable for real-time applications, while Spleeter excels in high-quality offline processing
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
Pros of crepe
- Specialized in pitch estimation, offering high accuracy for monophonic audio
- Implements a deep neural network approach, potentially providing better results in complex scenarios
- Provides a command-line interface for easy use and integration
Cons of crepe
- Limited to pitch estimation, while madmom offers a broader range of music information retrieval tasks
- May require more computational resources due to its neural network architecture
- Less extensive documentation compared to madmom
Code Comparison
madmom example (beat tracking):
from madmom.features.beats import RNNBeatProcessor
from madmom.features.beats import DBNBeatTrackingProcessor
proc = DBNBeatTrackingProcessor(fps=100)
act = RNNBeatProcessor()('audio_file.wav')
beats = proc(act)
crepe example (pitch estimation):
import crepe
from scipy.io import wavfile
sr, audio = wavfile.read('audio_file.wav')
time, frequency, confidence, activation = crepe.predict(audio, sr, viterbi=True)
Both libraries offer Python interfaces, but madmom provides a more comprehensive set of tools for various music analysis tasks, while crepe focuses specifically on pitch estimation using a deep learning approach.
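Because crepe returns a per-frame confidence alongside the frequency track, a common post-processing step is to discard low-confidence frames before further analysis. A plain-Python sketch of that filtering step:

```python
def filter_by_confidence(times, frequencies, confidences, threshold=0.8):
    """Keep only (time, frequency) pairs whose confidence meets the threshold."""
    return [(t, f) for t, f, c in zip(times, frequencies, confidences)
            if c >= threshold]

times = [0.00, 0.01, 0.02, 0.03]
freqs = [220.0, 221.0, 110.0, 219.5]
confs = [0.95, 0.90, 0.30, 0.85]
print(filter_by_confidence(times, freqs, confs))
# [(0.0, 220.0), (0.01, 221.0), (0.03, 219.5)]
```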
madmom
======
Madmom is an audio signal processing library written in Python with a strong focus on music information retrieval (MIR) tasks.
The library is internally used by the Department of Computational Perception, Johannes Kepler University, Linz, Austria (http://www.cp.jku.at) and the Austrian Research Institute for Artificial Intelligence (OFAI), Vienna, Austria (http://www.ofai.at).
Possible acronyms are:
- Madmom Analyzes Digitized Music Of Musicians
- Mostly Audio / Dominantly Music Oriented Modules
It includes reference implementations for some music information retrieval algorithms; please see the `References`_ section.
Documentation
Documentation of the package can be found online at http://madmom.readthedocs.org
License
The package has two licenses, one for source code and one for model/data files.
Source code
Unless indicated otherwise, all source code files are published under the BSD
license. For details, please see the `LICENSE <LICENSE>`_ file.
Model and data files
Unless indicated otherwise, all model and data files are distributed under the
`Creative Commons Attribution-NonCommercial-ShareAlike 4.0
<http://creativecommons.org/licenses/by-nc-sa/4.0/legalcode>`_ license.
If you want to include any of these files (or a variation or modification
thereof) or technology which utilises them in a commercial product, please
contact `Gerhard Widmer <http://www.cp.jku.at/people/widmer/>`_.
Installation
Please do not try to install from the .zip files provided by GitHub. Rather, install it from a package (if you just want to use it) or from source (if you plan to use it for development) by following the instructions below. Whichever variant you choose, please make sure that all prerequisites are installed.
Prerequisites
To install the ``madmom`` package, you must have either Python 2.7 or Python
3.5 or newer and the following packages installed:

- `numpy <http://www.numpy.org>`_
- `scipy <http://www.scipy.org>`_
- `cython <http://www.cython.org>`_
- `mido <https://github.com/olemb/mido>`_
In order to test your installation, process live audio input, or have improved FFT performance, additionally install these packages:
- `pytest <https://www.pytest.org/>`_
- `pyaudio <http://people.csail.mit.edu/hubert/pyaudio/>`_
- `pyfftw <https://github.com/pyFFTW/pyFFTW/>`_
If you need support for audio files other than ``.wav`` with a sample rate of
44.1 kHz and 16 bit depth, you need ``ffmpeg`` (``avconv`` on Ubuntu Linux has
some decoding bugs, so we advise not to use it!).
Please refer to the `requirements.txt <requirements.txt>`_ file for the
minimum required versions and make sure that these modules are up to date;
otherwise unexpected errors or incorrect computations may result!
Install from package
The instructions given here should be used if you just want to install the
package, e.g. to run the bundled programs or use some functionality for your
own project. If you intend to change anything within the ``madmom`` package,
please follow the steps in the next section.
The easiest way to install the package is via ``pip`` from the
`PyPI (Python Package Index) <https://pypi.python.org/pypi>`_::
pip install madmom
This includes the latest code and trained models and will install all dependencies automatically.
You might need higher privileges (use ``su`` or ``sudo``) to install the
package, model files and scripts globally. Alternatively, you can install the
package locally (i.e. only for you) by adding the ``--user`` argument::
pip install --user madmom
This will also install the executable programs to a common place (e.g.
``/usr/local/bin``), which should be in your ``$PATH`` already. If you
installed the package locally, the programs will be copied to a folder which
might not be included in your ``$PATH`` (e.g. ``~/Library/Python/2.7/bin``
on Mac OS X or ``~/.local/bin`` on Ubuntu Linux; ``pip`` will tell you).
Thus the programs need to be called explicitly, or you can add their install
path to your ``$PATH`` environment variable::
export PATH='path/to/scripts':$PATH
Install from source
If you plan to use the package as a developer, clone the Git repository::
git clone --recursive https://github.com/CPJKU/madmom.git
Since the pre-trained model/data files are not included in this repository but added as a Git submodule, you either have to clone the repository recursively (as shown above) or, equivalently, perform these steps::
git clone https://github.com/CPJKU/madmom.git
cd madmom
git submodule update --init --remote
Then you can simply install the package in development mode::
python setup.py develop --user
To run the included tests::
python setup.py pytest
Upgrade of existing installations
To upgrade the package, please use the same mechanism (pip vs. source) as you did for installation. If you want to change from package to source, please uninstall the package first.
Upgrade a package
Simply upgrade the package via pip::
pip install --upgrade madmom [--user]
If some of the provided programs or models changed (please refer to the
CHANGELOG) you should first uninstall the package and then reinstall::
pip uninstall madmom
pip install madmom [--user]
Upgrade from source
Simply pull the latest sources::
git pull
To update the models contained in the submodule::
git submodule update
If any of the ``.pyx`` or ``.pxd`` files changed, you have to recompile the
modules with Cython::
python setup.py build_ext --inplace
Package structure
The package has a very simple structure, divided into the following folders:
`/bin <bin>`_
  this folder includes example programs (i.e. executable algorithms)

`/docs <docs>`_
  package documentation

`/madmom <madmom>`_
  the actual Python package

`/madmom/audio <madmom/audio>`_
  low level features (e.g. audio file handling, STFT)

`/madmom/evaluation <madmom/evaluation>`_
  evaluation code

`/madmom/features <madmom/features>`_
  higher level features (e.g. onsets, beats)

`/madmom/ml <madmom/ml>`_
  machine learning stuff (e.g. RNNs, HMMs)

`/madmom/models <../../../madmom_models>`_
  pre-trained model/data files (see the License section)

`/madmom/utils <madmom/utils>`_
  misc stuff (e.g. MIDI and general file handling)

`/tests <tests>`_
  tests
Executable programs
The package includes executable programs in the `/bin <bin>`_ folder.
If you installed the package, they were copied to a common place.
All scripts can be run in different modes: in ``single`` file mode to process
a single audio file and write the output to STDOUT or the given output file::
DBNBeatTracker single [-o OUTFILE] INFILE
If multiple audio files should be processed, the scripts can also be run in
``batch`` mode to write the outputs to files with the given suffix::
DBNBeatTracker batch [-o OUTPUT_DIR] [-s OUTPUT_SUFFIX] FILES
If no output directory is given, the program writes the output files to the same location as the audio files.
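The batch-mode naming behaviour described above can be sketched with a small helper (hypothetical, not part of madmom; it only mirrors the suffix and output-directory rules stated here):

```python
import os

def output_path(infile, output_dir=None, suffix='.beats.txt'):
    """Build a batch-mode output path: same basename with the given suffix,
    written next to the input file unless an output directory is given."""
    directory = output_dir if output_dir is not None else os.path.dirname(infile)
    base, _ = os.path.splitext(os.path.basename(infile))
    return os.path.join(directory, base + suffix)

print(output_path('/music/song.wav'))              # /music/song.beats.txt
print(output_path('/music/song.wav', '/tmp/out'))  # /tmp/out/song.beats.txt
```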
Some programs can also be run in ``online`` mode, i.e. operate on live audio
signals. This requires `pyaudio <http://people.csail.mit.edu/hubert/pyaudio/>`_
to be installed::
DBNBeatTracker online [-o OUTFILE] [INFILE]
The ``pickle`` mode can be used to store the used parameters, so that
experiments can be reproduced exactly.
Please note that the program itself as well as the modes have help messages::
DBNBeatTracker -h
DBNBeatTracker single -h
DBNBeatTracker batch -h
DBNBeatTracker online -h
DBNBeatTracker pickle -h
will give different help messages.
Additional resources
Mailing list
The `mailing list <https://groups.google.com/d/forum/madmom-users>`_ should be
used to get in touch with the developers and other users.
Wiki
The wiki can be found here: https://github.com/CPJKU/madmom/wiki
FAQ
Frequently asked questions can be found here: https://github.com/CPJKU/madmom/wiki/FAQ
Citation
If you use madmom in your work, please consider citing it:
.. code-block:: latex
  @inproceedings{madmom,
    Title = {{madmom: a new Python Audio and Music Signal Processing Library}},
    Author = {B{\"o}ck, Sebastian and Korzeniowski, Filip and Schl{\"u}ter, Jan and Krebs, Florian and Widmer, Gerhard},
    Booktitle = {Proceedings of the 24th ACM International Conference on Multimedia},
    Month = {10},
    Year = {2016},
    Pages = {1174--1178},
    Address = {Amsterdam, The Netherlands},
    Doi = {10.1145/2964284.2973795}
  }
References
.. [1] Florian Eyben, Sebastian Böck, Björn Schuller and Alex Graves, Universal Onset Detection with bidirectional Long Short-Term Memory Neural Networks, Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), 2010.
.. [2] Sebastian Böck and Markus Schedl, Enhanced Beat Tracking with Context-Aware Neural Networks, Proceedings of the 14th International Conference on Digital Audio Effects (DAFx), 2011.
.. [3] Sebastian Böck and Markus Schedl, Polyphonic Piano Note Transcription with Recurrent Neural Networks, Proceedings of the 37th International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012.
.. [4] Sebastian Böck, Andreas Arzt, Florian Krebs and Markus Schedl, Online Real-time Onset Detection with Recurrent Neural Networks, Proceedings of the 15th International Conference on Digital Audio Effects (DAFx), 2012.
.. [5] Sebastian Böck, Florian Krebs and Markus Schedl, Evaluating the Online Capabilities of Onset Detection Methods, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
.. [6] Sebastian Böck and Gerhard Widmer, Maximum Filter Vibrato Suppression for Onset Detection, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.
.. [7] Sebastian Böck and Gerhard Widmer, Local Group Delay based Vibrato and Tremolo Suppression for Onset Detection, Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), 2013.
.. [8] Florian Krebs, Sebastian Böck and Gerhard Widmer, Rhythmic Pattern Modelling for Beat and Downbeat Tracking in Musical Audio, Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), 2013.
.. [9] Sebastian Böck, Jan Schlüter and Gerhard Widmer, Enhanced Peak Picking for Onset Detection with Recurrent Neural Networks, Proceedings of the 6th International Workshop on Machine Learning and Music (MML), 2013.
.. [10] Sebastian Böck, Florian Krebs and Gerhard Widmer, A Multi-Model Approach to Beat Tracking Considering Heterogeneous Music Styles, Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR), 2014.
.. [11] Filip Korzeniowski, Sebastian Böck and Gerhard Widmer, Probabilistic Extraction of Beat Positions from a Beat Activation Function, Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR), 2014.
.. [12] Sebastian Böck, Florian Krebs and Gerhard Widmer, Accurate Tempo Estimation based on Recurrent Neural Networks and Resonating Comb Filters, Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR), 2015.
.. [13] Florian Krebs, Sebastian Böck and Gerhard Widmer, An Efficient State Space Model for Joint Tempo and Meter Tracking, Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR), 2015.
.. [14] Sebastian Böck, Florian Krebs and Gerhard Widmer, Joint Beat and Downbeat Tracking with Recurrent Neural Networks, Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), 2016.
.. [15] Filip Korzeniowski and Gerhard Widmer, Feature Learning for Chord Recognition: The Deep Chroma Extractor, Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), 2016.
.. [16] Florian Krebs, Sebastian Böck, Matthias Dorfer and Gerhard Widmer, Downbeat Tracking Using Beat-Synchronous Features and Recurrent Networks, Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), 2016.
.. [17] Filip Korzeniowski and Gerhard Widmer, A Fully Convolutional Deep Auditory Model for Musical Chord Recognition, Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2016.
.. [18] Filip Korzeniowski and Gerhard Widmer, Genre-Agnostic Key Classification with Convolutional Neural Networks, Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.
.. [19] Rainer Kelz, Sebastian Böck and Gerhard Widmer, Deep Polyphonic ADSR Piano Note Transcription, Proceedings of the 44th International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.
Acknowledgements
Supported by the European Commission through the
`GiantSteps project <http://www.giantsteps-project.eu>`_ (FP7 grant agreement
no. 610591) and the `Phenicx project <http://phenicx.upf.edu>`_ (FP7 grant
agreement no. 601166) as well as the
`Austrian Science Fund (FWF) <https://www.fwf.ac.at>`_ project Z159.