Top Related Projects
librosa: Python library for audio and music analysis
aubio: a library for audio and music analysis
Essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings
Basic-pitch: a lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Spleeter: Deezer source separation library including pretrained models
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
Quick Overview
Madmom is a Python library for music information retrieval (MIR) tasks. It provides state-of-the-art algorithms for various music analysis tasks, including beat tracking, onset detection, and chord recognition. The library is designed to be efficient and easy to use, making it suitable for both research and practical applications.
Pros
- Comprehensive set of MIR algorithms and features
- Efficient implementation, suitable for real-time processing
- Well-documented with examples and tutorials
- Actively maintained and regularly updated
Cons
- Steep learning curve for beginners in MIR
- Limited support for non-Western music styles
- Dependency on external libraries may complicate installation
- Some advanced features require deep understanding of MIR concepts
Code Examples
- Beat tracking:
from madmom.features.beats import RNNBeatProcessor
from madmom.features.beats import DBNBeatTrackingProcessor
# Process the audio file
proc = RNNBeatProcessor()
act = proc('path/to/audio/file.wav')
# Track the beats
tracker = DBNBeatTrackingProcessor(fps=100)
beats = tracker(act)
print(beats)
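The beat times returned above can be turned into a rough global tempo estimate by taking the median inter-beat interval. This is a minimal plain-Python sketch for illustration only; madmom's own `TempoEstimationProcessor` does this far more robustly:

```python
def tempo_from_beats(beat_times):
    """Estimate tempo (BPM) from beat times given in seconds.

    Uses the median inter-beat interval, which tolerates a few
    spurious or missing beats better than the mean.
    """
    if len(beat_times) < 2:
        raise ValueError("need at least two beats")
    intervals = sorted(b - a for a, b in zip(beat_times, beat_times[1:]))
    median = intervals[len(intervals) // 2]
    return 60.0 / median

# Beats spaced 0.5 s apart correspond to 120 BPM
print(tempo_from_beats([0.0, 0.5, 1.0, 1.5, 2.0]))  # 120.0
```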
- Chord recognition:
from madmom.features.chords import DeepChromaChordRecognitionProcessor
# Recognize chords
proc = DeepChromaChordRecognitionProcessor()
chords = proc('path/to/audio/file.wav')
for chord in chords:
    print(f"Start: {chord[0]:.2f}, End: {chord[1]:.2f}, Chord: {chord[2]}")
- Onset detection:
from madmom.features.onsets import CNNOnsetProcessor
from madmom.features.onsets import OnsetPeakPickingProcessor
# Detect onsets
proc = CNNOnsetProcessor()
act = proc('path/to/audio/file.wav')
# Pick onset peaks
picker = OnsetPeakPickingProcessor(threshold=0.5)
onsets = picker(act)
print(onsets)
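Conceptually, the peak-picking stage selects local maxima of the activation function that exceed a threshold. The following pure-Python sketch shows the core idea only; the real `OnsetPeakPickingProcessor` additionally applies smoothing, pre/post averaging and minimum-distance constraints:

```python
def pick_peaks(activation, threshold=0.5):
    """Return indices of local maxima in `activation` above `threshold`."""
    peaks = []
    for i in range(1, len(activation) - 1):
        if (activation[i] > threshold
                and activation[i] >= activation[i - 1]
                and activation[i] > activation[i + 1]):
            peaks.append(i)
    return peaks

act = [0.1, 0.2, 0.9, 0.3, 0.1, 0.6, 0.7, 0.2]
print(pick_peaks(act))  # [2, 6]
```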
Getting Started
To get started with Madmom, follow these steps:
1. Install Madmom using pip:
pip install madmom
2. Import the required modules:
from madmom.features.beats import RNNBeatProcessor, DBNBeatTrackingProcessor
from madmom.features.chords import DeepChromaChordRecognitionProcessor
from madmom.features.onsets import CNNOnsetProcessor, OnsetPeakPickingProcessor
3. Process an audio file:
# Example: Beat tracking
proc = RNNBeatProcessor()
act = proc('path/to/audio/file.wav')
tracker = DBNBeatTrackingProcessor(fps=100)
beats = tracker(act)
print(beats)
For more detailed information and examples, refer to the official documentation at https://madmom.readthedocs.io/.
Competitor Comparisons
librosa: Python library for audio and music analysis
Pros of librosa
- More comprehensive and general-purpose audio processing library
- Extensive documentation and tutorials available
- Larger community and more frequent updates
Cons of librosa
- Slower performance for some tasks compared to madmom
- Less specialized for music information retrieval (MIR) tasks
Code Comparison
librosa:
import librosa
y, sr = librosa.load('audio.wav')
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
madmom:
from madmom.features.beats import RNNBeatProcessor
from madmom.features.tempo import TempoEstimationProcessor
# the RNN outputs a beat activation function; tempo is estimated from it
act = RNNBeatProcessor()('audio.wav')
tempo = TempoEstimationProcessor(fps=100)(act)
Both libraries offer functionality for beat tracking and tempo estimation, but madmom uses a more specialized approach with separate processors for beats and tempo. librosa provides a more straightforward, all-in-one function for beat tracking, which includes tempo estimation.
librosa is generally easier to use for beginners and offers a wider range of audio processing functions. madmom, on the other hand, is more focused on MIR tasks and may provide better performance for specific music analysis applications.
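One detail worth noting in the comparison: librosa's `beat_track` returns beat positions as frame indices, which must be converted to seconds (`librosa.frames_to_time` does this; the underlying arithmetic is simply frame times hop length divided by sample rate, sketched here for illustration):

```python
def frames_to_time(frames, sr=22050, hop_length=512):
    """Convert STFT frame indices to times in seconds.

    Defaults match librosa's standard sample rate (22050 Hz)
    and hop length (512 samples).
    """
    return [f * hop_length / sr for f in frames]

# Frame 43 at librosa's defaults is roughly one second into the file
times = frames_to_time([0, 43, 86])
print([round(t, 3) for t in times])  # [0.0, 0.998, 1.997]
```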
aubio: a library for audio and music analysis
Pros of aubio
- Lightweight and efficient C library with Python bindings
- Broader range of audio analysis tasks (pitch detection, onset detection, tempo estimation, etc.)
- Extensive command-line tools for quick audio analysis
Cons of aubio
- Less focus on machine learning-based approaches
- Smaller community and fewer recent updates
- Limited support for deep learning models
Code comparison
aubio
import aubio
# Create a pitch detection object
pitch_o = aubio.pitch("yin", 2048, 512, 44100)
# Process one hop-sized frame of float32 samples; returns the pitch in Hz
pitch = pitch_o(audio_samples)[0]
madmom
from madmom.features.beats import RNNBeatProcessor
# Create a beat detection processor
proc = RNNBeatProcessor()
# Process audio file and get beat positions
beats = proc(audio_file)
Both libraries offer audio analysis capabilities, but they differ in their approaches and focus areas. aubio provides a lightweight C library with Python bindings, suitable for various audio tasks, while madmom emphasizes machine learning-based techniques, particularly for music information retrieval tasks. aubio offers more general-purpose audio analysis tools, whereas madmom specializes in music-specific analysis using state-of-the-art algorithms.
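Since aubio's pitch detector reports frequencies in Hz, a common follow-up step is mapping them to MIDI note numbers using the standard equal-temperament formula (shown here as a self-contained illustration, not part of either library):

```python
import math

def hz_to_midi(frequency):
    """Convert a frequency in Hz to a (fractional) MIDI note number.

    Uses the standard reference of A4 = 440 Hz = MIDI note 69.
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return 69 + 12 * math.log2(frequency / 440.0)

print(round(hz_to_midi(440.0)))   # 69 (A4)
print(round(hz_to_midi(261.63)))  # 60 (middle C)
```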
Essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings
Pros of Essentia
- Broader range of audio analysis algorithms, including music and speech processing
- Extensive documentation and tutorials for easier adoption
- Supports multiple programming languages (C++, Python, JavaScript)
Cons of Essentia
- Steeper learning curve due to its extensive feature set
- Potentially slower execution for some tasks compared to Madmom's optimized algorithms
- Larger codebase and dependencies, which may increase complexity
Code Comparison
Essentia example (Python):
import essentia.standard as es
audio = es.MonoLoader(filename='audio.wav', sampleRate=44100)()
beat_tracker = es.BeatTrackerMultiFeature()
ticks, confidence = beat_tracker(audio)
Madmom example (Python):
from madmom.features.beats import RNNBeatProcessor
from madmom.features.tempo import TempoEstimationProcessor
act = RNNBeatProcessor()('audio.wav')
tempo = TempoEstimationProcessor(fps=100)(act)
Both libraries offer powerful audio analysis capabilities, but Essentia provides a more comprehensive toolkit for various audio processing tasks, while Madmom focuses on music-specific analysis with highly optimized algorithms.
Basic-pitch: a lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Pros of Basic-pitch
- More focused on pitch detection and transcription tasks
- Utilizes modern deep learning techniques for improved accuracy
- Actively maintained by Spotify, with recent updates and contributions
Cons of Basic-pitch
- Limited scope compared to Madmom's broader feature set
- Less extensive documentation and examples
- Smaller community and fewer third-party integrations
Code Comparison
Basic-pitch:
from basic_pitch.inference import predict
model_output, midi_data, note_events = predict('audio.wav')
Madmom:
from madmom.features.beats import RNNBeatProcessor
proc = RNNBeatProcessor()
beats = proc('audio.wav')
Basic-pitch focuses on pitch-related tasks, while Madmom offers a wider range of music information retrieval functionalities. Basic-pitch leverages deep learning for improved accuracy in pitch detection, but Madmom provides a more comprehensive set of tools for various music analysis tasks.
Basic-pitch benefits from active development by Spotify, ensuring up-to-date techniques and compatibility. However, Madmom has a larger community and more extensive documentation, making it easier for users to get started and find support.
The code examples demonstrate the simplicity of Basic-pitch for pitch-related tasks, while Madmom's example shows its versatility in handling different music analysis tasks like beat detection.
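As a small illustration of working with Basic-pitch's output, the note events can be summarized into per-pitch durations. This sketch assumes each event is a `(start_time, end_time, midi_pitch, ...)` tuple, which is how `note_events` are commonly structured; check the basic-pitch documentation for the exact format:

```python
def note_durations(note_events):
    """Sum the total sounding duration per MIDI pitch.

    Assumes each event starts with (start_time, end_time, midi_pitch);
    any trailing fields (e.g. amplitude) are ignored.
    """
    totals = {}
    for start, end, pitch, *_ in note_events:
        totals[pitch] = totals.get(pitch, 0.0) + (end - start)
    return totals

events = [(0.0, 0.5, 60, 0.8), (0.5, 1.0, 64, 0.7), (1.0, 1.5, 60, 0.9)]
print(note_durations(events))  # {60: 1.0, 64: 0.5}
```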
Spleeter: Deezer source separation library including pretrained models
Pros of Spleeter
- Specialized in audio source separation, particularly for isolating vocals and instruments
- Offers pre-trained models for quick and easy use
- Supports both CPU and GPU processing for faster performance
Cons of Spleeter
- Limited to source separation tasks, less versatile than Madmom
- Requires more computational resources, especially for high-quality separations
- Less suitable for real-time applications due to processing requirements
Code Comparison
Spleeter (Python):
from spleeter.separator import Separator
separator = Separator('spleeter:2stems')
separator.separate_to_file('audio_example.mp3', 'output/')
Madmom (Python):
from madmom.features.beats import RNNBeatProcessor
from madmom.features.tempo import TempoEstimationProcessor
act = RNNBeatProcessor()('audio_example.wav')
tempo = TempoEstimationProcessor(fps=100)(act)
Key Differences
- Spleeter focuses on source separation, while Madmom offers a broader range of music information retrieval tasks
- Madmom provides more low-level audio processing capabilities and is better suited for music analysis tasks
- Spleeter uses deep learning models, whereas Madmom employs various signal processing and machine learning techniques
- Madmom is more lightweight and suitable for real-time applications, while Spleeter excels in high-quality offline processing
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
Pros of crepe
- Specialized in pitch estimation, offering high accuracy for monophonic audio
- Implements a deep neural network approach, potentially providing better results in complex scenarios
- Provides a command-line interface for easy use and integration
Cons of crepe
- Limited to pitch estimation, while madmom offers a broader range of music information retrieval tasks
- May require more computational resources due to its neural network architecture
- Less extensive documentation compared to madmom
Code Comparison
madmom example (beat tracking):
from madmom.features.beats import RNNBeatProcessor
from madmom.features.beats import DBNBeatTrackingProcessor
proc = DBNBeatTrackingProcessor(fps=100)
act = RNNBeatProcessor()('audio_file.wav')
beats = proc(act)
crepe example (pitch estimation):
import crepe
from scipy.io import wavfile
sr, audio = wavfile.read('audio_file.wav')
time, frequency, confidence, activation = crepe.predict(audio, sr, viterbi=True)
Both libraries offer Python interfaces, but madmom provides a more comprehensive set of tools for various music analysis tasks, while crepe focuses specifically on pitch estimation using a deep learning approach.
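Because crepe returns a per-frame confidence alongside the frequency track, a common post-processing step is to discard low-confidence frames before further analysis. A plain-Python sketch of that filtering step:

```python
def filter_by_confidence(times, frequencies, confidences, threshold=0.8):
    """Keep only (time, frequency) pairs whose confidence meets the threshold."""
    return [(t, f) for t, f, c in zip(times, frequencies, confidences)
            if c >= threshold]

times = [0.00, 0.01, 0.02, 0.03]
freqs = [220.0, 221.0, 110.0, 219.5]
confs = [0.95, 0.90, 0.30, 0.85]
print(filter_by_confidence(times, freqs, confs))
# [(0.0, 220.0), (0.01, 221.0), (0.03, 219.5)]
```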
madmom
======
Madmom is an audio signal processing library written in Python with a strong focus on music information retrieval (MIR) tasks.
The library is internally used by the Department of Computational Perception, Johannes Kepler University, Linz, Austria (http://www.cp.jku.at) and the Austrian Research Institute for Artificial Intelligence (OFAI), Vienna, Austria (http://www.ofai.at).
Possible acronyms are:
- Madmom Analyzes Digitized Music Of Musicians
- Mostly Audio / Dominantly Music Oriented Modules
It includes reference implementations for some music information retrieval algorithms; please see the `References`_ section.
Documentation
Documentation of the package can be found online at http://madmom.readthedocs.org
License
The package has two licenses, one for source code and one for model/data files.
Source code
Unless indicated otherwise, all source code files are published under the BSD
license. For details, please see the `LICENSE <LICENSE>`_ file.
Model and data files
Unless indicated otherwise, all model and data files are distributed under the
`Creative Commons Attribution-NonCommercial-ShareAlike 4.0
<http://creativecommons.org/licenses/by-nc-sa/4.0/legalcode>`_ license.
If you want to include any of these files (or a variation or modification
thereof) or technology which utilises them in a commercial product, please
contact `Gerhard Widmer <http://www.cp.jku.at/people/widmer/>`_.
Installation
Please do not try to install from the .zip files provided by GitHub. Rather, install it from a package (if you just want to use it) or from source (if you plan to use it for development) by following the instructions below. Whichever variant you choose, please make sure that all prerequisites are installed.
Prerequisites
To install the ``madmom`` package, you must have either Python 2.7 or Python
3.5 or newer and the following packages installed:

- `numpy <http://www.numpy.org>`_
- `scipy <http://www.scipy.org>`_
- `cython <http://www.cython.org>`_
- `mido <https://github.com/olemb/mido>`_
In order to test your installation, process live audio input, or have improved FFT performance, additionally install these packages:
- `pytest <https://www.pytest.org/>`_
- `pyaudio <http://people.csail.mit.edu/hubert/pyaudio/>`_
- `pyfftw <https://github.com/pyFFTW/pyFFTW/>`_
If you need support for audio files other than ``.wav`` with a sample rate of
44.1 kHz and 16 bit depth, you need ``ffmpeg`` (``avconv`` on Ubuntu Linux has
some decoding bugs, so we advise not to use it!).
Please refer to the `requirements.txt <requirements.txt>`_ file for the
minimum required versions and make sure that these modules are up to date;
otherwise unexpected errors or incorrect computations may result!
Install from package
The instructions given here should be used if you just want to install the
package, e.g. to run the bundled programs or use some functionality for your
own project. If you intend to change anything within the ``madmom`` package,
please follow the steps in the next section.
The easiest way to install the package is via ``pip`` from the
`PyPI (Python Package Index) <https://pypi.python.org/pypi>`_::
pip install madmom
This includes the latest code and trained models and will install all dependencies automatically.
You might need higher privileges (use ``su`` or ``sudo``) to install the
package, model files and scripts globally. Alternatively, you can install the
package locally (i.e. only for you) by adding the ``--user`` argument::
pip install --user madmom
This will also install the executable programs to a common place (e.g.
``/usr/local/bin``), which should be in your ``$PATH`` already. If you
installed the package locally, the programs will be copied to a folder which
might not be included in your ``$PATH`` (e.g. ``~/Library/Python/2.7/bin``
on Mac OS X or ``~/.local/bin`` on Ubuntu Linux; ``pip`` will tell you).
Thus the programs need to be called explicitly, or you can add their install
path to your ``$PATH`` environment variable::
export PATH='path/to/scripts':$PATH
Install from source
If you plan to use the package as a developer, clone the Git repository::
git clone --recursive https://github.com/CPJKU/madmom.git
Since the pre-trained model/data files are not included in this repository but added as a Git submodule, you either have to clone the repository recursively (as shown above) or, equivalently, perform these steps::
git clone https://github.com/CPJKU/madmom.git
cd madmom
git submodule update --init --remote
Then you can simply install the package in development mode::
python setup.py develop --user
To run the included tests::
python setup.py pytest
Upgrade of existing installations
To upgrade the package, please use the same mechanism (pip vs. source) as you did for installation. If you want to change from package to source, please uninstall the package first.
Upgrade a package
Simply upgrade the package via pip::
pip install --upgrade madmom [--user]
If some of the provided programs or models changed (please refer to the
CHANGELOG) you should first uninstall the package and then reinstall::
pip uninstall madmom
pip install madmom [--user]
Upgrade from source
Simply pull the latest sources::
git pull
To update the models contained in the submodule::
git submodule update
If any of the ``.pyx`` or ``.pxd`` files changed, you have to recompile the
modules with Cython::
python setup.py build_ext --inplace
Package structure
The package has a very simple structure, divided into the following folders:
`/bin <bin>`_
  this folder includes example programs (i.e. executable algorithms)

`/docs <docs>`_
  package documentation

`/madmom <madmom>`_
  the actual Python package

`/madmom/audio <madmom/audio>`_
  low level features (e.g. audio file handling, STFT)

`/madmom/evaluation <madmom/evaluation>`_
  evaluation code

`/madmom/features <madmom/features>`_
  higher level features (e.g. onsets, beats)

`/madmom/ml <madmom/ml>`_
  machine learning stuff (e.g. RNNs, HMMs)

`/madmom/models <../../../madmom_models>`_
  pre-trained model/data files (see the License section)

`/madmom/utils <madmom/utils>`_
  misc stuff (e.g. MIDI and general file handling)

`/tests <tests>`_
  tests
Executable programs
The package includes executable programs in the `/bin <bin>`_ folder.
If you installed the package, they were copied to a common place.
All scripts can be run in different modes: in ``single`` file mode to process
a single audio file and write the output to STDOUT or the given output file::
DBNBeatTracker single [-o OUTFILE] INFILE
If multiple audio files should be processed, the scripts can also be run in
``batch`` mode to write the outputs to files with the given suffix::
DBNBeatTracker batch [-o OUTPUT_DIR] [-s OUTPUT_SUFFIX] FILES
If no output directory is given, the program writes the output files to the same location as the audio files.
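The batch-mode naming behaviour described above can be sketched with a small helper (hypothetical, not part of madmom; it only mirrors the suffix and output-directory rules stated here):

```python
import os

def output_path(infile, output_dir=None, suffix='.beats.txt'):
    """Build a batch-mode output path: same basename with the given suffix,
    written next to the input file unless an output directory is given."""
    directory = output_dir if output_dir is not None else os.path.dirname(infile)
    base, _ = os.path.splitext(os.path.basename(infile))
    return os.path.join(directory, base + suffix)

print(output_path('/music/song.wav'))              # /music/song.beats.txt
print(output_path('/music/song.wav', '/tmp/out'))  # /tmp/out/song.beats.txt
```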
Some programs can also be run in ``online`` mode, i.e. operate on live audio
signals. This requires `pyaudio <http://people.csail.mit.edu/hubert/pyaudio/>`_
to be installed::
DBNBeatTracker online [-o OUTFILE] [INFILE]
The ``pickle`` mode can be used to store the used parameters, so that
experiments can be reproduced exactly.
Please note that the program itself as well as the modes have help messages::
DBNBeatTracker -h
DBNBeatTracker single -h
DBNBeatTracker batch -h
DBNBeatTracker online -h
DBNBeatTracker pickle -h
will give different help messages.
Additional resources
Mailing list
The `mailing list <https://groups.google.com/d/forum/madmom-users>`_ should be
used to get in touch with the developers and other users.
Wiki
The wiki can be found here: https://github.com/CPJKU/madmom/wiki
FAQ
Frequently asked questions can be found here: https://github.com/CPJKU/madmom/wiki/FAQ
Citation
If you use madmom in your work, please consider citing it:
.. code-block:: latex
  @inproceedings{madmom,
    Title = {{madmom: a new Python Audio and Music Signal Processing Library}},
    Author = {B{\"o}ck, Sebastian and Korzeniowski, Filip and Schl{\"u}ter, Jan and Krebs, Florian and Widmer, Gerhard},
    Booktitle = {Proceedings of the 24th ACM International Conference on Multimedia},
    Month = {10},
    Year = {2016},
    Pages = {1174--1178},
    Address = {Amsterdam, The Netherlands},
    Doi = {10.1145/2964284.2973795}
  }
References
.. [1] Florian Eyben, Sebastian Böck, Björn Schuller and Alex Graves, Universal Onset Detection with bidirectional Long Short-Term Memory Neural Networks, Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), 2010.
.. [2] Sebastian Böck and Markus Schedl, Enhanced Beat Tracking with Context-Aware Neural Networks, Proceedings of the 14th International Conference on Digital Audio Effects (DAFx), 2011.
.. [3] Sebastian Böck and Markus Schedl, Polyphonic Piano Note Transcription with Recurrent Neural Networks, Proceedings of the 37th International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012.
.. [4] Sebastian Böck, Andreas Arzt, Florian Krebs and Markus Schedl, Online Real-time Onset Detection with Recurrent Neural Networks, Proceedings of the 15th International Conference on Digital Audio Effects (DAFx), 2012.
.. [5] Sebastian Böck, Florian Krebs and Markus Schedl, Evaluating the Online Capabilities of Onset Detection Methods, Proceedings of the 13th International Society for Music Information Retrieval Conference (ISMIR), 2012.
.. [6] Sebastian Böck and Gerhard Widmer, Maximum Filter Vibrato Suppression for Onset Detection, Proceedings of the 16th International Conference on Digital Audio Effects (DAFx), 2013.
.. [7] Sebastian Böck and Gerhard Widmer, Local Group Delay based Vibrato and Tremolo Suppression for Onset Detection, Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), 2013.
.. [8] Florian Krebs, Sebastian Böck and Gerhard Widmer, Rhythmic Pattern Modelling for Beat and Downbeat Tracking in Musical Audio, Proceedings of the 14th International Society for Music Information Retrieval Conference (ISMIR), 2013.
.. [9] Sebastian Böck, Jan Schlüter and Gerhard Widmer, Enhanced Peak Picking for Onset Detection with Recurrent Neural Networks, Proceedings of the 6th International Workshop on Machine Learning and Music (MML), 2013.
.. [10] Sebastian Böck, Florian Krebs and Gerhard Widmer, A Multi-Model Approach to Beat Tracking Considering Heterogeneous Music Styles, Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR), 2014.
.. [11] Filip Korzeniowski, Sebastian Böck and Gerhard Widmer, Probabilistic Extraction of Beat Positions from a Beat Activation Function, Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR), 2014.
.. [12] Sebastian Böck, Florian Krebs and Gerhard Widmer, Accurate Tempo Estimation based on Recurrent Neural Networks and Resonating Comb Filters, Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR), 2015.
.. [13] Florian Krebs, Sebastian Böck and Gerhard Widmer, An Efficient State Space Model for Joint Tempo and Meter Tracking, Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR), 2015.
.. [14] Sebastian Böck, Florian Krebs and Gerhard Widmer, Joint Beat and Downbeat Tracking with Recurrent Neural Networks, Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), 2016.
.. [15] Filip Korzeniowski and Gerhard Widmer, Feature Learning for Chord Recognition: The Deep Chroma Extractor, Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), 2016.
.. [16] Florian Krebs, Sebastian Böck, Matthias Dorfer and Gerhard Widmer, Downbeat Tracking Using Beat-Synchronous Features and Recurrent Networks, Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR), 2016.
.. [17] Filip Korzeniowski and Gerhard Widmer, A Fully Convolutional Deep Auditory Model for Musical Chord Recognition, Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2016.
.. [18] Filip Korzeniowski and Gerhard Widmer, Genre-Agnostic Key Classification with Convolutional Neural Networks, Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.
.. [19] Rainer Kelz, Sebastian Böck and Gerhard Widmer, Deep Polyphonic ADSR Piano Note Transcription, Proceedings of the 44th International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.
Acknowledgements
Supported by the European Commission through the
`GiantSteps project <http://www.giantsteps-project.eu>`_ (FP7 grant agreement
no. 610591) and the `Phenicx project <http://phenicx.upf.edu>`_ (FP7 grant
agreement no. 601166) as well as the
`Austrian Science Fund (FWF) <https://www.fwf.ac.at>`_ project Z159.