pyAudioAnalysis
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Top Related Projects
- librosa: Python library for audio and music analysis
- aubio: a library for audio and music analysis
- essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings
- madmom: Python audio and music signal processing library
- Magenta: Music and Art Generation with Machine Intelligence
Quick Overview
pyAudioAnalysis is a Python library for audio feature extraction, classification, segmentation, and application development. It provides a wide range of audio analysis functionalities, including speech recognition, music information retrieval, and audio event detection. The library is designed to be user-friendly and suitable for both researchers and developers working with audio data.
Pros
- Comprehensive set of audio analysis tools and features
- Easy-to-use API with well-documented functions
- Supports various audio file formats and real-time audio processing
- Includes pre-trained models for common audio classification tasks
Cons
- Dependency on older versions of some libraries, which may cause compatibility issues
- Limited support for deep learning-based audio analysis techniques
- Some features may require additional setup or external dependencies
- Documentation could be more extensive for advanced use cases
Code Examples
- Feature extraction from an audio file (a short plotting sketch follows this list):

```python
from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

# read the WAV file and extract short-term features using 50 ms windows with a 25 ms step
[Fs, x] = audioBasicIO.read_audio_file("sample.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
```
- Audio classification:
```python
from pyAudioAnalysis import audioTrainTest as aT

aT.file_classification("unknown_audio.wav", "model_file", "svm")
```
- Audio segmentation:
```python
from pyAudioAnalysis import audioSegmentation as aS

[flags, classes, segments] = aS.mid_term_file_classification("long_audio.wav", "model_file", "svm", True)
```
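As a follow-up to the first example above, here is a minimal sketch of how the extracted features might be visualized (an illustrative addition, assuming matplotlib is installed; it reuses `F` and `f_names` from the feature-extraction snippet and the 25 ms step passed to `feature_extraction`):

```python
import matplotlib.pyplot as plt

step_sec = 0.025  # the short-term step used in feature_extraction above
time_axis = [i * step_sec for i in range(F.shape[1])]  # one point per short-term frame

# plot the first extracted feature over time; its name is given by f_names[0]
plt.plot(time_axis, F[0, :])
plt.xlabel("time (s)")
plt.ylabel(f_names[0])
plt.show()
```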
Getting Started
To get started with pyAudioAnalysis, follow these steps:
1. Install the library using pip:

```bash
pip install pyAudioAnalysis
```

2. Import the necessary modules in your Python script:

```python
from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures
from pyAudioAnalysis import audioTrainTest as aT
from pyAudioAnalysis import audioSegmentation as aS
```

3. Use the library's functions to analyze your audio files or streams. For example, to extract features from an audio file:

```python
[Fs, x] = audioBasicIO.read_audio_file("your_audio_file.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
```
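As a quick check (a minimal sketch using only the variables produced in step 3), you can inspect the returned objects: `F` is a matrix with one row per short-term feature and one column per analysis frame, and `f_names` lists the corresponding feature names.

```python
# F has shape (num_features, num_frames); f_names is a list of feature names
print("feature matrix shape:", F.shape)
print("first few feature names:", f_names[:5])
```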
For more detailed instructions and examples, refer to the library's documentation and GitHub repository.
Competitor Comparisons
librosa: Python library for audio and music analysis
Pros of librosa
- More comprehensive and feature-rich library for music and audio analysis
- Better documentation and community support
- Actively maintained with frequent updates
Cons of librosa
- Steeper learning curve for beginners
- May be overkill for simple audio processing tasks
- Slower performance for some operations compared to pyAudioAnalysis
Code Comparison
pyAudioAnalysis:
```python
from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

# note: recent pyAudioAnalysis versions expose short-term feature extraction via
# ShortTermFeatures (formerly audioFeatureExtraction.stFeatureExtraction)
[Fs, x] = audioBasicIO.read_audio_file("sample.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
```
librosa:
```python
import librosa

y, sr = librosa.load("sample.wav")
mfccs = librosa.feature.mfcc(y=y, sr=sr)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)
```
Both libraries offer audio file reading and feature extraction capabilities. pyAudioAnalysis provides a more straightforward approach for basic audio analysis, while librosa offers more advanced and customizable features for music and audio processing.
aubio: a library for audio and music analysis
Pros of aubio
- Written in C, offering better performance for low-level audio processing tasks
- Provides a wider range of audio analysis algorithms, including pitch detection, onset detection, and beat tracking
- Offers bindings for multiple programming languages, including Python, making it versatile for different development environments
Cons of aubio
- Steeper learning curve due to its lower-level implementation and more complex API
- Less focus on machine learning-based audio analysis compared to pyAudioAnalysis
- Requires compilation and may have more complex setup process, especially on certain platforms
Code Comparison
pyAudioAnalysis example:
```python
from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

[Fs, x] = audioBasicIO.read_audio_file("example.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
```
aubio example:
import aubio
source = aubio.source("example.wav")
pitch_o = aubio.pitch("yin", samplerate=source.samplerate)
pitches = []
for frame in source:
pitch = pitch_o(frame)[0]
pitches.append(pitch)
essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings
Pros of essentia
- More comprehensive library with a wider range of audio analysis algorithms
- Better performance and efficiency, especially for large-scale audio processing
- Supports multiple programming languages (C++, Python, JavaScript)
Cons of essentia
- Steeper learning curve due to its complexity and extensive feature set
- Requires more setup and configuration compared to pyAudioAnalysis
- Less focused on high-level audio analysis tasks
Code comparison
pyAudioAnalysis:
```python
from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

[Fs, x] = audioBasicIO.read_audio_file("example.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
```
essentia:
```python
import essentia.standard as es

audio = es.MonoLoader(filename='example.wav')()
w = es.Windowing(type='hann')
spectrum = es.Spectrum()
mfcc = es.MFCC()
for frame in es.FrameGenerator(audio, frameSize=1024, hopSize=512):
    mfcc_bands, mfcc_coeffs = mfcc(spectrum(w(frame)))
```
Both libraries offer audio analysis capabilities, but essentia provides more flexibility and advanced features at the cost of increased complexity. pyAudioAnalysis is more straightforward for basic audio analysis tasks, while essentia is better suited for more complex and large-scale audio processing projects.
madmom: Python audio and music signal processing library
Pros of madmom
- More focused on music information retrieval (MIR) tasks
- Offers deep learning-based algorithms for advanced audio analysis
- Provides specialized tools for beat tracking and tempo estimation
Cons of madmom
- Steeper learning curve due to its more specialized nature
- Less comprehensive documentation compared to pyAudioAnalysis
- Fewer general-purpose audio analysis features
Code Comparison
madmom example:
```python
from madmom.features.beats import RNNBeatProcessor

# RNNBeatProcessor returns a frame-wise beat activation function
# (combine it with a tracking processor such as DBNBeatTrackingProcessor to obtain beat times)
processor = RNNBeatProcessor()
beat_activations = processor(audio_file)
```
pyAudioAnalysis example:
```python
from pyAudioAnalysis import audioBasicIO as aIO
from pyAudioAnalysis import ShortTermFeatures as aF

[Fs, x] = aIO.read_audio_file(audio_file)
F, f_names = aF.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
```
madmom is more specialized for music-related tasks, offering advanced algorithms for beat detection and tempo estimation. It's particularly useful for MIR applications but may have a steeper learning curve.
pyAudioAnalysis provides a broader range of audio analysis features and is generally easier to use for beginners. It's more suitable for general-purpose audio analysis tasks but may lack some of the advanced music-specific functionalities offered by madmom.
Choose madmom for music-focused projects requiring advanced MIR capabilities, and pyAudioAnalysis for general audio analysis tasks or when ease of use is a priority.
Magenta: Music and Art Generation with Machine Intelligence
Pros of Magenta
- Broader scope, covering music and art generation with machine learning
- More active development and larger community support
- Extensive documentation and tutorials for various use cases
Cons of Magenta
- Steeper learning curve due to its complexity and breadth
- Requires more computational resources for many tasks
- Less focused on audio analysis compared to pyAudioAnalysis
Code Comparison
pyAudioAnalysis example:
```python
from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

[Fs, x] = audioBasicIO.read_audio_file("example.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
```
Magenta example:
```python
import note_seq
from note_seq.protobuf import music_pb2

# build a NoteSequence containing a single half-second middle-C note
sequence = music_pb2.NoteSequence()
note = sequence.notes.add()
note.start_time = 0
note.end_time = 0.5
note.pitch = 60
note.velocity = 80
```
Both repositories offer powerful tools for audio and music processing, but they serve different purposes. pyAudioAnalysis focuses on audio feature extraction and classification, while Magenta provides a broader set of tools for music generation and creative applications using machine learning.
README
A Python library for audio feature extraction, classification, segmentation and applications
This is general info. Click here for the complete wiki and here for a more generic intro to audio data handling.
News
- [2022-01-01] If you are not interested in training audio models from your own data, you can check the Deep Audio API, where you can directly send audio data and receive predictions regarding the respective audio content (speech vs silence, musical genre, speaker gender, etc.).
- [2021-08-06] deep-audio-features: deep audio classification and feature extraction using CNNs and PyTorch
- Check out paura, a Python script for real-time recording and analysis of audio data
General
pyAudioAnalysis is a Python library covering a wide range of audio analysis tasks. Through pyAudioAnalysis you can:
- Extract audio features and representations (e.g. MFCCs, spectrogram, chromagram)
- Train, parameter tune and evaluate classifiers of audio segments
- Classify unknown sounds
- Detect audio events and exclude silence periods from long recordings (see the sketch after this list)
- Perform supervised segmentation (joint segmentation - classification)
- Perform unsupervised segmentation (e.g. speaker diarization) and extract audio thumbnails
- Train and use audio regression models (example application: emotion recognition)
- Apply dimensionality reduction to visualize audio data and content similarities
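As an illustration of the silence-removal capability listed above, here is a minimal sketch based on the `silence_removal` function in `audioSegmentation` (the file name is a placeholder and the window, smoothing and weight values are assumptions to tune for your data):

```python
from pyAudioAnalysis import audioBasicIO as aIO
from pyAudioAnalysis import audioSegmentation as aS

# read a long recording and detect its non-silent segments
fs, x = aIO.read_audio_file("long_recording.wav")  # placeholder file name
segments = aS.silence_removal(x, fs, 0.020, 0.020,  # 20 ms short-term window and step
                              1.0,                  # smoothing window (seconds)
                              0.3,                  # threshold weight factor
                              False)                # do not plot

# each entry is a [start_seconds, end_seconds] pair for a detected audio event
for start, end in segments:
    print(f"event from {start:.2f}s to {end:.2f}s")
```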
Installation
- Clone the source of this library:

```bash
git clone https://github.com/tyiannak/pyAudioAnalysis.git
```

- Install dependencies:

```bash
pip install -r ./requirements.txt
```

- Install using pip:

```bash
pip install -e .
```
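To verify the installation (a minimal, optional sanity check; run it from a directory outside the cloned repo so that the installed package is the one being imported):

```python
# confirms that the main pyAudioAnalysis modules can be imported
from pyAudioAnalysis import ShortTermFeatures, audioTrainTest, audioSegmentation
print("pyAudioAnalysis modules imported successfully")
```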
An audio classification example
More examples and detailed tutorials can be found at the wiki
pyAudioAnalysis provides easy-to-call wrappers to execute audio analysis tasks. E.g., the following code first trains an audio segment classifier, given a set of WAV files stored in folders (each folder representing a different class), and then uses the trained classifier to classify an unknown WAV file:
```python
from pyAudioAnalysis import audioTrainTest as aT

# train an SVM segment classifier from the WAV files in the two class folders
aT.extract_features_and_train(["classifierData/music","classifierData/speech"], 1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "svmSMtemp", False)

# classify an unknown WAV file with the trained model
aT.file_classification("data/doremi.wav", "svmSMtemp","svm")
```

Result: `(0.0, array([ 0.90156761, 0.09843239]), ['music', 'speech'])`, i.e. class index 0 ('music') with probability ~0.90.
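For reference, a small sketch of how the returned tuple can be unpacked (it follows the result format shown above; the file and model names are the ones from the example):

```python
class_id, probabilities, class_names = aT.file_classification("data/doremi.wav", "svmSMtemp", "svm")

# pick the predicted label and its probability
predicted = class_names[int(class_id)]
print(f"predicted: {predicted} (probability {probabilities[int(class_id)]:.2f})")
# with the result shown above, this would print: predicted: music (probability 0.90)
```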
In addition, command-line support is provided for all functionalities. E.g., the following command extracts the spectrogram of an audio signal stored in a WAV file:

```bash
python audioAnalysis.py fileSpectrogram -i data/doremi.wav
```
Further reading
Apart from this README file, to better understand how to use this library one should read the following:
- Audio Handling Basics: Process Audio Files In Command-Line or Python, if you want to learn how to handle audio files from the command line, and some basic programming on audio signal processing. Start with that if you don't know anything about audio.
- Intro to Audio Analysis: Recognizing Sounds Using Machine Learning. This goes a bit deeper than the previous article, providing a complete intro to the theory and practice of audio feature extraction, classification and segmentation (includes many Python examples).
- The library's wiki
- How to Use Machine Learning to Color Your Lighting Based on Music Mood. An interesting use-case of using this lib to train a real-time music mood estimator.
- A more general and theoretical description of the adopted methods (along with several experiments on particular use-cases) is presented in this publication. Please use the following citation when citing pyAudioAnalysis in your research work:
```bibtex
@article{giannakopoulos2015pyaudioanalysis,
  title={pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis},
  author={Giannakopoulos, Theodoros},
  journal={PloS one},
  volume={10},
  number={12},
  year={2015},
  publisher={Public Library of Science}
}
```
For Matlab-related audio analysis material check this book.
Author
Theodoros Giannakopoulos, Principal Researcher of Multimodal Machine Learning at the Multimedia Analysis Group of the Computational Intelligence Lab (MagCIL) of the Institute of Informatics and Telecommunications, of the National Center for Scientific Research "Demokritos"