pyAudioAnalysis

Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

6,099

1,220

6,099

205

Top Related Projects

librosa

7,777

Python library for audio and music analysis

aubio

3,478

a library for audio and music analysis

essentia

3,106

C++ library for audio and music analysis, description and synthesis, including Python bindings

madmom

1,452

Python audio and music signal processing library

magenta

19,599

Magenta: Music and Art Generation with Machine Intelligence

Quick Overview

pyAudioAnalysis is a Python library for audio feature extraction, classification, segmentation, and application development. It covers a wide range of audio analysis tasks, including audio event detection, supervised and unsupervised segmentation (for example, speaker diarization), and audio regression. The library is designed to be user-friendly and suitable for both researchers and developers working with audio data.

Pros

Comprehensive set of audio analysis tools and features
Easy-to-use API with well-documented functions
Supports various audio file formats and real-time audio processing
Includes pre-trained models for common audio classification tasks

Cons

Dependency on older versions of some libraries, which may cause compatibility issues
Limited support for deep learning-based audio analysis techniques
Some features may require additional setup or external dependencies
Documentation could be more extensive for advanced use cases

Code Examples

Feature extraction from an audio file:

from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

[Fs, x] = audioBasicIO.read_audio_file("sample.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
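
The last two arguments in the call above are the short-term window and step, given in samples (seconds multiplied by `Fs`). The following is a minimal sketch of how those values translate into a number of analysis frames. It assumes that a frame is produced only where a full window fits inside the signal; the helper name `short_term_frame_count` is hypothetical and is not part of pyAudioAnalysis.

```python
def short_term_frame_count(num_samples, sampling_rate,
                           window_sec=0.050, step_sec=0.025):
    """Return (window, step, n_frames), with window and step in samples.

    Assumption: frames are taken at offsets 0, step, 2*step, ... and only
    while a complete window still fits inside the signal.
    """
    window = int(round(window_sec * sampling_rate))
    step = int(round(step_sec * sampling_rate))
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be at least one sample")
    if num_samples < window:
        return window, step, 0
    n_frames = (num_samples - window) // step + 1
    return window, step, n_frames


# One second of audio sampled at 16 kHz, 50 ms window, 25 ms step.
print(short_term_frame_count(16000, 16000))
```

Under these assumptions, a 50 ms window with a 25 ms step means consecutive frames overlap by half a window.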

Audio classification:

from pyAudioAnalysis import audioTrainTest as aT

aT.file_classification("unknown_audio.wav", "model_file", "svm")

Audio segmentation:

from pyAudioAnalysis import audioSegmentation as aS

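# Returns per-window class labels, the class names, and (when ground truth is given) accuracy and a confusion matrix
labels, class_names, accuracy, cm = aS.mid_term_file_classification("long_audio.wav", "model_file", "svm", True)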
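
Segmentation by classification yields one class label per fixed-size window, and consecutive windows with the same label form a segment. The sketch below shows that merging step in plain Python. It assumes the labels are a sequence of class indices spaced `step_sec` seconds apart; the function name `merge_window_labels` is hypothetical and does not refer to the library's own helper.

```python
def merge_window_labels(labels, step_sec=1.0):
    """Merge runs of identical window labels into (start, end, label) tuples.

    Assumption: labels[i] describes the window starting at i * step_sec.
    """
    segments = []
    if not labels:
        return segments
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the list or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start * step_sec, i * step_sec, labels[start]))
            start = i
    return segments


print(merge_window_labels([0, 0, 1, 1, 1, 0]))
```

Each tuple gives a start time, an end time, and the class index for that stretch of audio, which can then be mapped to a class name.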

Getting Started

To get started with pyAudioAnalysis, follow these steps:

Install the library using pip:
```
pip install pyAudioAnalysis
```

Import the necessary modules in your Python script:

from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures
from pyAudioAnalysis import audioTrainTest as aT
from pyAudioAnalysis import audioSegmentation as aS

Use the library's functions to analyze your audio files or streams. For example, to extract features from an audio file:

[Fs, x] = audioBasicIO.read_audio_file("your_audio_file.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)

For more detailed instructions and examples, refer to the library's documentation and GitHub repository.

Competitor Comparisons

librosa

7,777

Python library for audio and music analysis

Pros of librosa

More comprehensive and feature-rich library for music and audio analysis
Better documentation and community support
Actively maintained with frequent updates

Cons of librosa

Steeper learning curve for beginners
May be overkill for simple audio processing tasks
Slower performance for some operations compared to pyAudioAnalysis

Code Comparison

pyAudioAnalysis:

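from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

# Read the signal, then extract short-term features (50 ms window, 25 ms step)
[Fs, x] = audioBasicIO.read_audio_file("sample.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)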

librosa:

import librosa

y, sr = librosa.load("sample.wav")
mfccs = librosa.feature.mfcc(y=y, sr=sr)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)

Both libraries offer audio file reading and feature extraction capabilities. pyAudioAnalysis provides a more straightforward approach for basic audio analysis, while librosa offers more advanced and customizable features for music and audio processing.

aubio

3,478

a library for audio and music analysis

Pros of aubio

Written in C, offering better performance for low-level audio processing tasks
Provides a wider range of audio analysis algorithms, including pitch detection, onset detection, and beat tracking
Offers bindings for multiple programming languages, including Python, making it versatile for different development environments

Cons of aubio

Steeper learning curve due to its lower-level implementation and more complex API
Less focus on machine learning-based audio analysis compared to pyAudioAnalysis
Requires compilation and may have more complex setup process, especially on certain platforms

Code Comparison

pyAudioAnalysis example:

from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

# Read the signal, then extract short-term features (50 ms window, 25 ms step)
[Fs, x] = audioBasicIO.read_audio_file("example.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)

aubio example:

import aubio

source = aubio.source("example.wav")
pitch_o = aubio.pitch("yin", samplerate=source.samplerate)

pitches = []
for frame in source:
    pitch = pitch_o(frame)[0]
    pitches.append(pitch)
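
A list of per-frame pitch estimates like the one collected above is often easier to inspect as note numbers. The following sketch converts frequencies to MIDI note numbers using the standard equal-temperament relation (A4 = 440 Hz = note 69). It assumes the estimates are in Hz and that non-positive values mark frames without a detected pitch; both function names are hypothetical.

```python
import math


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def voiced_midi_notes(pitches_hz):
    """Round pitch estimates to MIDI notes, skipping non-positive values.

    Assumption: a value of 0 or less means no pitch was detected in that frame.
    """
    return [round(hz_to_midi(f)) for f in pitches_hz if f > 0]


print(voiced_midi_notes([0.0, 440.0, 261.63, 880.0]))
```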

essentia

3,106

C++ library for audio and music analysis, description and synthesis, including Python bindings

Pros of essentia

More comprehensive library with a wider range of audio analysis algorithms
Better performance and efficiency, especially for large-scale audio processing
Supports multiple programming languages (C++, Python, JavaScript)

Cons of essentia

Steeper learning curve due to its complexity and extensive feature set
Requires more setup and configuration compared to pyAudioAnalysis
Less focused on high-level audio analysis tasks

Code Comparison

pyAudioAnalysis:

from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures

# Read the signal, then extract short-term features (50 ms window, 25 ms step)
[Fs, x] = audioBasicIO.read_audio_file("example.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)

essentia:

import essentia.standard as es

audio = es.MonoLoader(filename='example.wav')()
w = es.Windowing(type='hann')
spectrum = es.Spectrum()
mfcc = es.MFCC()

for frame in es.FrameGenerator(audio, frameSize=1024, hopSize=512):
    mfcc_bands, mfcc_coeffs = mfcc(spectrum(w(frame)))

Both libraries offer audio analysis capabilities, but essentia provides more flexibility and advanced features at the cost of increased complexity. pyAudioAnalysis is more straightforward for basic audio analysis tasks, while essentia is better suited for more complex and large-scale audio processing projects.

madmom

1,452

Python audio and music signal processing library

Pros of madmom

More focused on music information retrieval (MIR) tasks
Offers deep learning-based algorithms for advanced audio analysis
Provides specialized tools for beat tracking and tempo estimation

Cons of madmom

Steeper learning curve due to its more specialized nature
Less comprehensive documentation compared to pyAudioAnalysis
Fewer general-purpose audio analysis features

Code Comparison

madmom example:

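from madmom.features.beats import RNNBeatProcessor, DBNBeatTrackingProcessor

audio_file = "example.wav"
# The RNN produces a frame-wise beat activation function, not beat times
activations = RNNBeatProcessor()(audio_file)
# Decode the activations into beat positions in seconds
beats = DBNBeatTrackingProcessor(fps=100)(activations)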

pyAudioAnalysis example:

from pyAudioAnalysis import audioBasicIO as aIO
from pyAudioAnalysis import ShortTermFeatures as aF

audio_file = "example.wav"
[Fs, x] = aIO.read_audio_file(audio_file)
F, f_names = aF.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)

madmom is more specialized for music-related tasks, offering advanced algorithms for beat detection and tempo estimation. It's particularly useful for MIR applications but may have a steeper learning curve.
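
To make the link between beat detection and tempo estimation concrete, here is a minimal sketch that derives a tempo in beats per minute from a list of beat times. It assumes the input is a sorted sequence of beat positions in seconds, such as a beat tracker might produce, and uses the median inter-beat interval as a simple, outlier-tolerant estimate. This is an illustration only and not madmom's tempo estimation method; `estimate_bpm` is a hypothetical name.

```python
from statistics import median


def estimate_bpm(beat_times):
    """Estimate tempo (BPM) from sorted beat times given in seconds."""
    if len(beat_times) < 2:
        raise ValueError("need at least two beats to estimate a tempo")
    # Time between consecutive beats.
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    typical = median(intervals)
    if typical <= 0:
        raise ValueError("beat times must be strictly increasing")
    return 60.0 / typical


print(estimate_bpm([0.0, 0.5, 1.0, 1.5, 2.0]))
```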

pyAudioAnalysis provides a broader range of audio analysis features and is generally easier to use for beginners. It's more suitable for general-purpose audio analysis tasks but may lack some of the advanced music-specific functionalities offered by madmom.

Choose madmom for music-focused projects requiring advanced MIR capabilities, and pyAudioAnalysis for general audio analysis tasks or when ease of use is a priority.

magenta

19,599

Magenta: Music and Art Generation with Machine Intelligence

Pros of Magenta

Broader scope, covering music and art generation with machine learning
More active development and larger community support
Extensive documentation and tutorials for various use cases

Cons of Magenta

Steeper learning curve due to its complexity and breadth
Requires more computational resources for many tasks
Less focused on audio analysis compared to pyAudioAnalysis

Code Comparison

pyAudioAnalysis example:

from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures
[Fs, x] = audioBasicIO.read_audio_file("example.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)

Magenta example:

import note_seq
from note_seq.protobuf import music_pb2

sequence = music_pb2.NoteSequence()
note = sequence.notes.add()
note.start_time = 0
note.end_time = 0.5
note.pitch = 60
note.velocity = 80

Both repositories offer powerful tools for audio and music processing, but they serve different purposes. pyAudioAnalysis focuses on audio feature extraction and classification, while Magenta provides a broader set of tools for music generation and creative applications using machine learning.

README

A Python library for audio feature extraction, classification, segmentation and applications

This page gives general information. The project wiki has the complete documentation, and a separate introductory article covers audio data handling more generally.

News

[2025-03-29] Check out the Behavioral Signals Python SDK, which demonstrates how to use the Behavioral Signals API to send speech data and retrieve predictions related to emotions and behaviors from Python code. It works in both batch and streaming mode. The API now also supports a speaker-agnostic deepfake detector.
[2021-08-06] deep-audio-features: deep audio classification and feature extraction using CNNs and PyTorch
Check out paura, a Python script for real-time recording and analysis of audio data

General

pyAudioAnalysis is a Python library covering a wide range of audio analysis tasks. Through pyAudioAnalysis you can:

Extract audio features and representations (e.g. MFCCs, spectrogram, chromagram)
Train, parameter tune and evaluate classifiers of audio segments
Classify unknown sounds
Detect audio events and exclude silence periods from long recordings
Perform supervised segmentation (joint segmentation - classification)
Perform unsupervised segmentation (e.g. speaker diarization) and extract audio thumbnails
Train and use audio regression models (example application: emotion recognition)
Apply dimensionality reduction to visualize audio data and content similarities

Installation

Clone the source of this library: git clone https://github.com/tyiannak/pyAudioAnalysis.git
Install dependencies: pip install -r ./requirements.txt
Install using pip: pip install -e .
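
After installing, it can be useful to confirm that the package is importable from the active environment before running any analysis. The sketch below uses only the standard library to report which top-level modules cannot be found. The helper name `missing_modules` is hypothetical, and the extra module name in the usage note is an example rather than a verified dependency list.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is deliberately bogus.
print(missing_modules(["json", "no_such_module_for_this_sketch"]))
```

Calling `missing_modules(["pyAudioAnalysis", "numpy"])` in the target environment should return an empty list once installation has succeeded.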

An audio classification example

More examples and detailed tutorials can be found at the wiki

pyAudioAnalysis provides easy-to-call wrappers for audio analysis tasks. For example, the following code first trains an audio segment classifier from a set of WAV files stored in folders (each folder representing a different class), and then uses the trained classifier to classify an unknown WAV file:

from pyAudioAnalysis import audioTrainTest as aT
aT.extract_features_and_train(["classifierData/music","classifierData/speech"], 1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "svmSMtemp", False)
aT.file_classification("data/doremi.wav", "svmSMtemp","svm")

Result: (0.0, array([ 0.90156761, 0.09843239]), ['music', 'speech'])
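
The result shown above is a tuple of the winning class index, the per-class probabilities, and the class names. As a minimal sketch, the helper below turns such a tuple into a readable label and confidence. The name `summarize_classification` is hypothetical, and treating a negative index as a failed classification is an assumption rather than documented behavior.

```python
def summarize_classification(result):
    """Return (class_name, probability) for a (class_id, probs, names) tuple."""
    class_id, probabilities, class_names = result
    index = int(class_id)
    # Assumption: a negative or out-of-range index signals a failed run.
    if index < 0 or index >= len(class_names):
        raise ValueError("classification did not return a valid class index")
    return class_names[index], float(probabilities[index])


# Values copied from the example result above.
example = (0.0, [0.90156761, 0.09843239], ["music", "speech"])
print(summarize_classification(example))
```

For the example values this reports the `music` class with a probability of about 0.90.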

In addition, command-line support is provided for all functionalities. For example, the following command extracts the spectrogram of an audio signal stored in a WAV file: python audioAnalysis.py fileSpectrogram -i data/doremi.wav

Author

Theodoros Giannakopoulos, Principal Researcher of Multimodal Machine Learning at the Multimedia Analysis Group of the Computational Intelligence Lab (MagCIL) of the Institute of Informatics and Telecommunications, of the National Center for Scientific Research "Demokritos"
