Top Related Projects
- librosa: Python library for audio and music analysis
- aubio: a library for audio and music analysis
- pyAudioAnalysis: Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
- Essentia: C++ library for audio and music analysis, description and synthesis, including Python bindings
Quick Overview
Pydub is a Python library for audio processing. It provides a simple and intuitive interface for manipulating audio files, including operations like cutting, concatenating, and applying effects. Pydub supports various audio formats and can handle both reading and writing of audio files.
Pros
- Easy-to-use API for audio manipulation
- Supports multiple audio formats (mp3, wav, ogg, etc.)
- Cross-platform compatibility
- No external dependencies for basic operations
Cons
- Limited advanced audio processing capabilities
- Requires external libraries for certain formats (e.g., ffmpeg for mp3)
- Performance may be slower compared to lower-level audio libraries
- Documentation could be more comprehensive
Code Examples
- Loading and exporting audio:
from pydub import AudioSegment
# Load an MP3 file
audio = AudioSegment.from_mp3("input.mp3")
# Export as WAV
audio.export("output.wav", format="wav")
- Slicing and concatenating audio:
from pydub import AudioSegment
# Load two audio files
audio1 = AudioSegment.from_wav("file1.wav")
audio2 = AudioSegment.from_wav("file2.wav")
# Slice the first 5 seconds of audio1
first_five = audio1[:5000]  # avoid shadowing the built-in name "slice"
# Concatenate the slice with audio2
combined = first_five + audio2
# Export the result
combined.export("combined.wav", format="wav")
- Applying effects:
from pydub import AudioSegment
from pydub.effects import normalize, speedup
# Load audio
audio = AudioSegment.from_wav("input.wav")
# Normalize the audio
normalized = normalize(audio)
# Speed up the audio by 1.5x
faster = speedup(normalized, playback_speed=1.5)
# Export the result
faster.export("processed.wav", format="wav")
Getting Started
To get started with Pydub, follow these steps:
-
Install Pydub using pip:
pip install pydub
-
If you need to work with mp3 files, install ffmpeg:
- On macOS:
brew install ffmpeg
- On Ubuntu:
sudo apt-get install ffmpeg
- On Windows: Download from https://ffmpeg.org/download.html
-
Import Pydub in your Python script:
from pydub import AudioSegment
-
Start manipulating audio:
audio = AudioSegment.from_mp3("example.mp3")
sliced_audio = audio[:10000]  # First 10 seconds
sliced_audio.export("output.wav", format="wav")
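If you don't have an audio file handy, a short test tone can be generated with Python's standard wave module and then opened with AudioSegment.from_wav. This is only a minimal sketch: the function name write_test_tone and the file name tone.wav are made up for illustration and are not part of Pydub.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, seconds=1.0, freq_hz=440.0, sample_rate=44100):
    """Write a mono, 16-bit sine tone to `path` and return the frame count."""
    n_frames = int(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_frames):
        # Half of full scale, so the tone is not uncomfortably loud
        value = 0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        frames += struct.pack("<h", int(value))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames


# Hypothetical output location, chosen only for this example
tone_path = os.path.join(tempfile.gettempdir(), "tone.wav")
frame_count = write_test_tone(tone_path)
print(frame_count, "frames written to", tone_path)
```

The resulting file is plain WAV, so it should load without ffmpeg or libav installed.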
Competitor Comparisons
Python library for audio and music analysis
Pros of librosa
- More comprehensive audio analysis features, including spectral analysis and music information retrieval
- Better suited for machine learning and signal processing tasks
- Extensive documentation and scientific backing
Cons of librosa
- Steeper learning curve due to its more complex functionality
- Slower performance for basic audio manipulation tasks
- Requires additional dependencies like NumPy and SciPy
Code Comparison
librosa:
import librosa
y, sr = librosa.load('audio.wav')
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
pydub:
from pydub import AudioSegment
audio = AudioSegment.from_wav("audio.wav")
louder_audio = audio + 6
reversed_audio = audio.reverse()
Summary
librosa is better suited for advanced audio analysis and machine learning tasks, while pydub excels in simple audio manipulation and editing. librosa offers more comprehensive features but has a steeper learning curve, whereas pydub is more user-friendly for basic operations but lacks advanced analytical capabilities.
a library for audio and music analysis
Pros of aubio
- More comprehensive audio analysis capabilities, including pitch detection, onset detection, and beat tracking
- Written in C, offering potentially better performance for low-level audio processing tasks
- Provides command-line tools for audio analysis in addition to the library
Cons of aubio
- Steeper learning curve due to its more complex API and broader feature set
- Less straightforward for simple audio manipulation tasks compared to pydub
- Requires compilation and may have more complex installation process
Code Comparison
aubio:
import aubio
# Create a pitch detection object
pitch_o = aubio.pitch("default", 2048, 512, 44100)
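# Process one hop of audio and get the pitch; audio_samples must be a
# float32 NumPy array of 512 samples (the hop size given above)
pitch = pitch_o(audio_samples)[0]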
pydub:
from pydub import AudioSegment
# Load and manipulate audio
audio = AudioSegment.from_wav("input.wav")
louder_audio = audio + 6 # Increase volume by 6dB
aubio focuses on advanced audio analysis, offering powerful tools for tasks like pitch detection and beat tracking. It's written in C for performance but may be more complex to use. pydub, on the other hand, provides a simpler interface for basic audio manipulation tasks, making it more accessible for beginners and straightforward audio editing.
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Pros of pyAudioAnalysis
- More comprehensive audio analysis features, including feature extraction, classification, and segmentation
- Includes machine learning capabilities for audio classification and regression tasks
- Offers visualization tools for audio signal analysis
Cons of pyAudioAnalysis
- Steeper learning curve due to more complex functionality
- Less focused on audio file manipulation and conversion
- Requires additional dependencies for certain features
Code Comparison
pyAudioAnalysis:
from pyAudioAnalysis import audioBasicIO
from pyAudioAnalysis import ShortTermFeatures
# Read the file, then extract short-term features (50 ms window, 25 ms step)
[Fs, x] = audioBasicIO.read_audio_file("example.wav")
F, f_names = ShortTermFeatures.feature_extraction(x, Fs, 0.050*Fs, 0.025*Fs)
pydub:
from pydub import AudioSegment
sound = AudioSegment.from_wav("example.wav")
louder_sound = sound + 3
louder_sound.export("louder_example.wav", format="wav")
pyAudioAnalysis focuses on advanced audio analysis and feature extraction, while pydub excels in simple audio manipulation tasks. pyAudioAnalysis offers more sophisticated analysis tools but requires more setup and knowledge. pydub provides an easier-to-use interface for basic audio operations but lacks advanced analysis capabilities.
C++ library for audio and music analysis, description and synthesis, including Python bindings
Pros of Essentia
- More comprehensive audio analysis capabilities, including advanced features like pitch detection, beat tracking, and music classification
- Supports a wider range of audio formats and codecs
- Offers both C++ and Python interfaces for better performance and flexibility
Cons of Essentia
- Steeper learning curve due to its more complex API and extensive feature set
- Requires more system dependencies and setup compared to Pydub
- Larger library size, which may impact project size and deployment
Code Comparison
Pydub example (loading and manipulating audio):
from pydub import AudioSegment
sound = AudioSegment.from_mp3("example.mp3")
louder_sound = sound + 6
first_5_seconds = sound[:5000]
Essentia example (loading audio and performing analysis):
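import essentia.standard as es
audio = es.MonoLoader(filename="example.mp3")()
# Analyze a single windowed frame; Spectrum expects an even-sized frame
frame = audio[:2048]
spectrum = es.Spectrum()(es.Windowing(type="hann")(frame))
# MFCC returns both the mel band energies and the coefficients
mfcc_bands, mfcc_coeffs = es.MFCC()(spectrum)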
Both libraries offer audio manipulation capabilities, but Essentia provides more advanced analysis features out of the box. Pydub focuses on simplicity and ease of use for basic audio operations, while Essentia offers a wider range of audio processing and analysis tools at the cost of increased complexity.
README
Pydub
Pydub lets you do stuff to audio in a way that isn't stupid.
Quickstart
Open a WAV file
from pydub import AudioSegment
song = AudioSegment.from_wav("never_gonna_give_you_up.wav")
...or an mp3
song = AudioSegment.from_mp3("never_gonna_give_you_up.mp3")
... or an ogg, or flv, or anything else ffmpeg supports
ogg_version = AudioSegment.from_ogg("never_gonna_give_you_up.ogg")
flv_version = AudioSegment.from_flv("never_gonna_give_you_up.flv")
mp4_version = AudioSegment.from_file("never_gonna_give_you_up.mp4", "mp4")
wma_version = AudioSegment.from_file("never_gonna_give_you_up.wma", "wma")
aac_version = AudioSegment.from_file("never_gonna_give_you_up.aac", "aac")
Slice audio:
# pydub does things in milliseconds
ten_seconds = 10 * 1000
first_10_seconds = song[:ten_seconds]
last_5_seconds = song[-5000:]
Make the beginning louder and the end quieter
# boost volume by 6dB
beginning = first_10_seconds + 6
# reduce volume by 3dB
end = last_5_seconds - 3
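The numbers added to or subtracted from a segment are decibels, not a linear multiplier. As a rough guide to what those values mean, here is the standard dB-to-amplitude conversion written in plain Python. The helper name db_to_amplitude_ratio is made up for this sketch, and the code does not call Pydub at all.

```python
def db_to_amplitude_ratio(db):
    """Convert a gain in decibels to a linear amplitude ratio (10 ** (dB / 20))."""
    return 10 ** (db / 20)


print(round(db_to_amplitude_ratio(6), 3))   # 1.995, roughly double the amplitude
print(round(db_to_amplitude_ratio(-3), 3))  # 0.708, roughly 71% of the amplitude
```

So +6 dB is close to doubling the sample amplitude, while -3 dB scales it to about 71%.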
Concatenate audio (add one file to the end of another)
without_the_middle = beginning + end
How long is it?
without_the_middle.duration_seconds == 15.0
AudioSegments are immutable
# song is not modified
backwards = song.reverse()
Crossfade (again, beginning and end are not modified)
# 1.5 second crossfade
with_style = beginning.append(end, crossfade=1500)
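A crossfade overlaps the end of the first segment with the start of the second, so the result is shorter than a plain concatenation. The arithmetic can be sketched like this, assuming the overlapped region is simply counted once. The helper appended_length_ms is hypothetical and only does the math; it does not touch any audio.

```python
def appended_length_ms(first_ms, second_ms, crossfade_ms=0):
    """Expected length of two segments joined with an overlapping crossfade."""
    if crossfade_ms > min(first_ms, second_ms):
        # A crossfade cannot be longer than either segment it blends
        raise ValueError("crossfade is longer than one of the segments")
    return first_ms + second_ms - crossfade_ms


print(appended_length_ms(10_000, 5_000))        # 15000 (no crossfade)
print(appended_length_ms(10_000, 5_000, 1500))  # 13500 (1.5 second crossfade)
```

Under that assumption, the 10 second beginning and 5 second end joined with a 1.5 second crossfade come out at 13.5 seconds rather than 15.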
Repeat
# repeat the clip twice
do_it_over = with_style * 2
Fade (note that you can chain operations because everything returns an AudioSegment)
# 2 sec fade in, 3 sec fade out
awesome = do_it_over.fade_in(2000).fade_out(3000)
Save the results (again whatever ffmpeg supports)
awesome.export("mashup.mp3", format="mp3")
Save the results with tags (metadata)
awesome.export("mashup.mp3", format="mp3", tags={'artist': 'Various artists', 'album': 'Best of 2011', 'comments': 'This album is awesome!'})
You can pass an optional bitrate argument to export using any syntax ffmpeg supports.
awesome.export("mashup.mp3", format="mp3", bitrate="192k")
Any further arguments supported by ffmpeg can be passed as a list in a 'parameters' argument, with switch first, argument second. Note that no validation takes place on these parameters, and you may be limited by what your particular build of ffmpeg/libav supports.
# Use preset mp3 quality 0 (equivalent to lame V0)
awesome.export("mashup.mp3", format="mp3", parameters=["-q:a", "0"])
# Mix down to two channels and set hard output volume
awesome.export("mashup.mp3", format="mp3", parameters=["-ac", "2", "-vol", "150"])
Debugging
Most issues people run into are related to converting between formats using ffmpeg/libav. Pydub provides a logger that outputs the subprocess calls to help you track down issues:
>>> import logging
>>> l = logging.getLogger("pydub.converter")
>>> l.setLevel(logging.DEBUG)
>>> l.addHandler(logging.StreamHandler())
>>> AudioSegment.from_file("./test/data/test1.mp3")
subprocess.call(['ffmpeg', '-y', '-i', '/var/folders/71/42k8g72x4pq09tfp920d033r0000gn/T/tmpeZTgMy', '-vn', '-f', 'wav', '/var/folders/71/42k8g72x4pq09tfp920d033r0000gn/T/tmpK5aLcZ'])
<pydub.audio_segment.AudioSegment object at 0x101b43e10>
Don't worry about the temporary files used in the conversion. They're cleaned up automatically.
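In a script rather than an interactive session, the same logger setup can be wrapped in a small function. This is a sketch using only the standard logging module; the function name enable_pydub_debug_logging is made up here, and the duplicate-handler check is an extra convenience, not something Pydub requires.

```python
import logging


def enable_pydub_debug_logging():
    """Send pydub's converter log messages to stderr at DEBUG level."""
    logger = logging.getLogger("pydub.converter")
    logger.setLevel(logging.DEBUG)
    # Only attach a handler once, so repeated calls don't duplicate output
    if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
        logger.addHandler(logging.StreamHandler())
    return logger


converter_logger = enable_pydub_debug_logging()
```

Call it once near the top of your script, before loading or exporting any files.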
Bugs & Questions
You can file bugs in our GitHub issue tracker, and ask any technical questions on Stack Overflow using the pydub tag. We keep an eye on both.
Installation
Installing pydub is easy, but don't forget to install ffmpeg/libav (see "Getting ffmpeg set up" later in this doc).
pip install pydub
Or install the latest dev version from github (or replace @master with a release version like @v0.12.0)…
pip install git+https://github.com/jiaaro/pydub.git@master
-OR-
git clone https://github.com/jiaaro/pydub.git
-OR-
Copy the pydub directory into your python path. Zip here
Dependencies
You can open and save WAV files with pure python. For opening and saving non-wav files (like mp3) you'll need ffmpeg or libav.
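To see whether one of those converters is reachable before trying a non-wav file, you can look for the binary on your PATH. This is a minimal sketch using the standard library's shutil.which; the function name find_converter is made up for illustration, and avconv is the name of libav's converter binary.

```python
import shutil


def find_converter(candidates=("ffmpeg", "avconv")):
    """Return the path of the first converter binary found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


converter = find_converter()
if converter is None:
    print("No ffmpeg/libav binary found on PATH; only WAV files are likely to work.")
else:
    print("Found converter:", converter)
```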
Playback
You can play audio if you have one of these installed (simpleaudio strongly recommended, even if you are installing ffmpeg/libav):
- simpleaudio
- pyaudio
- ffplay (usually bundled with ffmpeg, see the next section)
- avplay (usually bundled with libav, see the next section)
from pydub import AudioSegment
from pydub.playback import play
sound = AudioSegment.from_file("mysound.wav", format="wav")
play(sound)
Getting ffmpeg set up
You may use libav or ffmpeg.
Mac (using homebrew):
# libav
brew install libav
#### OR #####
# ffmpeg
brew install ffmpeg
Linux (using apt-get):
# libav
apt-get install libav-tools libavcodec-extra
#### OR #####
# ffmpeg
apt-get install ffmpeg libavcodec-extra
Windows:
- Download and extract libav from Windows binaries provided here.
- Add the libav /bin folder to your PATH envvar
- pip install pydub
Important Notes
AudioSegment objects are immutable
Ogg exporting and default codecs
The Ogg specification (http://tools.ietf.org/html/rfc5334) does not specify the codec to use; this choice is left up to the user. Vorbis and Theora are just two of a number of potential codecs (see page 3 of the RFC) that can be used for the encapsulated data.
When no codec is specified, exporting to ogg will default to using vorbis as a convenience. That is:
from pydub import AudioSegment
song = AudioSegment.from_mp3("test/data/test1.mp3")
song.export("out.ogg", format="ogg") # Is the same as:
song.export("out.ogg", format="ogg", codec="libvorbis")
Example Use
Suppose you have a directory filled with mp4 and flv videos and you want to convert all of them to mp3 so you can listen to them on your mp3 player.
import os
import glob
from pydub import AudioSegment
video_dir = '/home/johndoe/downloaded_videos/' # Path where the videos are located
extension_list = ('*.mp4', '*.flv')
os.chdir(video_dir)
for extension in extension_list:
    for video in glob.glob(extension):
        mp3_filename = os.path.splitext(os.path.basename(video))[0] + '.mp3'
        AudioSegment.from_file(video).export(mp3_filename, format='mp3')
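The output file name in the loop above is derived by stripping the video's extension and appending .mp3. If you prefer pathlib, the same mapping could be sketched as follows. The helper name mp3_name_for and the example path are made up for illustration.

```python
from pathlib import Path


def mp3_name_for(video_path):
    """Return the bare file name of `video_path` with its extension swapped for .mp3."""
    return Path(video_path).with_suffix(".mp3").name


print(mp3_name_for("downloaded_videos/clip.flv"))  # clip.mp3
```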
How about another example?
from glob import glob
from pydub import AudioSegment
playlist_songs = [AudioSegment.from_mp3(mp3_file) for mp3_file in glob("*.mp3")]
first_song = playlist_songs.pop(0)
# let's just include the first 30 seconds of the first song (slicing
# is done by milliseconds)
beginning_of_song = first_song[:30*1000]
playlist = beginning_of_song
for song in playlist_songs:
    # We don't want an abrupt stop at the end, so let's do a 10 second crossfade
    playlist = playlist.append(song, crossfade=(10 * 1000))
# let's fade out the end of the last song
playlist = playlist.fade_out(30)
# hmm I wonder how long it is... ( len(audio_segment) returns milliseconds )
# integer division keeps the minute count whole for use in the file name
playlist_length = len(playlist) // (1000*60)
# let's save it!
with open("%s_minute_playlist.mp3" % playlist_length, 'wb') as out_f:
    playlist.export(out_f, format='mp3')
License (MIT License)
Copyright © 2011 James Robert, http://jiaaro.com
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.