salu133445 / musegan

An AI for Music Generation

Top Related Projects

Magenta: Music and Art Generation with Machine Intelligence

Deep learning driven jazz generation using Keras & Theano!

List of articles related to deep learning applied to music

Deezer source separation library including pretrained models.

Quick Overview

MuseGAN is a deep learning project for symbolic multi-track music generation. It uses Generative Adversarial Networks (GANs) to create multi-instrumental music in MIDI format, focusing on creating coherent and harmonious compositions across multiple tracks simultaneously.

Pros

  • Generates multi-track music with coherent structure and harmony
  • Produces MIDI output, which is easily editable and compatible with various music software
  • Offers control over the generation process through conditional inputs
  • Implements several model variants for different music generation tasks

Cons

  • Requires significant computational resources for training
  • Limited to generating music in specific genres (mainly rock/pop)
  • May produce repetitive or unnatural-sounding patterns in some cases
  • Requires musical knowledge to fine-tune and interpret results effectively

Code Examples

  1. Loading a pre-trained model:

    from musegan.model import MuseGAN

    model = MuseGAN()
    model.load('path/to/pretrained/model')

  2. Generating a new multi-track composition (reuses model from step 1):

    import numpy as np

    num_bars = 4
    # One latent vector per bar, drawn from a standard normal distribution
    z = np.random.normal(0, 1, (1, num_bars, model.z_dim))
    generated_music = model.generate(z)

  3. Saving the generated music as a MIDI file:

    from musegan.utils.midi_io import write_midi

    # generated_music[0] is the first (and only) sample in the batch
    write_midi(generated_music[0], 'output.mid', tempo=120)

Getting Started

  1. Clone the repository:

    git clone https://github.com/salu133445/musegan.git
    cd musegan
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Download pre-trained models:

    ./scripts/download_models.sh
    
  4. Generate music:

    from musegan.model import MuseGAN
    from musegan.utils.midi_io import write_midi
    import numpy as np
    
    model = MuseGAN()
    model.load('pretrained_models/default')
    z = np.random.normal(0, 1, (1, 4, model.z_dim))
    generated_music = model.generate(z)
    write_midi(generated_music[0], 'output.mid', tempo=120)
    

This will generate a 4-bar multi-track composition and save it as 'output.mid'.

Competitor Comparisons

Magenta: Music and Art Generation with Machine Intelligence

Pros of Magenta

  • Broader scope, covering multiple music generation tasks and models
  • Larger community and more active development
  • Better documentation and tutorials for getting started

Cons of Magenta

  • More complex codebase due to its broader scope
  • Steeper learning curve for beginners
  • May require more computational resources for some models

Code Comparison

MuseGAN (main model architecture):

def generator(input_tensor):
    net = tf.layers.dense(input_tensor, 1024, activation=tf.nn.leaky_relu)
    net = tf.layers.dense(net, 4096, activation=tf.nn.leaky_relu)
    net = tf.reshape(net, [-1, 16, 16, 16])
    net = tf.layers.conv2d_transpose(net, 64, 5, strides=2, padding='same')
    net = tf.layers.conv2d_transpose(net, 1, 5, strides=2, padding='same')
    return net
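
The tensor shapes in the snippet above can be traced by hand. The following is a small sketch in plain Python, not code from either project. It relies on the fact that a transposed convolution with 'same' padding multiplies height and width by the stride; the helper name conv2d_transpose_same is hypothetical.

```python
def conv2d_transpose_same(shape, filters, stride):
    """Return (height, width, channels) after a transposed convolution
    with 'same' padding: spatial size is multiplied by the stride."""
    height, width, _ = shape
    return (height * stride, width * stride, filters)


# The dense layer emits 4096 values, which is exactly 16 * 16 * 16,
# so the reshape to [-1, 16, 16, 16] is valid.
reshaped = (16, 16, 16)
dense_units = reshaped[0] * reshaped[1] * reshaped[2]

# First transposed convolution: 64 filters, stride 2
after_first = conv2d_transpose_same(reshaped, filters=64, stride=2)

# Second transposed convolution: 1 filter, stride 2
after_second = conv2d_transpose_same(after_first, filters=1, stride=2)

print(dense_units, after_first, after_second)
```

Under these assumptions the generator in the snippet ends with a single-channel 64 x 64 output per sample.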

Magenta (MelodyRNN model):

def build_graph(mode, config, sequence_example_file_paths=None):
    model = melody_rnn_model.MelodyRnnModel(config)
    sequence_features = {
        'inputs': tf.FixedLenSequenceFeature([],
                                             dtype=tf.int64,
                                             default_value=0),
        'inputs_lengths': tf.FixedLenSequenceFeature(
            [], dtype=tf.int64, default_value=0)
    }
    return model.build_graph(mode, sequence_features, sequence_example_file_paths)

Deep learning driven jazz generation using Keras & Theano!

Pros of DeepJazz

  • Focused specifically on jazz music generation
  • Simpler architecture, potentially easier to understand and modify
  • Includes a web interface for demo purposes

Cons of DeepJazz

  • Less versatile, limited to jazz genre
  • Older project with fewer recent updates
  • Smaller scale, potentially less sophisticated output

Code Comparison

DeepJazz:

def generate(self):
    xIni = np.random.randint(0, self.n_vocab, size=(1, self.maxlen))
    for i in range(MAX_EPOCHS):
        preds = self.model.predict(xIni, verbose=0)[0]
        next_index = self.sample(preds, temperature=1.0)
        next_char = self.indices_char[next_index]

MuseGAN:

def generate(self, n_bars, condition=None, temperature=1.0):
    generated = np.zeros((self.batch_size, n_bars, self.beat_resolution, self.n_pitches))
    for i in range(n_bars):
        z = np.random.normal(0, 1, (self.batch_size, self.z_dim))
        cond = condition[:, i] if condition is not None else None
        bar = self.generator.predict([z, cond], verbose=0)
        generated[:, i] = bar

Both projects use neural networks to generate symbolic music, but the approaches differ. DeepJazz predicts and samples one element at a time to produce jazz, while MuseGAN uses a GAN to generate multi-track pop phrases bar by bar.

List of articles related to deep learning applied to music

Pros of awesome-deep-learning-music

  • Comprehensive collection of resources on deep learning for music
  • Regularly updated with new papers, projects, and tools
  • Covers a wide range of topics including generation, analysis, and classification

Cons of awesome-deep-learning-music

  • No actual implementation or code provided
  • May be overwhelming for beginners due to the large amount of information

Code Comparison

musegan provides actual implementation:

from musegan.model import MuseGAN
model = MuseGAN(config)
model.train(train_data, valid_data)

awesome-deep-learning-music is a curated list, so it doesn't contain code:

## Music Generation
- [MusicVAE](https://github.com/magenta/magenta/tree/master/magenta/models/music_vae)
- [MuseGAN](https://github.com/salu133445/musegan)

Summary

musegan is a specific implementation of a music generation model, while awesome-deep-learning-music is a curated list of resources. musegan provides hands-on experience with a particular approach, while awesome-deep-learning-music offers a broader overview of the field. The choice between them depends on whether you're looking for a specific implementation or a comprehensive resource guide.

Deezer source separation library including pretrained models.

Pros of Spleeter

  • Focused on audio source separation, particularly useful for isolating vocals and instruments
  • Well-documented with clear usage instructions and pre-trained models
  • Actively maintained with regular updates and community support

Cons of Spleeter

  • Limited to source separation tasks, not designed for music generation
  • May require substantial computational resources when processing large audio files
  • May introduce artifacts in separated audio, especially with complex mixes

Code Comparison

Spleeter:

from spleeter.separator import Separator

separator = Separator('spleeter:2stems')
separator.separate_to_file('audio_example.mp3', 'output/')

MuseGAN:

from musegan.core import MuseGAN
from musegan.components import NowbarHybrid

musegan = MuseGAN(NowbarHybrid())
musegan.load('pretrained_model.pkl')
samples = musegan.generate(n_bars=8, temperature=1.2)

Key Differences

  • Spleeter focuses on audio separation, while MuseGAN is designed for music generation
  • Spleeter has a simpler API for quick audio processing tasks
  • MuseGAN offers more control over the music generation process with various parameters

Both projects serve different purposes in the music technology domain, with Spleeter being more practical for audio engineering tasks and MuseGAN catering to creative music composition applications.

README

MuseGAN

MuseGAN is a project on music generation. In a nutshell, we aim to generate polyphonic music of multiple tracks (instruments). The proposed models are able to generate music either from scratch, or by accompanying a track given a priori by the user.

We train the model with training data collected from Lakh Pianoroll Dataset to generate pop song phrases consisting of bass, drums, guitar, piano and strings tracks.
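
To make the data layout concrete, here is a minimal sketch of a multi-track pianoroll as a boolean NumPy array. The five track names come from the paragraph above; the number of time steps and the 128-pitch axis are hypothetical values chosen for illustration and are not taken from the repository.

```python
import numpy as np

TRACKS = ["bass", "drums", "guitar", "piano", "strings"]  # from the README
N_TIMESTEPS = 96  # hypothetical number of time steps in the phrase
N_PITCHES = 128   # hypothetical: one column per MIDI pitch number

# pianoroll[track, step, pitch] is True when that track sounds
# that pitch at that time step.
pianoroll = np.zeros((len(TRACKS), N_TIMESTEPS, N_PITCHES), dtype=bool)

# The bass holds MIDI pitch 36 for the first 24 steps.
pianoroll[TRACKS.index("bass"), 0:24, 36] = True

# The piano holds a three-note chord over the same span.
for pitch in (60, 64, 67):
    pianoroll[TRACKS.index("piano"), 0:24, pitch] = True

# Count active cells per track; most of the array stays empty.
active_per_track = pianoroll.sum(axis=(1, 2))
print(dict(zip(TRACKS, active_per_track.tolist())))
```

Each track gets its own time-by-pitch matrix, so generating "multiple tracks" means producing all five matrices so that they fit together.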

Sample results are available here.

Important Notes

  • The latest implementation is based on the network architectures presented in BinaryMuseGAN, where the temporal structure is handled by 3D convolutional layers. The advantage of this design is its smaller network size, while the disadvantage is its reduced controllability, e.g., capability of feeding different latent variables for different measures or tracks.
  • The original code we used for running the experiments in the paper can be found in the v1 folder.
  • Looking for a PyTorch version? Check out this repository.

Prerequisites

Below we assume the working directory is the repository root.

Install dependencies

  • Using pipenv (recommended)

    Make sure pipenv is installed. (If not, simply run pip install pipenv.)

    # Install the dependencies
    pipenv install
    # Activate the virtual environment
    pipenv shell
    
  • Using pip

    # Install the dependencies
    pip install -r requirements.txt
    

Prepare training data

The training data is collected from Lakh Pianoroll Dataset (LPD), a new multitrack pianoroll dataset.

# Download the training data
./scripts/download_data.sh
# Store the training data to shared memory
./scripts/process_data.sh

You can also download the training data manually (train_x_lpd_5_phr.npz).

As pianoroll matrices are generally sparse, we store only the indices of nonzero elements and the array shape in an npz file to save space, and later restore the original array. To save training data in this format, simply run np.savez_compressed("data.npz", shape=data.shape, nonzero=data.nonzero())
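
A round trip through this format might look like the sketch below. The save call is the one quoted above. The restore step and the helper names save_sparse and load_sparse are assumptions for illustration, as are the array dimensions. Because only the positions of nonzero entries are stored, the sketch assumes a binary (boolean) pianoroll.

```python
import os
import tempfile

import numpy as np


def save_sparse(path, data):
    """Store only the array shape and the indices of nonzero elements."""
    np.savez_compressed(path, shape=data.shape, nonzero=data.nonzero())


def load_sparse(path):
    """Rebuild the dense boolean array from the stored shape and indices."""
    with np.load(path) as f:
        shape = tuple(int(n) for n in f["shape"])
        data = np.zeros(shape, dtype=bool)
        # f["nonzero"] has one row of indices per array dimension.
        data[tuple(f["nonzero"])] = True
    return data


# A sparse binary array standing in for real training data.
rng = np.random.default_rng(0)
original = rng.random((4, 48, 84)) > 0.95

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.npz")
    save_sparse(path, original)
    restored = load_sparse(path)

print(np.array_equal(original, restored))
```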

Scripts

We provide several shell scripts to make managing experiments easier. (See here for detailed documentation.)

Below we assume the working directory is the repository root.

Train a new model

  1. Run the following command to set up a new experiment with default settings.

    # Set up a new experiment
    ./scripts/setup_exp.sh "./exp/my_experiment/" "Some notes on my experiment"
    
  2. Modify the configuration and model parameter files for experimental settings.

  3. You can either train the model:

    # Train the model
    ./scripts/run_train.sh "./exp/my_experiment/" "0"
    

    or run the experiment (training + inference + interpolation):

    # Run the experiment
    ./scripts/run_exp.sh "./exp/my_experiment/" "0"
    

Collect training data

Run the following command to collect training data from MIDI files.

# Collect training data
./scripts/collect_data.sh "./midi_dir/" "data/train.npy"

Use pretrained models

  1. Download pretrained models

    # Download the pretrained models
    ./scripts/download_models.sh
    

    You can also download the pretrained models manually (pretrained_models.tar.gz).

  2. You can either perform inference from a trained model:

    # Run inference from a pretrained model
    ./scripts/run_inference.sh "./exp/default/" "0"
    

    or perform interpolation from a trained model:

    # Run interpolation from a pretrained model
    ./scripts/run_interpolation.sh "./exp/default/" "0"
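
As background on what interpolation means here: a generator maps latent vectors to samples, so moving gradually between two latent vectors yields a sequence of samples that morph from one into the other. The sketch below shows plain linear interpolation in NumPy. It is not the repository's script; the function name lerp and the latent size are hypothetical, and the actual script may use a different scheme.

```python
import numpy as np


def lerp(z_start, z_end, n_steps):
    """Return n_steps latent vectors evenly spaced from z_start to z_end."""
    weights = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - weights) * z_start + weights * z_end


rng = np.random.default_rng(0)
z_dim = 128  # hypothetical latent size
z_a = rng.standard_normal(z_dim)
z_b = rng.standard_normal(z_dim)

# Five latent vectors: z_a, three intermediate points, z_b.
latent_path = lerp(z_a, z_b, n_steps=5)
print(latent_path.shape)
```

Feeding each row of latent_path to a trained generator would give one sample per interpolation step.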
    

Outputs

By default, samples will be generated during training. You can disable this behavior by setting save_samples_steps to zero in the configuration file (config.yaml). The generated samples will be stored in the following three formats by default.

  • .npy: raw numpy arrays
  • .png: image files
  • .npz: multitrack pianoroll files that can be loaded by the Pypianoroll package

You can disable saving in a specific format by setting save_array_samples, save_image_samples and save_pianoroll_samples to False in the configuration file.

The generated pianorolls are stored in .npz format to save space and processing time. You can use the following code to write them into MIDI files.

from pypianoroll import Multitrack

m = Multitrack('./test.npz')
m.write('./test.mid')

Sample Results

Some sample results can be found in the ./exp/ directory. More samples can be downloaded from the following links.

Citing

Please cite the following paper if you use the code provided in this repository.

Hao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang and Yi-Hsuan Yang, "MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment," AAAI Conference on Artificial Intelligence (AAAI), 2018. (*equal contribution)
[homepage] [arXiv] [paper] [slides] [code]

Papers

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Hao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang and Yi-Hsuan Yang (*equal contribution)
AAAI Conference on Artificial Intelligence (AAAI), 2018.
[homepage] [arXiv] [paper] [slides] [code]

Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation
Hao-Wen Dong and Yi-Hsuan Yang
International Society for Music Information Retrieval Conference (ISMIR), 2018.
[homepage] [video] [paper] [slides] [slides (long)] [poster] [arXiv] [code]

MuseGAN: Demonstration of a Convolutional GAN Based Model for Generating Multi-track Piano-rolls
Hao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang and Yi-Hsuan Yang (*equal contribution)
ISMIR Late-Breaking Demos, 2017.
[paper] [poster]