
microsoft/muzic

Muzic: Music Understanding and Generation with Artificial Intelligence


Top Related Projects

  • Magenta: Music and Art Generation with Machine Intelligence
  • Audiocraft: a library for audio processing and generation with deep learning, featuring the state-of-the-art EnCodec audio compressor/tokenizer and MusicGen, a simple and controllable music generation LM with textual and melodic conditioning
  • NeMo: a scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
  • Jukebox: code for the paper "Jukebox: A Generative Model for Music"
  • MuseGAN: an AI for Music Generation

Quick Overview

Muzic is a deep learning-based music generation system developed by Microsoft Research. It is designed to generate high-quality music compositions by learning from a large corpus of musical data. The project aims to explore the potential of artificial intelligence in the field of music creation.
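As a rough intuition for what "learning from a corpus of musical data" means, the toy sketch below counts note-to-note transitions in a tiny melody and samples a new melody from those counts. This is purely illustrative — it is not Muzic's API, and the note values are arbitrary MIDI numbers chosen for the example.

```python
# Toy illustration (not Muzic's actual API): generation systems learn
# statistics from a corpus, then sample new sequences from what they learned.
import random
from collections import defaultdict

corpus = [60, 62, 64, 65, 67, 65, 64, 62, 60]  # MIDI note numbers (C major fragment)

# "Train": count bigram transitions between consecutive notes.
transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

def generate(start, length, seed=0):
    """Sample a melody by following learned transitions from `start`."""
    rng = random.Random(seed)
    melody = [start]
    for _ in range(length - 1):
        choices = transitions.get(melody[-1])
        if not choices:
            break  # no known continuation from this note
        melody.append(rng.choice(choices))
    return melody

print(generate(60, 8))
```

Real systems like Muzic replace the bigram table with deep sequence models trained on large corpora, but the train-then-sample loop has the same shape.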

Pros

  • Innovative Approach: Muzic utilizes advanced deep learning techniques to generate novel and creative musical compositions.
  • Diverse Music Generation: The system is capable of generating music in a variety of genres and styles, showcasing its versatility.
  • Potential for Artistic Collaboration: Muzic could be used as a tool to assist human composers and musicians in the creative process.
  • Open-Source: The project is open-source, allowing for community contributions and further development.

Cons

  • Limited Evaluation: The project's evaluation of the generated music's quality and creativity is not extensively documented.
  • Computational Complexity: The deep learning models used in Muzic may require significant computational resources, limiting its accessibility.
  • Lack of User-Friendly Interface: The project currently lacks a user-friendly interface, making it challenging for non-technical users to interact with.
  • Potential Ethical Concerns: The use of AI-generated music in commercial or artistic contexts may raise ethical questions about authorship and intellectual property.

Code Examples

N/A — Muzic is a collection of research codebases rather than a pip-installable library; see each project's folder for usage examples.

Getting Started

N/A — see the Requirements section of the README below for setup instructions.

Competitor Comparisons


Magenta: Music and Art Generation with Machine Intelligence

Pros of Magenta

  • More mature project with a larger community and longer development history
  • Broader scope, covering various aspects of music and art generation
  • Extensive documentation and tutorials for easier onboarding

Cons of Magenta

  • Less focused on cutting-edge AI music generation techniques
  • May have a steeper learning curve for beginners due to its broader scope
  • Potentially slower development cycle compared to more specialized projects

Code Comparison

Magenta (Python):

import magenta

melody = magenta.music.Melody(
    notes=[60, 62, 64, 65, 67, 69, 71, 72],
    start_step=0,
    steps_per_quarter=4
)

Muzic (Python):

from muzic import MusicGenerator

generator = MusicGenerator()
melody = generator.generate_melody(
    length=8,
    scale='C_major'
)

Both repositories offer tools for music generation, but Magenta provides a more comprehensive set of features for various musical tasks, while Muzic focuses specifically on AI-driven music generation. Magenta's code tends to be more low-level, giving users fine-grained control over musical elements. Muzic, on the other hand, aims for a higher-level interface, potentially making it more accessible for quick prototyping and experimentation with AI-generated music.

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Pros of Audiocraft

  • More comprehensive audio generation capabilities, including music, sound effects, and speech synthesis
  • Offers pre-trained models for immediate use, such as MusicGen and AudioGen
  • Actively maintained with recent updates and contributions

Cons of Audiocraft

  • Focused primarily on audio generation, lacking some music analysis features found in Muzic
  • Requires more computational resources due to its advanced models
  • Steeper learning curve for users new to audio AI technologies

Code Comparison

Audiocraft example (audio generation):

import torchaudio
from audiocraft.models import MusicGen

model = MusicGen.get_pretrained('medium')
wav = model.generate_unconditional(4, progress=True)
torchaudio.save('generated_music.wav', wav[0].cpu(), model.sample_rate)

Muzic example (music analysis):

from muzic import MusicAnalyzer

analyzer = MusicAnalyzer()
features = analyzer.extract_features('song.mp3')
print(features)

Summary

Audiocraft excels in audio generation tasks, offering powerful pre-trained models for various audio synthesis applications. Muzic, on the other hand, provides a broader range of music-specific analysis tools. While Audiocraft is more suitable for creating new audio content, Muzic is better suited for tasks involving music understanding and analysis.


A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Pros of NeMo

  • Broader scope: Covers speech recognition, natural language processing, and text-to-speech, not just music generation
  • More active development: More frequent updates and contributions
  • Better documentation and examples for getting started

Cons of NeMo

  • Steeper learning curve due to its broader scope
  • Requires more computational resources for training and inference
  • Less focused on music-specific tasks compared to Muzic

Code Comparison

Muzic example (symbolic music generation):

from muzic.models import MusicTransformer

model = MusicTransformer.from_pretrained('music-transformer-midi')
generated = model.generate(max_length=512, temperature=1.0)

NeMo example (speech recognition):

import nemo.collections.asr as nemo_asr

asr_model = nemo_asr.models.EncDecCTCModel.from_pretrained("QuartzNet15x5Base-En")
transcription = asr_model.transcribe(["audio_file.wav"])

Both repositories offer high-level APIs for their respective tasks, but NeMo's broader scope is evident in its more diverse set of models and functionalities. Muzic focuses specifically on music-related tasks, while NeMo covers a wider range of audio and speech processing applications.


Code for the paper "Jukebox: A Generative Model for Music"

Pros of Jukebox

  • More advanced and capable of generating complete songs with vocals
  • Produces higher quality audio output
  • Offers pre-trained models for immediate use

Cons of Jukebox

  • Requires significant computational resources
  • Less focused on music theory and composition
  • Limited customization options for users

Code Comparison

Muzic (Python):

from muzic import MusicGenerator

generator = MusicGenerator()
melody = generator.generate_melody(length=16, scale='C_major')

Jukebox (Python):

import jukebox
from jukebox.sample import sample_partial_window

model = jukebox.load_model('1b_lyrics')
sample = sample_partial_window(model, ...)

Key Differences

  • Muzic focuses on symbolic music generation and analysis
  • Jukebox specializes in raw audio generation, including vocals
  • Muzic offers more tools for music theory and composition
  • Jukebox provides end-to-end audio generation capabilities

Use Cases

Muzic:

  • Music education and theory applications
  • Algorithmic composition and MIDI generation
  • Music analysis and research

Jukebox:

  • AI-generated complete songs with vocals
  • Audio synthesis and sound design
  • Exploring advanced deep learning for music generation

An AI for Music Generation

Pros of MuseGAN

  • Focused specifically on multi-track music generation
  • Provides pre-trained models for immediate use
  • Includes a comprehensive evaluation metrics suite

Cons of MuseGAN

  • Less actively maintained (last update in 2019)
  • Limited to MIDI format generation
  • Narrower scope compared to Muzic's diverse music AI tasks

Code Comparison

MuseGAN (model definition):

class Generator(nn.Module):
    def __init__(self, n_tracks, n_bars, n_steps_per_bar, n_pitches):
        super().__init__()
        self.z_dim = 32
        self.hidden_dim = 512
        self.n_tracks = n_tracks
        self.n_bars = n_bars
        self.n_steps_per_bar = n_steps_per_bar
        self.n_pitches = n_pitches

Muzic (model usage):

from muzic.models import MusicTransformer

model = MusicTransformer()
generated_music = model.generate(
    prompt="C4 E4 G4",
    max_length=512,
    temperature=0.9
)

Both repositories focus on AI-driven music generation, but Muzic offers a broader range of tools and more recent updates. MuseGAN provides a specialized approach for multi-track generation, while Muzic encompasses various music AI tasks with a more extensive toolkit.


README




Muzic is a research project on AI music that empowers music understanding and generation with deep learning and artificial intelligence. Muzic is pronounced as [ˈmjuːzeik]. Besides the image version of its logo, Muzic also has a video version. Muzic was started by researchers from Microsoft Research Asia, with contributions from outside collaborators.


We summarize the scope of the Muzic project in the overview figure in the repository.

The current work in Muzic covers music understanding and generation projects including MusicBERT, PDAugment, CLaMP, DeepRapper, SongMASS, TeleMelody, ReLyMe, ROC, MeloForm, Museformer, GETMusic, MuseCoco, and MusicAgent (see the Reference section below).

For more speech-related research, see https://speechresearch.github.io/ and https://github.com/microsoft/NeuralSpeech.

We are hiring!

We are hiring both research FTEs and research interns on Speech/Audio/Music/Video and LLMs. Please get in touch with Xu Tan (tanxu2012@gmail.com) if you are interested.

What is New?

  • CLaMP has won the Best Student Paper Award at ISMIR 2023!
  • We release MusicAgent, an AI agent for versatile music processing using large language models.
  • We release MuseCoco, a music composition copilot to generate symbolic music from text.
  • We release GETMusic, a versatile music copilot with a universal representation and diffusion framework to generate any music tracks.
  • We release the first model for cross-modal symbolic MIR: CLaMP.
  • We release two new research works on music structure modeling: MeloForm and Museformer.
  • We give a tutorial on AI Music Composition at ACM Multimedia 2021.

Requirements

Muzic runs on Linux; we have tested it on Ubuntu 16.04.6 LTS with CUDA 10 and Python 3.6.12. The requirements for running Muzic are listed in requirements.txt. To install them, run:

pip install -r requirements.txt
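The repository only documents the pip install step. As an optional sketch (our addition, not from the repo), you can first isolate the dependencies in a virtual environment so they do not pollute the system Python:

```shell
# Create an isolated environment for Muzic's dependencies
# (the repo was tested with Python 3.6.12; any Python 3 with venv works for this step).
python3 -m venv muzic-env

# Use the environment's own pip; inside a clone of the repo you would then run:
#   ./muzic-env/bin/pip install -r requirements.txt
./muzic-env/bin/pip --version
```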

We release the code of several research works: MusicBERT, PDAugment, CLaMP, DeepRapper, SongMASS, TeleMelody, ReLyMe, Re-creation of Creations (ROC), MeloForm, Museformer, GETMusic, MuseCoco, and MusicAgent. See the README in each corresponding folder for detailed usage instructions.

Reference

If you find the Muzic project useful in your work, you can cite the papers as follows:

  • [1] MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training, Mingliang Zeng, Xu Tan, Rui Wang, Zeqian Ju, Tao Qin, Tie-Yan Liu, ACL 2021.
  • [2] PDAugment: Data Augmentation by Pitch and Duration Adjustments for Automatic Lyrics Transcription, Chen Zhang, Jiaxing Yu, Luchin Chang, Xu Tan, Jiawei Chen, Tao Qin, Kejun Zhang, ISMIR 2022.
  • [3] DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling, Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang, Tie-Yan Liu, ACL 2021.
  • [4] SongMASS: Automatic Song Writing with Pre-training and Alignment Constraint, Zhonghao Sheng, Kaitao Song, Xu Tan, Yi Ren, Wei Ye, Shikun Zhang, Tao Qin, AAAI 2021.
  • [5] TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method, Zeqian Ju, Peiling Lu, Xu Tan, Rui Wang, Chen Zhang, Songruoyao Wu, Kejun Zhang, Xiangyang Li, Tao Qin, Tie-Yan Liu, EMNLP 2022.
  • [6] ReLyMe: Improving Lyric-to-Melody Generation by Incorporating Lyric-Melody Relationships, Chen Zhang, LuChin Chang, Songruoyao Wu, Xu Tan, Tao Qin, Tie-Yan Liu, Kejun Zhang, ACM Multimedia 2022.
  • [7] Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation, Ang Lv, Xu Tan, Tao Qin, Tie-Yan Liu, Rui Yan, arXiv 2022.
  • [8] MeloForm: Generating Melody with Musical Form based on Expert Systems and Neural Networks, Peiling Lu, Xu Tan, Botao Yu, Tao Qin, Sheng Zhao, Tie-Yan Liu, ISMIR 2022.
  • [9] Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation, Botao Yu, Peiling Lu, Rui Wang, Wei Hu, Xu Tan, Wei Ye, Shikun Zhang, Tao Qin, Tie-Yan Liu, NeurIPS 2022.
  • [10] PopMAG: Pop Music Accompaniment Generation, Yi Ren, Jinzheng He, Xu Tan, Tao Qin, Zhou Zhao, Tie-Yan Liu, ACM Multimedia 2020.
  • [11] HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis, Jiawei Chen, Xu Tan, Jian Luan, Tao Qin, Tie-Yan Liu, arXiv 2020.
  • [12] CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval, Shangda Wu, Dingyao Yu, Xu Tan, Maosong Sun, ISMIR 2023, Best Student Paper Award.
  • [13] GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework, Ang Lv, Xu Tan, Peiling Lu, Wei Ye, Shikun Zhang, Jiang Bian, Rui Yan, arXiv 2023.
  • [14] MuseCoco: Generating Symbolic Music from Text, Peiling Lu, Xin Xu, Chenfei Kang, Botao Yu, Chengyi Xing, Xu Tan, Jiang Bian, arXiv 2023.
  • [15] MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models, Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian, EMNLP 2023 Demo.

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.