Technical Approach
- Shruti, Pranav
- Dec 17, 2017
- 2 min read
The methods used to achieve music generation with the Restricted Boltzmann Machine algorithm.

1. Generative Models
A generative model is one that lets us learn a simulator of data. Generative models allow for (conditional) density estimation and are an approach to unsupervised learning.
Generative models are used in machine learning for either modeling data directly (i.e., modeling observations drawn from a probability density function), or as an intermediate step to forming a conditional probability density function. Generative models are typically probabilistic, specifying a joint probability distribution over observation and target (label) values. A conditional distribution can be formed from a generative model through Bayes' rule.
Generative models specify a probability distribution over a dataset of input vectors. For an unsupervised task, we form a model for P(x), where x is an input vector. For a supervised task, we form a model for P(x|y), where y is the label for x. Like discriminative models, most generative models can be used for classification tasks. To perform classification with a generative model, we leverage the fact that if we know P(X|Y) and P(Y), we can use Bayes' rule to estimate P(Y|X).
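For completeness, Bayes' rule here reads:

P(Y|X) = P(X|Y) P(Y) / P(X), where P(X) = Σ_y P(X|y) P(y),

so classifying with a generative model amounts to picking the label y that maximizes P(X|y) P(y).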

Since we are using a generative model, we can generate new samples directly by sampling from the modelled probability distribution.
While thinking about generating new music files, we considered trying Generative Adversarial Networks (GANs) as part of the music generation process, and we started working on a GAN with our dataset.
2. Sound from Recurrent Neural Networks
Audio generation seems like a natural application of recurrent neural networks, which are currently very popular (and effective) models for sequence-to-sequence learning. They have been used in both speech recognition and speech synthesis. A rough sketch of the step-by-step generation loop follows.
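Here is a minimal sketch, ours rather than anything from this project, of how a simple recurrent network emits a sequence one step at a time, feeding each sampled note back in as the next input. The weights are untrained random values, and the sizes (88 pitches, 64 hidden units) are hypothetical:

```python
import numpy as np

# Sketch only: untrained random weights, hypothetical sizes.
rng = np.random.default_rng(0)
n_notes, n_hidden = 88, 64

Wxh = rng.normal(0, 0.1, (n_hidden, n_notes))   # input  -> hidden
Whh = rng.normal(0, 0.1, (n_hidden, n_hidden))  # hidden -> hidden
Why = rng.normal(0, 0.1, (n_notes, n_hidden))   # hidden -> output

def sample_sequence(length, start_note=60):
    h = np.zeros(n_hidden)
    x = np.zeros(n_notes)
    x[start_note] = 1.0
    notes = [start_note]
    for _ in range(length - 1):
        h = np.tanh(Wxh @ x + Whh @ h)          # recurrent state update
        logits = Why @ h
        p = np.exp(logits - logits.max())
        p /= p.sum()                            # softmax over the next note
        nxt = int(rng.choice(n_notes, p=p))
        x = np.zeros(n_notes)
        x[nxt] = 1.0                            # feed the sample back in
        notes.append(nxt)
    return notes

print(sample_sequence(16))
```

The key point the sketch illustrates is that the hidden state h carries context forward, so each generated note can depend on everything emitted before it.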

3. Restricted Boltzmann Machine (RBM)
A restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs.
Architecture
The RBM is a neural network with 2 layers, the visible layer and the hidden layer. Each visible node is connected to each hidden node (and vice versa), but there are no visible-visible or hidden-hidden connections (the RBM is a complete bipartite graph). Since there are only 2 layers, we can fully describe a trained RBM with 3 parameters:
The weight matrix W:
W has size n_visible x n_hidden. W_ij is the weight of the connection between visible node i and hidden node j.
The bias vector b_v:
b_v is a vector with n_visible elements. Element i is the bias for the i-th visible node.
The bias vector b_h:
b_h is a vector with n_hidden elements. Element j is the bias for the j-th hidden node.
Here n_visible is the number of features in the input vectors, and n_hidden is the size of the hidden layer. Each visible node takes one chord; the chord's value is multiplied by the connection weights and, together with the hidden bias, produces the node's output at the hidden layer. Unlike most neural networks, RBMs are generative models that directly model the probability distribution of the data.
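To make the three parameters concrete, here is a minimal sketch, assuming binary chord vectors and hypothetical sizes (n_visible = 156, n_hidden = 50), of how W, b_v, and b_h are used to draw new visible vectors via Gibbs sampling. The weights below are untrained; real use would first fit them, for example with contrastive divergence:

```python
import numpy as np

# Sketch only: untrained parameters, hypothetical sizes.
rng = np.random.default_rng(0)
n_visible, n_hidden = 156, 50

W  = rng.normal(0, 0.01, (n_visible, n_hidden))  # weight matrix W
bv = np.zeros(n_visible)                         # visible bias b_v
bh = np.zeros(n_hidden)                          # hidden bias b_h

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bernoulli(p):
    return (rng.random(p.shape) < p).astype(float)

def gibbs_step(v):
    """One visible -> hidden -> visible sampling pass."""
    h = bernoulli(sigmoid(v @ W + bh))       # sample h_j ~ P(h_j = 1 | v)
    return bernoulli(sigmoid(h @ W.T + bv))  # sample v_i ~ P(v_i = 1 | h)

# Generate a sample: start from noise and run a few Gibbs steps.
v = bernoulli(np.full(n_visible, 0.5))
for _ in range(25):
    v = gibbs_step(v)
print(v[:10])
```

Because the graph is bipartite, all hidden nodes are conditionally independent given the visible layer (and vice versa), which is exactly what lets each Gibbs step sample a whole layer in one vectorized operation.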
