Thursday, November 26, 2020

INTRODUCTION

 


Downloading and purchasing music from online music collections has become part of daily life for a large number of people around the world. Users often formulate their preferences in terms of genre, such as hip hop, pop, or disco. However, most of the tracks now available are not automatically classified by genre. Given the huge size of existing collections, automatic genre classification is important for the organization, search, retrieval, and recommendation of music.

Nowadays, many music applications such as Spotify, Gaana, and Saavn provide this feature of sorting songs according to their genres.

Figure: Genre feature of Spotify

Music genre classification is very important in today's world due to the rapid growth in music tracks, both online and offline. In order to have better access to these tracks, we need to index them accordingly. Automatic music genre classification is important for obtaining music from a large collection. Most current music genre classification techniques use machine learning.

In this blog we are going to propose and develop a deep learning technique to classify music into its genre. We are going to use convolutional neural networks for feature extraction and train on an open-source dataset of music genres, GTZAN.

So stay tuned for the upcoming post with a detailed explanation of the dataset and classification techniques. Also like, share, and comment if you find this blog helpful.

The Mel Spectrogram - Input to the algorithm

There are two major challenges in classifying music genres:

  1. Musical genres are loosely defined, so people often argue over the genre of a song.
  2. It is a nontrivial task to extract differentiating features from audio data that could be fed into a model.

The first problem we have no control over. This is the nature of musical genres, and something that will be a limitation. The second problem has been heavily researched in the field of Music Information Retrieval (MIR), which is dedicated to the task of extracting useful information from audio signals.

In order to build a model that could classify a song by its genre, it is very important to find good features. An interesting feature that kept coming up was the mel spectrogram.

The Mel Spectrogram

The mel spectrogram can be thought of as a visual representation of an audio signal. Specifically, it represents how the spectrum of frequencies varies over time.

The Fourier transform is a mathematical formula that allows us to convert an audio signal into the frequency domain. It gives the amplitude at each frequency, and we call this the spectrum. Since frequency content typically varies over time, we perform the Fourier transform on overlapping windowed segments of the signal to get a visual of the spectrum of frequencies over time. This is called the spectrogram. Finally, since humans do not perceive frequency on a linear scale, we map the frequencies to the mel scale (a measure of pitch), so that equal distances in pitch sound equally distant to the human ear. What we get is the mel spectrogram.


Code for generating Mel Spectrogram:

We now have a way to visually represent a song. Let’s take a look at some mel spectrograms from songs of different genres.

What we have essentially done is turned the problem into an image classification task. This is great because there’s a model that was made specifically for this task: the convolutional neural network (CNN).

Distribution of Dataset

 

In this blog post I will be discussing the dataset...

GTZAN Genre Collection dataset was used to perform the classification. The dataset has been taken from the popular software framework MARSYAS. Marsyas (Music Analysis, Retrieval and Synthesis for Audio Signals) is an open source software framework for audio processing with specific emphasis on Music Information Retrieval applications. Marsyas has been used for a variety of projects in both academia and industry.

We compared several open music datasets with associated metadata and selected the GTZAN Genre Collection, which contains 1000 audio tracks, each 30 seconds long.

There are 10 genres represented, each containing 100 tracks. All the tracks are 22050 Hz mono 16-bit audio files in .au format. The 10 music genres are: blues, classical, country, disco, hip-hop, jazz, metal, pop, reggae, and rock.

The audio files are divided into 2-second-long audio chunks, and labels are maintained accordingly. We finally had 14000 audio samples equally distributed over the 10 genres. We split these samples randomly in an 80-20 train-validation ratio, giving us 11200 training samples and 2800 validation samples.
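The chunking step above can be sketched as follows. Note that a full 30 s clip at 22050 Hz yields up to 15 non-overlapping 2 s chunks; the post's total of 14000 (14 per track) suggests some chunks were dropped, possibly because GTZAN clips run slightly short of 30 s. The function below is an illustrative helper, not the post's exact code.

```python
import numpy as np

def split_into_chunks(y, label, sr=22050, chunk_seconds=2):
    """Cut a loaded waveform into non-overlapping fixed-length chunks,
    keeping the genre label with each chunk."""
    chunk_len = sr * chunk_seconds
    return [(y[i:i + chunk_len], label)
            for i in range(0, len(y) - chunk_len + 1, chunk_len)]

# Stand-in for a loaded 30 s GTZAN track
# (on real data: y, sr = librosa.load("blues.00000.au", sr=22050)):
track = np.zeros(30 * 22050, dtype=np.float32)
chunks = split_into_chunks(track, "blues")
print(len(chunks))   # 15 chunks of 2 s each
```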

 

Table 1. Distribution of the Dataset


Steps to Run to Create Music Genre Classification

 

Here, music genre classification using convolutional neural networks is performed using high-level features such as the spectrogram feature and the chroma feature. The Python programming language will be used for all steps of the work, from dataset collection, segmentation, and feature extraction through classification.

In order to create the music genre classification program, we have to run the following steps:

1)Music Dataset Collection

There are several music data providers; we use the GTZAN dataset, which is distributed through the open-source MARSYAS framework. For better accuracy we need separate datasets for training, validation, and testing: the validation set is used to tune the model during training, and after training and validation, testing on held-out data gives the final accuracy of the model.
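The 80-20 split described earlier can be sketched with scikit-learn's `train_test_split` (an assumption; the post does not name the tool it used). Random arrays stand in for the real features and genre labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data: 100 samples over 10 genre classes, 10 samples each.
X = np.random.rand(100, 128, 87)          # placeholder spectrogram features
y = np.repeat(np.arange(10), 10)          # placeholder genre labels 0..9

# stratify=y keeps the genre proportions equal in both splits.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

print(X_train.shape[0], X_val.shape[0])   # 80 20
```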

2) Extract features from music data

Librosa is a popular Python library for music and speech analysis. It is powerful because it includes many functions, such as feature extraction; since we are extracting the spectrogram feature, Librosa is especially useful here, as it has built-in functions to extract features from music data.

Here is some code for preparing the music files, i.e., extracting features from the music data:

import librosa
from tensorflow.keras.utils import to_categorical

x_, fs = librosa.load(music, duration=120) #load the audio; limit duration to 120 seconds
s = librosa.feature.melspectrogram(y=x_, sr=fs) #convert into a mel spectrogram
x_train, y_train, x_val, y_val, x_test, y_test = readDataset(location) #load from your dataset
x_train = x_train.reshape(x_train.shape[0], 128, 5168, 1) #reshape x_train to (num of samples, 128, 5168, 1)
x_val = x_val.reshape(x_val.shape[0], 128, 5168, 1) #reshape x_val to (num of samples, 128, 5168, 1)
x_train = x_train.astype('float32') / 255. #normalization of x_train
x_val = x_val.astype('float32') / 255. #normalization of x_val
y_train = to_categorical(y_train, 10) #one-hot encode labels (10 genres)
y_val = to_categorical(y_val, 10)

3)Train Model

As we know, convolutional neural networks are well suited to training on spectrogram features, so we use a CNN because it is a very efficient way to train the model. The architecture of a CNN can be seen below:



We can see from the architecture above that a CNN consists of several layers: an input layer, convolutional layers, subsampling/pooling layers, fully connected layers, and so on. A plain feed-forward neural network has no built-in ability to detect local patterns such as edges, while a CNN does, which makes it well suited to image classification problems. CNNs are computationally expensive to train, but in our experiments they gave better accuracy than the other classifiers we tried, so the extra training cost is worth it.
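A minimal Keras sketch of such a CNN is shown below, matching the layers named above (input, convolution, pooling, fully connected). The filter counts, kernel sizes, and dense-layer width are illustrative assumptions, not the post's exact model; the input shape assumes a 128-band mel spectrogram of one 2 s chunk (about 87 frames at a hop length of 512).

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 87, 1)),           # mel spectrogram of a 2 s chunk
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(10, activation="softmax"),     # one output per genre
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Training then follows the usual pattern, e.g. `model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=20, batch_size=32)`, with the one-hot labels prepared in the previous step.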

These are the steps we run for music genre classification.

 

 

 

Wednesday, November 25, 2020

Applications Of Music Genre Classification Using Deep Learning!

Hey guys! In this blog post I will be discussing how deep learning innovations are having an impact on major music streaming services across the world.

We all love to hear songs, don't we? Today, music streaming companies are using NLP, deep learning techniques, collaborative filtering, etc. to provide their users with a rich experience. Spotify has become the top music streaming platform, with millions of songs in its database. Every day, thousands of songs are released on platforms like Spotify, JioSaavn, Gaana, etc.

As the quantity of music being released continues to skyrocket, a need arises to accurately classify the metadata of the songs in the database, which proves useful for storage and retrieval purposes, taking the scalability of the application into consideration. Here is where deep learning shines: it helps classify the songs in a library or playlist by their genre. Clearly this benefits the streaming services, as they can get insights into which music genres their users prefer.

Spotify used intelligent deep learning techniques to create its famous Discover Weekly, a weekly music recommendation service that suggests songs to users depending on their playlists and preferences. Discover Weekly had a positive impact and was well received by users.

These music streaming services are just getting started and are using sophisticated deep learning techniques to improve their offerings. With AI the possibilities are endless, and with features like voice assistants and song mixing, these platforms can improve the user experience even more!
