Here, music genre classification is performed with a Convolutional Neural Network (CNN) using high-level features such as spectrogram and chroma features. The Python programming language is used for every step of the work, from dataset collection and segmentation through feature extraction to classification.
To create a music genre classification program, we run the following steps:
1) Music Dataset Collection
There are several sources of music data, such as the GTZAN dataset. GTZAN is a freely available dataset rather than a software framework, and it helps a lot. For reliable accuracy figures we need separate datasets for training, validation, and testing. The validation set is used during training; once training and validation are finished, the model is run on the test set, which gives the accuracy for our dataset.
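As a rough illustration of the three-way split described above, here is a minimal sketch in plain Python. The function name `split_dataset`, the 70/15/15 proportions, the fixed seed, and the file names are all assumptions made for the example; they are not taken from the original program.

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle items and split them into train, validation and test lists."""
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must be between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # everything left over
    return train, val, test

# Hypothetical (file name, genre label) pairs, for illustration only.
files = [(f"clip_{i:03d}.wav", i % 3) for i in range(100)]
train, val, test = split_dataset(files)
print(len(train), len(val), len(test))
```

This is a plain random split. If some genres have only a few clips, splitting each genre separately (a stratified split) is a common alternative so that every genre appears in all three sets.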
2) Extract Features from Music Data
Here we use Librosa, a popular Python library for music and speech analysis. It is powerful because it includes many functions, among them feature extraction. As mentioned above, we extract spectrogram features, and Librosa has built-in functions for extracting them from music data.
Here is some code for preparing the music files, that is, extracting features from the music data:
import librosa
from keras.utils import to_categorical

x_, fs = librosa.load(music, duration=120)  # load at most the first 120 seconds of the music file
s = librosa.feature.melspectrogram(y=x_, sr=fs)  # convert to a mel spectrogram (128 mel bands by default)
x_train, y_train, x_val, y_val, x_test, y_test = readDataset(location)  # readDataset is your own loader for the prepared dataset
x_train = x_train.reshape(x_train.shape[0], 128, 5168, 1)  # reshape x_train into (num of data, 128, 5168, 1)
x_val = x_val.reshape(x_val.shape[0], 128, 5168, 1)  # reshape x_val into (num of data, 128, 5168, 1)
x_train = x_train.astype('float32') / 255.  # normalization of x_train
x_val = x_val.astype('float32') / 255.  # normalization of x_val
y_train = to_categorical(y_train, 3)  # one-hot encode the labels (3 genres)
y_val = to_categorical(y_val, 3)
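The reshape to 128 x 5168 only works when every spectrogram has exactly 5168 frames, so shorter or longer clips need to be padded or cut first. The sketch below shows one way this preparation could look using NumPy only. The helper names `fix_frames` and `prepare_batch` are hypothetical, the random arrays merely stand in for real mel spectrograms, and the min-max scaling is an assumed alternative to the division by 255, not the original method.

```python
import numpy as np

def fix_frames(spec, n_frames):
    """Zero-pad or truncate a (n_mels, t) spectrogram to exactly n_frames columns."""
    n_mels, t = spec.shape
    if t >= n_frames:
        return spec[:, :n_frames]
    padded = np.zeros((n_mels, n_frames), dtype=spec.dtype)
    padded[:, :t] = spec
    return padded

def prepare_batch(specs, labels, n_frames, n_classes):
    """Stack spectrograms into (n, n_mels, n_frames, 1) and one-hot encode labels."""
    x = np.stack([fix_frames(s, n_frames) for s in specs]).astype("float32")
    # Min-max scaling to [0, 1]; an assumed alternative to dividing by 255.
    span = x.max() - x.min()
    x = (x - x.min()) / span if span > 0 else np.zeros_like(x)
    x = x[..., np.newaxis]  # add the single channel axis expected by a CNN
    y = np.eye(n_classes, dtype="float32")[np.asarray(labels)]
    return x, y

# Random arrays stand in for real mel spectrograms of different lengths.
rng = np.random.default_rng(0)
specs = [rng.random((128, t)) for t in (5000, 5168, 5300)]
x, y = prepare_batch(specs, labels=[0, 2, 1], n_frames=5168, n_classes=3)
print(x.shape, y.shape)  # (3, 128, 5168, 1) (3, 3)
```

In practice the scaling constants would be computed on the training set and reused for the validation and test sets, so that all three are normalized the same way.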
3) Train Model
A Convolutional Neural Network (CNN) is well suited to this task, so we use one to train on the spectrogram features, because it is a very effective way to train the model. The architecture of the CNN can be seen below:
As the architecture above shows, a CNN consists of several layers, such as an input layer, convolutional layers, subsampling/pooling layers, fully connected layers, and so on. A plain feed-forward neural network has no built-in edge detection, whereas a CNN does, which makes it well suited to image classification problems. The drawback is that a CNN is computationally costly, but it gives better accuracy than the other classifiers. We tried several classifiers instead of a CNN and found that the CNN performed better after training on the data.
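To make the edge-detection remark concrete, here is a small NumPy sketch of what one convolutional layer followed by one pooling layer computes. It is a toy illustration under stated assumptions, not the network used here: the 6x6 input, the hand-picked kernel, and the function names `conv2d_valid` and `max_pool2x2` are all made up for the example. In a real CNN the kernel weights are learned during training.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1), as CNN layers do."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]  # drop an odd last row/column
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 6x6 "spectrogram": quiet on the left, loud on the right.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked vertical-edge kernel; a trained CNN learns such weights itself.
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

edges = np.maximum(conv2d_valid(image, kernel), 0.0)  # convolution + ReLU
pooled = max_pool2x2(edges)
print(edges.shape, pooled.shape)  # (4, 4) (2, 2)
```

The feature map is non-zero only where the window straddles the jump from quiet to loud, which is the edge. Pooling then halves each dimension while keeping the strongest response, which is how the subsampling layers shrink a large spectrogram before the fully connected layers.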
These are the steps we run for music genre classification.