Thursday, November 26, 2020

The Mel Spectrogram - Input to the algorithm

There are two major challenges in classifying music genres:

  1. Musical genres are loosely defined, so people often disagree about the genre of a song.
  2. Extracting distinguishing features from audio data that can be fed into a model is a nontrivial task.

We have no control over the first problem. It is inherent to musical genres and will remain a limitation. The second problem has been heavily researched in the field of Music Information Retrieval (MIR), which is dedicated to extracting useful information from audio signals.

To build a model that can classify a song by its genre, it is very important to find good features. One feature that kept coming up was the mel spectrogram.

The Mel Spectrogram

The mel spectrogram can be thought of as a visual representation of an audio signal. Specifically, it represents how the spectrum of frequencies varies over time.

The Fourier transform is a mathematical operation that converts an audio signal from the time domain into the frequency domain. It gives the amplitude at each frequency, and we call this the spectrum. Since frequency content typically varies over time, we apply the Fourier transform to short, overlapping, windowed segments of the signal (the short-time Fourier transform) to see how the spectrum changes over time. The result is called the spectrogram. Finally, since humans do not perceive frequency on a linear scale, we map the frequencies to the mel scale (a measure of pitch), on which equal distances sound equally far apart to the human ear. What we get is the mel spectrogram.
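To make the mel mapping concrete, here is a minimal sketch of the conversion between hertz and mels. It assumes one widely used formula, m = 2595 * log10(1 + f / 700); the post does not say which variant of the mel scale was used, and the function names are illustrative.

```python
import math


def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (assumed formula, see lead-in)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# The same 100 Hz step covers more mels at low frequencies than at high ones,
# which mirrors how we hear pitch differences.
low_step = hz_to_mel(200.0) - hz_to_mel(100.0)
high_step = hz_to_mel(5100.0) - hz_to_mel(5000.0)
print(f"100 -> 200 Hz:   {low_step:.1f} mels")
print(f"5000 -> 5100 Hz: {high_step:.1f} mels")
```

Under this formula the step from 100 Hz to 200 Hz spans more mels than the step from 5000 Hz to 5100 Hz, because the mapping is roughly linear at low frequencies and logarithmic at high ones.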


Code for generating Mel Spectrogram:
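A minimal sketch of the steps described above, assuming only NumPy and a mono signal held in an array. The function names, the 2048-sample window, the 512-sample hop, the 128 mel bands and the mel formula are illustrative assumptions, not necessarily the settings used for this project. In practice a library such as librosa provides this as a ready-made function.

```python
import numpy as np


def hz_to_mel(f):
    # Assumed mel formula: m = 2595 * log10(1 + f / 700)
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge points: each filter uses three consecutive points
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bank = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, center, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank


def mel_spectrogram(y, sr, n_fft=2048, hop=512, n_mels=128):
    """Return a log-scaled mel spectrogram of shape (n_mels, n_frames).

    Assumes len(y) >= n_fft.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    # Overlapping windowed segments of the signal
    frames = np.stack(
        [y[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    # Power spectrum of each segment: shape (n_frames, n_fft // 2 + 1)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Map the linear frequency bins onto mel bands
    mel_power = mel_filterbank(sr, n_fft, n_mels) @ power.T
    # Convert to decibels; the floor avoids log(0)
    return 10.0 * np.log10(np.maximum(mel_power, 1e-10))


# Usage: one second of a 440 Hz tone stands in for a real recording.
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 440.0 * t)
S = mel_spectrogram(tone, sr)
print(S.shape)  # (n_mels, n_frames)
```

The resulting 2-D array can be displayed as an image, with time on the horizontal axis and mel band on the vertical axis.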

We now have a way to visually represent a song. Let’s take a look at some mel spectrograms from songs of different genres.

In effect, we have turned the problem into an image classification task. This is great because there is a type of model that was designed specifically for this task: the convolutional neural network (CNN).
