Mel Spectrogram
Introduction
A mel spectrogram is an audio representation that is most often computed from a raw waveform as a preprocessing step before the audio is passed to a model for prediction. For example, speech-to-text models typically convert their raw audio input into a mel spectrogram before feeding it to the model. More generally, a mel spectrogram is a time-frequency visualization whose frequency axis is warped to reflect how the human ear perceives audio frequencies, and it is lower-dimensional than the waveform it is derived from. This compactness relative to the raw waveform, together with the closer match to human hearing, is the main reason we prefer to work with audio in mel spectrogram form. ...
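To make the perceptual idea concrete, here is a minimal sketch of the frequency warping behind the mel scale. It assumes the commonly used HTK-style formula, mel = 2595 * log10(1 + f / 700); other variants exist, and the text above does not say which one is meant. The function names (`hz_to_mel`, `mel_to_hz`, `mel_band_centers`) and the parameter values are illustrative choices, not part of any particular library.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, f_min: float, f_max: float) -> list:
    """Centre frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_mels + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_mels)]


# Hypothetical settings: 8 bands covering 0 Hz to 8 kHz.
centers = mel_band_centers(n_mels=8, f_min=0.0, f_max=8000.0)
for i, hz in enumerate(centers):
    print(f"band {i}: {hz:8.1f} Hz")
```

The bands are evenly spaced in mels but grow wider in Hz as frequency increases, which mirrors the ear's finer resolution at low frequencies. A full mel spectrogram would apply triangular filters centred at these frequencies to each frame of a short-time Fourier transform; that step is omitted here.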