Compact and Robust MFCC-based Space-Saving Audio Fingerprint Extraction for Efficient Music Identification on FM Broadcast Monitoring

Authors

  • Myo Thet Htun, Faculty of Computer Systems and Technologies, University of Computer Studies, No. (4) Main Road, Yangon, 11411, Myanmar

DOI:

https://doi.org/10.5614/itbj.ict.res.appl.2022.16.3.3

Keywords:

audio fingerprinting, FM broadcast monitoring, MFCC feature extraction, music identification, Philips Robust Hashing, space-saving audio fingerprints

Abstract

The Myanmar music industry urgently needs an efficient broadcast monitoring system to address copyright infringement and illegal benefit-sharing between artists and broadcasting stations. This paper proposes a broadcast monitoring system for Myanmar FM radio stations that uses space-saving audio fingerprint extraction based on Mel Frequency Cepstral Coefficients (MFCC). The study focuses on reducing the memory required for fingerprint storage while preserving the robustness of the audio fingerprints to common distortions such as compression and noise addition. In this system, a three-second audio clip is represented by a 2,712-bit fingerprint block, about one third the size of the 8,192-bit block that Philips Robust Hashing (PRH), one of the dominant audio fingerprinting methods, uses for a clip of the same length. The proposed system is easy to implement and achieves correct and speedy music identification even on noisy and distorted broadcast audio streams. To evaluate its performance, we used an audio fingerprint database of 7,094 songs and broadcast audio streams from four local FM channels in Myanmar. The experimental results showed that the system achieved reliable performance.
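The storage comparison in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the two block sizes, the three-second clip length, and the 7,094-song count come from the abstract, while the function names, the 240-second average song length, and the non-overlapping block layout are assumptions made here for illustration and are not taken from the paper.

```python
# Figures stated in the abstract.
PROPOSED_BITS = 2_712   # proposed MFCC-based fingerprint block (3-second clip)
PRH_BITS = 8_192        # Philips Robust Hashing fingerprint block (3-second clip)
CLIP_SECONDS = 3


def storage_summary(proposed_bits=PROPOSED_BITS, prh_bits=PRH_BITS,
                    clip_seconds=CLIP_SECONDS):
    """Compare the two block sizes per second of audio."""
    ratio = proposed_bits / prh_bits
    return {
        "proposed_bits_per_second": proposed_bits / clip_seconds,
        "prh_bits_per_second": prh_bits / clip_seconds,
        "size_ratio": ratio,                  # proposed / PRH
        "saving_percent": (1 - ratio) * 100,  # memory saved relative to PRH
    }


def database_megabytes(bits_per_block, n_songs, song_seconds,
                       clip_seconds=CLIP_SECONDS):
    """Rough database size in MB (10**6 bytes).

    ASSUMPTION: each song is stored as consecutive, non-overlapping
    three-second blocks. The paper's actual storage layout may differ.
    """
    blocks_per_song = song_seconds // clip_seconds
    total_bits = bits_per_block * blocks_per_song * n_songs
    return total_bits / 8 / 1_000_000


if __name__ == "__main__":
    summary = storage_summary()
    print(f"size ratio: {summary['size_ratio']:.3f}")
    print(f"saving:     {summary['saving_percent']:.1f}%")

    # 7,094 songs is from the abstract; 240 s per song is a hypothetical average.
    for name, bits in (("proposed", PROPOSED_BITS), ("PRH", PRH_BITS)):
        size_mb = database_megabytes(bits, n_songs=7_094, song_seconds=240)
        print(f"{name}: {size_mb:.1f} MB")
```

The size ratio does not depend on the assumed song length or block layout; only the absolute megabyte figures do, so those should be read as order-of-magnitude estimates.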

Published

2022-12-27

How to Cite

Myo Thet Htun. (2022). Compact and Robust MFCC-based Space-Saving Audio Fingerprint Extraction for Efficient Music Identification on FM Broadcast Monitoring. Journal of ICT Research and Applications, 16(3), 226-242. https://doi.org/10.5614/itbj.ict.res.appl.2022.16.3.3