THE USE OF HEURISTICS TO REDUCE NOISE IMPACT ON QUALITY OF RECOGNITION OF VOICE COMMANDS

Chi Thien Nguyen,
Ho Chi Minh City University of Technology (HCMUT), Ho Chi Minh City, Vietnam

DOI: 10.36724/2664-066X-2024-10-3-24-29

SYNCHROINFO JOURNAL. Volume 10, Number 3 (2024). P. 24-29.

Abstract

This paper addresses the recognition of voice commands in background noise. Training speech signals are noise-free, whereas test speech signals are usually noisy. Noise causes the spectra of test signals to deviate strongly from the spectra of their reference templates in the training set, so recognition quality drops sharply. When the spectrum of a noisy signal differs greatly from the spectrum of the corresponding clean signal, the degree of association between the two spectra can be quite small. Increasing that degree of association is expected to improve recognition quality. To achieve this, we propose adding a constant to the sample values of the amplitude spectra of both signals. Experimental results are reported for voice command recognition in additive white Gaussian noise with the proposed modification.
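The effect described above can be illustrated with a minimal sketch. The abstract specifies neither the measure of association nor the value of the constant, so everything specific here is an assumption made for illustration: cosine similarity stands in for the degree of association, the constant is set to the peak of the clean amplitude spectrum, a synthetic three-tone signal stands in for a voice command, and the function names are hypothetical.

```python
import numpy as np


def add_awgn(signal, snr_db, rng):
    """Add white Gaussian noise so the result has the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def amplitude_spectrum(signal):
    """Magnitude of the one-sided DFT of a real signal."""
    return np.abs(np.fft.rfft(signal))


def cosine_similarity(a, b):
    """Assumed stand-in for the 'degree of association' of two spectra."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def shifted_similarity(clean_spec, noisy_spec, constant):
    """Similarity after adding the same constant to every sample of both spectra."""
    return cosine_similarity(clean_spec + constant, noisy_spec + constant)


rng = np.random.default_rng(0)
fs = 8000                      # sampling rate in Hz (assumed)
n = 1024                       # frame length in samples (assumed)
t = np.arange(n) / fs

# Synthetic "clean" signal: three tones standing in for a voice command.
clean = np.zeros(n)
for freq in (250.0, 500.0, 1000.0):
    clean += np.sin(2.0 * np.pi * freq * t)

noisy = add_awgn(clean, snr_db=0.0, rng=rng)

clean_spec = amplitude_spectrum(clean)
noisy_spec = amplitude_spectrum(noisy)

# Hypothetical choice of constant: the peak of the clean amplitude spectrum.
constant = clean_spec.max()

before = cosine_similarity(clean_spec, noisy_spec)
after = shifted_similarity(clean_spec, noisy_spec, constant)
print(f"similarity without constant: {before:.3f}")
print(f"similarity with constant:    {after:.3f}")
```

Under these assumptions the similarity between the clean and noisy spectra rises once the constant is added. The sketch does not show whether recognition improves, because a constant also raises the similarity between spectra of different commands; that question is what the reported experiments evaluate.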

Keywords: communication technologies, artificial intelligence, geospatial industry, machine learning, voice commands, speech recognition, additive white Gaussian noise, training speech signals, identification, speech signal model
