<script type="text/javascript">
<!--
document.write('<div id="oa_widget"></div>');
document.write('<script type="text/javascript" src="https://beta.openaire.eu/index.php?option=com_openaire&view=widget&format=raw&projectId=undefined&type=result"></script>');
-->
</script>

COPY SCRIPT

For further information contact us at helpdesk@openaire.eu

Contribution of Common Modulation Spectral Features to Vocal-Emotion Recognition of Noise-Vocoded Speech in Noisy Reverberant Environments

descriptionPublicationkeyboard_double_arrow_right Article , Other literature type 04 Oct 2022Publisher:MDPI AGJournal:Applied Sciences, volume 12, page 9,979 (eissn: 2076-3417,

Authors: Taiyang Guo; Zhi Zhu; Shunsuke Kidani; Masashi Unoki;

doi: 10.3390/app12199979

Contribution of Common Modulation Spectral Features to Vocal-Emotion Recognition of Noise-Vocoded Speech in Noisy Reverberant Environments

- Summary
- Subjects
- Metrics

Abstract

In one study on vocal emotion recognition using noise-vocoded speech (NVS), the high similarities between modulation spectral features (MSFs) and the results of vocal-emotion-recognition experiments indicated that MSFs contribute to vocal emotion recognition in a clean environment (with no noise and no reverberation). Other studies also clarified that vocal emotion recognition using NVS is not affected by noisy reverberant environments (signal-to-noise ratio is greater than 10 dB and reverberation time is less than 1.0 s). However, the contribution of MSFs to vocal emotion recognition in noisy reverberant environments is still unclear. We aimed to clarify whether MSFs can be used to explain the vocal-emotion-recognition results in noisy reverberant environments. We analyzed the results of vocal-emotion-recognition experiments and used an auditory-based modulation filterbank to calculate the modulation spectrograms of NVS. We then extracted ten MSFs as higher-order statistics of modulation spectrograms. As shown from the relationship between MSFs and vocal-emotion-recognition results, except for extremely high noisy reverberant environments, there were high similarities between MSFs and the vocal emotion recognition results in noisy reverberant environments, which indicates that MSFs can be used to explain such results in noisy reverberant environments. We also found that there are two common MSFs (MSKTk (modulation spectral kurtosis) and MSTLk (modulation spectral tilt)) that contribute to vocal emotion recognition in all daily environments.

Related Organizations

Keywords

Technology, QH301-705.5, T, Physics, QC1-999, noisy reverberant environment, Engineering (General). Civil engineering (General), vocal emotion recognition, Chemistry, noise-vocoded speech, TA1-2040, Biology (General), modulation spectral feature; vocal emotion recognition; noise-vocoded speech; noisy reverberant environment, QD1-999, modulation spectral feature

Impact byBIP!

	citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	2
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 10%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average

Found an issue? Give us feedback

Top 10%

Average

gold

Fields of Science (3) View all

engineering and technology

electrical engineering, electronic engineering, information engineering

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

View all

Related to Research communities

Energy Research