Research on Speech Emotion Recognition Based on the Berlin Speech Emotion Database

2022-11-26 12:56:45

Total thesis length: 15,982 characters

Abstract

Research on human emotional communication is attracting growing interest, and speech emotion recognition is an active field within it. Speech is the most common way for people to communicate and exchange information, and a speech signal conveys not only linguistic content but also the speaker's emotional state. Speech emotion recognition identifies that state from the speaker's voice. The technology has applications in many fields. In healthcare, the emotional state inferred from conversations with a patient can help doctors assess mental illness. In criminal investigation, a suspect's emotional state can provide useful clues and speed up the solving of a case. In manned spaceflight, monitoring astronauts' emotional states makes it possible to follow their condition in space in real time and so supports the continued progress of the space program.

Speech emotion recognition involves four key techniques: preprocessing of the speech signal (pre-emphasis, framing, and windowing), extraction of Mel-Frequency Cepstral Coefficient (MFCC) features from emotional speech, training of a Support Vector Machine (SVM) model, and emotion recognition with the trained model. This work focuses on improving the emotion recognition rate on a single database, and the experiments are carried out on the Berlin speech emotion database.

Keywords: speech emotion recognition; speech feature extraction; speech feature dimensionality reduction; MFCC; preprocessing; support vector machine
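The preprocessing and MFCC stages named in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation, which is not visible in this excerpt. The 16 kHz sample rate, the 25 ms frame and 10 ms hop, the pre-emphasis coefficient 0.97, the 26 mel filters and 13 coefficients, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])


def frame_and_window(signal, frame_len, hop_len):
    """Split the signal into overlapping frames and apply a Hamming window."""
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(frame_len)


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank


def dct_ii(x, n_coeffs):
    """Unnormalized DCT-II along the last axis, keeping the first n_coeffs outputs."""
    n = x.shape[-1]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return x @ basis.T


def mfcc(signal, sample_rate=16000, frame_ms=25, hop_ms=10,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs) of MFCCs."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    frames = frame_and_window(pre_emphasis(signal), frame_len, hop_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return dct_ii(np.log(energies + 1e-10), n_coeffs)  # small offset avoids log(0)


def utterance_features(signal, sample_rate=16000):
    """Summarize frame-level MFCCs as one vector: per-coefficient mean and std."""
    coeffs = mfcc(signal, sample_rate)
    return np.concatenate([coeffs.mean(axis=0), coeffs.std(axis=0)])


if __name__ == "__main__":
    # Synthetic one-second 220 Hz tone standing in for a real recording.
    t = np.arange(16000) / 16000.0
    tone = np.sin(2 * np.pi * 220.0 * t)
    print(mfcc(tone).shape, utterance_features(tone).shape)
```

A fixed-length vector such as the one returned by `utterance_features` is the kind of input an SVM classifier (for example scikit-learn's `SVC`) could be trained on, one vector per labelled utterance. Summarizing frames by mean and standard deviation is only one simple choice, not necessarily the one used in this work.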

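Contents

Abstract (Chinese)

Abstract (English)

Chapter 1 Introduction

1.1 Research Background and Application Areas

1.2 Research Status at Home and Abroad and Open Problems

1.3 Research Content and Structure of the Thesis

1.4 Chapter Summary

Chapter 2 Emotion Classification and Speech Feature Analysis

2.1 Classification of Emotions

2.2 Speech Emotion Datasets

2.3 Analysis of Speech Emotion Features

2.3.1 Temporal Structure

2.3.2 Amplitude Structure

2.3.3 Fundamental Frequency Structure

2.4 Chapter Summary

Chapter 3 Speech Signal Preprocessing and Emotion Feature Extraction

3.1 Speech Preprocessing

3.2 Framing and Windowing

3.3 Endpoint Detection

3.4 MFCC Analysis

3.5 Chapter Summary

Chapter 4 Speech Emotion Recognition Based on SVM

4.1 Overview of Support Vector Machines

4.2 Choice of Kernel Function

4.3 Experimental Results

4.4 Chapter Summary

Chapter 5 Summary and Outlook

5.1 Summary

5.2 Outlook

Acknowledgments

References

Appendix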

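Chapter 1 Introduction

1.1 Research Background and Application Areas

The rapid development of computers has profoundly changed how people live. Computers are increasingly able to communicate, reason, and make decisions in ways that resemble human behavior, and they can help with many tasks that people cannot accomplish on their own. To make communication between computers and humans more convenient and intelligent, human-computer interaction has become an active research area. In the future, people will use computers to accomplish more of what is currently out of reach.

Speech is the fastest and most effective way for people to communicate, and it is an important research topic in human-computer interaction. Many companies and researchers have studied speech recognition and speech emotion recognition in depth and have made good progress: recognition rates for human speech have improved considerably, so that speech can be converted to text and a system can respond through speech synthesis. This still falls well short of true human-computer interaction, however, because the complex emotional states that people express when they communicate remain hard to judge accurately. The field has therefore attracted growing attention and deeper study.

The ultimate goal of speech emotion recognition is to enable human-computer interaction. Its value is not limited to that setting: it contributes to many aspects of daily life. Its applications include the following: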
