Research and Implementation of a Deep Image Classification Algorithm Based on the Attention Mechanism (Graduation Thesis)

2021-11-09 21:41:03

Abstract

To improve the accuracy of image recognition and image classification, the usual approach is to increase the depth or width of the network. Once the network reaches a certain depth or width, however, problems such as vanishing and exploding gradients appear, and accuracy becomes difficult to improve further. This study combines the attention mechanism with conventional neural networks to improve image classification accuracy. Specifically, the attention mechanism is combined with the residual learning idea of existing residual neural networks to build an attention module, and attention modules are then stacked to build a residual attention network. The resulting residual attention network can be used together with any existing network, which makes it highly practical. According to the number of layers, two residual attention networks are constructed, attention-56 and attention-92, each in two variants: one designed for the CIFAR-10 dataset and one for the ImageNet dataset. The attention-92 network is trained and validated on the CIFAR-10 dataset, and the results show that, compared with conventional neural networks, the residual attention network reaches higher classification accuracy with fewer model parameters. Finally, the mixup data augmentation method is applied to the network, which improves classification accuracy further.

Keywords: attention mechanism; image classification; residual network; mixup data augmentation

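Chapter 1 Introduction

1.1 Research Background and Significance

1.2 Research Status in China and Abroad

1.3 Main Research Content and Objectives

1.4 Organization of the Thesis

Chapter 2 Basic Units for Building the Model

2.1 Common Basic Units for Building Neural Networks

2.1.1 Convolutional Layer

2.1.2 Pooling Layer

2.1.3 Fully Connected Layer

2.1.4 Batch Normalization Layer

2.1.5 Linear Interpolation Layer

2.2 Residual Unit

2.3 mixup Data Augmentation

2.4 Chapter Summary

Chapter 3 Model Design and Construction

3.1 Structure Diagram of the Attention Module in the Residual Attention Network

3.1.1 Structure of the Soft Mask Branch

3.1.2 Choice of Attention Type for the Soft Mask Branch

3.1.3 Structure of the Trunk Branch

3.1.4 How the Soft Mask Branch Takes Effect

3.2 Structure Diagram of the Residual Attention Network

3.2.1 Detailed Structure Diagrams of the Bottom-Up and Top-Down Parts of Each Stage

3.3 Structural Details of the Residual Attention Network

3.3.1 Structural Details of the Four Residual Attention Networks

3.4 Chapter Summary

Chapter 4 Model Training and Result Analysis

4.1 Model Training

4.1.1 Datasets

4.1.2 Expected Training Process

4.2 Result Analysis

4.2.1 Code and Training Environment

4.2.2 Actual Training Process

4.2.3 Experimental Results

4.2.4 Analysis of Results

4.3 Chapter Summary

Chapter 5 Conclusion and Outlook

References

Acknowledgements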

Chapter 1 Introduction

1.1 Research Background and Significance

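As technology advances, computer vision is being applied ever more widely, making daily life more convenient and efficient. Image classification is a core technique of computer vision, and related algorithms and neural networks have been proposed for decades. Improving the efficiency and accuracy of image classification would also advance related technologies such as image recognition, object detection, and autonomous driving, giving a strong push to computer vision as a whole and bringing further convenience to everyday life. The attention mechanism is a method that can be combined with existing neural networks; when combined with an existing network, it can improve the speed and accuracy of image classification. It therefore offers a way to build better deep image classification networks and is worth studying and exploring.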

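Attention is a mechanism for improving encoder-decoder models built on the recurrent neural network (RNN)^[1] or on its more advanced variants, GRU^[2] and LSTM^[3]; it is commonly called the attention mechanism. It is currently very popular and is widely used across natural language processing and computer vision, in tasks such as image recognition, image classification, and machine translation. One reason for its popularity is that it closely resembles the way people observe the world: when looking at something, people tend to pick out selectively the parts or features that matter for their own purposes, rather than examining every point of the object. For example, when answering reading comprehension questions in the CET-6 English exam, we do not first read the whole passage and memorize every detail; we read the questions first, identify the key points needed to answer them, and then read the passage selectively with those points in mind, which is more efficient. In the same way, when a model that uses attention extracts features from its input, it assigns different weights to different parts of the input according to their importance, so that it extracts the more critical information and produces more accurate results. Attention achieves this without a large increase in the model's computational or storage cost, which is another reason it is so popular. Attention gives the model the ability to find the more important parts of its input and extract the key information. By selecting the more important data in the input, it reduces the dimensionality of the data and the amount of data to be processed, which helps avoid running short of computing power or memory when the input is large or high-dimensional.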
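The idea of weighting input parts by importance can be illustrated with a minimal sketch. It is a generic soft-attention example, not the attention module designed in this thesis: the function names `softmax` and `attend`, the feature vectors, and the importance scores are all hypothetical, and in a real model the scores would be learned rather than fixed by hand.

```python
import math

def softmax(scores):
    """Turn raw importance scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(features, scores):
    """Return the attention weights and the weighted sum of the feature vectors."""
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context

# Three input parts (for example, image regions), each a 2-dimensional feature vector.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical importance scores (assumption: hand-picked, not learned).
scores = [2.0, 0.0, 0.0]

weights, context = attend(features, scores)
print(weights)  # the first part receives the largest weight
print(context)  # one 2-dimensional vector summarizing all three parts
```

The three input vectors are condensed into a single vector dominated by the highest-scoring part, which is the sense in which attention reduces the amount of data that later layers must process.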
