Design of an Intelligent Reversi Game Based on Reinforcement Learning (Graduation Thesis)

2021-12-22 09:12

Total length of the thesis: 17,417 characters

Abstract

Machine game playing (also called computer game playing) is regarded as an important topic in artificial intelligence research. It draws on a range of key techniques and theoretical foundations, and it connects to applications such as decision-support systems for international financial markets and intelligent robotics. Computer game playing is therefore both an important benchmark for the state of artificial intelligence and one of the field's classic problems. Within this area, board games are an important means of testing machine game-playing ability. Results from board-game research have already been applied in many areas of everyday life, for example financial decision-making and motion control.

Reversi (Othello) was invented in the late nineteenth century. Its simple rules and control logic make it a suitable vehicle for this study: the work can concentrate on algorithms instead of spending effort on the details of game logic, and because the board is only 8×8, the search space is far smaller than that of games such as Go, which saves considerable computing power. This study focuses on algorithm implementation. Using Reversi as the vehicle, the project implements three game-playing algorithms, the most important of which combines a deep learning network with Monte Carlo Tree Search (MCTS). The main idea of this algorithm is to estimate the relevant probabilities from a large number of Monte Carlo samples: the more samples are drawn, the closer the estimate comes to the true evaluation of the position, which in turn allows a better policy to be chosen to guide move selection.

Keywords: Reversi; artificial intelligence; reinforcement learning; Monte Carlo Tree Search; MCTS

Abstract (Chinese)

Abstract (English)

Chapter 1 Introduction

1.1 Background and Significance

1.2 Research Status at Home and Abroad

1.3 Research Content

1.4 Thesis Structure

Chapter 2 Perfect-Information Games

2.1 Introduction

2.2 Overview of Perfect-Information Games and Reversi

2.2.1 Rules of Reversi

2.2.2 Game Complexity of Reversi

2.2.3 Game Trees

2.3 Traditional Machine Game-Playing Methods

2.3.1 Search Strategies

2.3.2 Static Evaluation Strategies

2.3.3 Dynamic Evaluation Strategies

2.4 Chapter Summary

Chapter 3 Monte Carlo Tree Search Based on Reinforcement Learning

3.1 Machine Learning Tools

3.1.1 Python

3.1.2 Keras and TensorFlow

3.1.3 Artificial Neural Networks

3.2 Monte Carlo Tree Search

3.2.1 UCT-Based Monte Carlo Tree Search

3.2.2 Reinforcement Learning

3.2.3 Combining Deep Learning Networks with Monte Carlo Tree Search

3.3 Implementation of the Three Algorithms

3.3.1 Implementation of the Random-Strategy Algorithm

3.3.2 Implementation of the Greedy-Strategy Algorithm

3.3.3 Implementation of the Reinforcement-Learning Algorithm

3.4 Chapter Summary

Chapter 4 Design and Implementation of the Reversi System

4.1 Layered Design of the System

4.1.1 Layered Structure of the Project

4.1.2 File Organization of the Project

4.1.3 Run-Time Flow of the Application

4.2 Design and Implementation of Specific Functions

4.2.1 Design and Implementation of the Game Logic

4.2.2 Design and Implementation of the Artificial Neural Network

4.2.3 Design and Implementation of Monte Carlo Tree Search

4.3 Chapter Summary

Chapter 5 Summary and Outlook

5.1 Main Work of This Thesis

5.2 Further Outlook

References

Chapter 1 Introduction

1.1 Background and Significance

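With the rapid development of Internet technology in recent years, and in particular the large gains in computing power, artificial intelligence has once again become one of the most active areas of scientific research. In natural language processing it enables better human-computer interaction and text sentiment analysis; in computer vision it helps computers recognize images and other visual information; in medicine, intelligent diagnosis has also achieved notable results. Artificial intelligence has undoubtedly brought great value to society. As these lines of research have matured, intelligent agents have advanced as well, and the algorithms developed for board games in particular have produced remarkable results. At the end of the last century, world chess champion Garry Kasparov was defeated by Deep Blue, the supercomputer developed by IBM. In March 2016, Go world champion Lee Sedol was defeated by AlphaGo, a Go program that uses Monte Carlo tree search. In October 2017, the AlphaGo team presented its latest result, AlphaGo Zero, which trained by playing against itself and within two months defeated all previous versions; the work was published in Nature.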

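Because most board games give rise to an enormous state space during machine play, it is difficult to use a concrete evaluation function to assess a position and guide move selection. For this reason, machine game playing for board games has long been regarded as one of the most challenging classic problems. Machine game playing not only serves as a test of progress in artificial intelligence, but can also be applied in other areas of artificial intelligence to create practical value. From the perspective of game theory, machine game playing is in essence a Markov decision process, a model that can also describe a very wide range of human decision-making, including military and financial decisions. Research on machine game playing for board games can therefore advance the field of artificial intelligence and generate more value for society.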
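To make the sampling idea from the abstract concrete, the following is a minimal Python sketch of how a position can be scored without a hand-written evaluation function: play many uniformly random games from the position and average the outcomes. It is an illustration under stated assumptions, not the thesis's implementation. It assumes the standard 8×8 board and opening position, and every name in it (initial_board, flips, legal_moves, play, random_playout, estimate_win_rate) is hypothetical.

```python
import random

SIZE = 8  # assumption: standard Othello board
DIRS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def initial_board():
    """Return the standard opening position: 1 = black, -1 = white, 0 = empty."""
    board = [[0] * SIZE for _ in range(SIZE)]
    m = SIZE // 2
    board[m - 1][m - 1] = board[m][m] = -1
    board[m - 1][m] = board[m][m - 1] = 1
    return board


def flips(board, player, r, c):
    """List the opponent discs that `player` would flip by playing at (r, c)."""
    if board[r][c] != 0:
        return []
    captured = []
    for dr, dc in DIRS:
        line = []
        rr, cc = r + dr, c + dc
        # Walk over a run of opponent discs in this direction.
        while 0 <= rr < SIZE and 0 <= cc < SIZE and board[rr][cc] == -player:
            line.append((rr, cc))
            rr, cc = rr + dr, cc + dc
        # The run counts only if it is closed off by one of the player's discs.
        if line and 0 <= rr < SIZE and 0 <= cc < SIZE and board[rr][cc] == player:
            captured.extend(line)
    return captured


def legal_moves(board, player):
    """Map every legal square for `player` to the discs it would flip."""
    moves = {}
    for r in range(SIZE):
        for c in range(SIZE):
            captured = flips(board, player, r, c)
            if captured:
                moves[(r, c)] = captured
    return moves


def play(board, player, move, captured):
    """Return a new board after `player` places a disc and flips `captured`."""
    new = [row[:] for row in board]
    new[move[0]][move[1]] = player
    for r, c in captured:
        new[r][c] = player
    return new


def random_playout(board, player, rng):
    """Play random moves until neither side can move; return black minus white."""
    passes = 0
    while passes < 2:
        moves = legal_moves(board, player)
        if moves:
            move = rng.choice(sorted(moves))
            board = play(board, player, move, moves[move])
            passes = 0
        else:
            passes += 1
        player = -player
    return sum(map(sum, board))


def estimate_win_rate(board, player, n_samples, seed=0):
    """Share of random playouts won by `player`, who moves first (draw = 0.5)."""
    rng = random.Random(seed)
    score = 0.0
    for _ in range(n_samples):
        diff = random_playout(board, player, rng) * player
        score += 1.0 if diff > 0 else 0.5 if diff == 0 else 0.0
    return score / n_samples
```

For example, estimate_win_rate(initial_board(), 1, 200) estimates black's win rate from the opening position. Raising n_samples reduces the variance of the estimate, but the figure still describes random play, not strong play; Monte Carlo tree search improves on this by concentrating its samples on the more promising moves.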

1.2 Research Status at Home and Abroad

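Because machine game playing has long been one of the most important branches of artificial intelligence, workshops and tournaments devoted to it are held every year both in China and abroad. The International Computer Games Association, for example, holds not only an annual academic conference but also a series of related competitions. In China, the national machine game professional committee likewise organizes competitions every year; at present these cover more than 20 game events, including classic board games such as Go, Chinese chess (xiangqi), and chess.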
