
Return Forecasting Based on a Random Forest Model (Undergraduate Thesis)

 2022-01-19 08:01  

Total thesis length: 16,845 characters

Abstract

As stock markets have developed, so has research on forecasting them. To limit investment risk and maximize returns, investors must use the information available to them to predict future market conditions. Traditional forecasting methods include linear analysis, technical analysis, neural networks, support vector machines, and cluster analysis. Stock prices are hard to predict because their movements are not linear and are driven by many factors, and the traditional methods are either inaccurate or computationally heavy and cumbersome. Random forests, by contrast, are comparatively simple to compute, resist overfitting well, and predict nonlinear systems well, so this thesis applies the random forest algorithm to the problem.

Using the random forest algorithm, a classical machine learning method, this thesis takes data on 100 stocks from the Shanghai securities market as its research sample, builds a corresponding indicator system, and uses correlation analysis for a preliminary screening of the indicators. After the stock data are analyzed and optimized, values are chosen for mtry (the number of candidate variables preselected at each tree node) and ntree (the number of trees in the forest), and an importance test screens the indicators further to improve prediction accuracy. With data from January and February 2017 as the training set, the model is tested on stock rises and falls in March and reaches a prediction accuracy of 93.45%. Finally, the random forest is compared with a regression-based prediction model, the logistic algorithm; the comparison shows that the random forest predicts more accurately.

Keywords: random forest; stock market; forecast; indicator system
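The workflow summarized in the abstract (grow ntree trees on bootstrap samples, consider only mtry randomly chosen variables at each node, train on earlier observations, and test on later ones) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: the data are synthetic, the label rule and the `max_depth` limit are invented for the example, and the 93.45% figure is not reproduced here.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, labels, mtry, rng):
    """Find the lowest-impurity split among `mtry` randomly chosen features."""
    best = None  # (weighted impurity, feature index, threshold)
    for f in rng.sample(range(len(rows[0])), mtry):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow_tree(rows, labels, mtry, rng, depth=0, max_depth=4):
    """Grow one tree; a leaf is a class label, a node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels, mtry, rng)
    if split is None:
        return majority
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow_tree([rows[i] for i in left], [labels[i] for i in left],
                      mtry, rng, depth + 1, max_depth),
            grow_tree([rows[i] for i in right], [labels[i] for i in right],
                      mtry, rng, depth + 1, max_depth))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(rows, labels, ntree, mtry, seed=0):
    """Grow `ntree` trees, each on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], mtry, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def accuracy(forest, rows, labels):
    """Share of rows whose predicted label matches the true label."""
    return sum(predict_forest(forest, r) == y for r, y in zip(rows, labels)) / len(rows)

# Synthetic stand-in data (assumption): 4 indicators, label 1 ("up") when the
# first indicator is positive, 0 ("down") otherwise.
data_rng = random.Random(42)
rows = [[data_rng.uniform(-1, 1) for _ in range(4)] for _ in range(120)]
labels = [1 if r[0] > 0 else 0 for r in rows]

# Chronological-style split: earlier rows train the forest, later rows test it.
train_rows, train_labels = rows[:80], labels[:80]
test_rows, test_labels = rows[80:], labels[80:]

forest = fit_forest(train_rows, train_labels, ntree=15, mtry=2)
acc = accuracy(forest, test_rows, test_labels)
print(f"test accuracy: {acc:.2f}")
```

In this sketch, `mtry` and `ntree` play the same roles as the two parameters tuned in the thesis; trying several values of each and comparing the resulting test accuracy would be one simple way to choose them.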

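Contents

Chapter 1 Introduction

1.1 Research Background

1.2 Research Methods

1.3 State of Research

1.4 Limitations of Traditional Research Methods

Chapter 2 Introduction to the Random Forest Algorithm

2.1 Decision Trees

2.2 Random Forests

2.2.1 Overview of Random Forests

2.2.2 Steps of the Random Forest Algorithm

2.2.3 Advantages of the Random Forest Algorithm

Chapter 3 Empirical Analysis with Random Forests

3.1 Data Selection and Optimization

3.1.1 Building the Indicator System

3.1.2 Empirical Correlation Study

3.1.3 Data Optimization

3.1.4 Determining the Optimal mtry Value

3.1.5 Determining the Optimal ntree Value

3.2 Importance Testing and Screening

3.3 Prediction

3.3.1 Selecting the Optimal mtry Value

3.3.2 Determining the Optimal ntree Value

3.3.3 Testing on the Training Set

3.3.4 Testing on the Test Set

Chapter 4 Comparison with the Logistic Algorithm

4.1 Modeling and Analysis

4.2 Significance Tests

4.3 Prediction Results and Comparison

Chapter 5 Conclusions and Outlook

References

Appendix

Acknowledgements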

Chapter 1 Introduction

1.1 Research Background
