ADP-Based Swing and Balance Control of a Horizontal-Bar Gymnastic Robot (Graduation Thesis)

2021-03-19 09:03

Summary

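A horizontal-bar gymnastic robot is a complex underactuated system: a robot with an unactuated joint. An underactuated system is a nonlinear system with fewer control inputs than degrees of freedom, and the robot studied here belongs to this class, so its swing-up and balance control is highly complex and nonlinear. In return, having fewer actuated joints lowers cost and energy consumption. The gymnastic robot is a typical nonlinear system, and approximate dynamic programming (ADP) can be used to solve its swing and balance control problem. This thesis uses ADP to study the swing and balance control of a two-link gymnastic robot made up of one active joint and one passive joint.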

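Approximate dynamic programming (ADP) is an emerging method in optimal control, suited to finding approximately optimal solutions for nonlinear stochastic systems. ADP derives an evaluation signal from the environment and uses that signal to approximate the optimal control policy. It does not rely on an analytical model of the environment or of the controlled plant, so it is broadly applicable, can run online, and can handle time-varying, dynamically changing complex systems. Combined with other theories (neural networks, adaptive critic designs, reinforcement learning and classical dynamic programming), ADP can overcome the "curse of dimensionality" of the DP method. Because ADP can also yield a closed-loop feedback control law for the optimal control of nonlinear systems, it is regarded as an effective method for nonlinear optimal control.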

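This thesis mainly studies how to use ADP to keep the gymnastic robot at its unstable upright equilibrium point. A critic (evaluation) network and an action (execution) network are constructed first. The state variables of the robot system are fed to both networks, and the output of the action network is also fed to the critic network. The output of the critic network estimates the cost function; the difference between this estimate and the cost function is used to adjust the critic network's weights, and the adjusted critic network in turn drives the weight update of the action network. The training error functions of both networks follow Bellman's principle of optimality. In the swing-up phase, energy serves as the evaluation index: the torque magnitude is adjusted according to the energy and the torque direction is kept the same as the direction of the angular velocity, until the energy approaches the robot's maximum potential energy, that is, until the robot is close to the upright position. During the swing-up, the torque is also made to decrease as the robot's potential energy increases.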
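The energy-based swing-up rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the function name `swing_up_torque`, the proportional `gain`, the saturation limit `max_torque`, the use of total mechanical energy, and the choice of the driven joint's angular velocity are all hypothetical choices made for the example.

```python
import math


def swing_up_torque(energy, target_energy, angular_velocity,
                    gain=1.0, max_torque=5.0):
    """Illustrative energy-based swing-up torque (all parameters are assumptions).

    energy           -- current mechanical energy of the robot
    target_energy    -- energy at the upright position (maximum potential energy)
    angular_velocity -- angular velocity of the driven joint
    """
    deficit = target_energy - energy
    if deficit <= 0.0:
        # Enough energy has been injected; hand over to the balance controller.
        return 0.0
    # The magnitude shrinks as the energy approaches the target, and is
    # saturated so the torque never becomes too large.
    magnitude = min(max_torque, gain * deficit)
    # Torque in the same direction as the angular velocity gives
    # torque * velocity >= 0, so the actuator adds energy to the system.
    # copysign returns +1.0 for a velocity of 0.0, which acts as an
    # arbitrary initial push when the robot starts at rest.
    direction = math.copysign(1.0, angular_velocity)
    return direction * magnitude
```

With the default (assumed) parameters, the torque is saturated far from the target energy, falls off linearly near it, and is switched off once the target is reached, which is the point where a balance controller would take over.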

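Simulation of the swing and balance control showed two things. In the swing-up phase, a suitable torque has to be chosen: large enough to bring the robot to the upright position, but not excessive. In the balance phase, suitable parameters have to be chosen for the robot to stay near the upright position for a certain period of time.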

Keywords: approximate dynamic programming; horizontal-bar gymnastic robot; ADHDP; BP neural network; reinforcement learning

Abstract

The horizontal-bar gymnastic robot is a complex underactuated system, that is, a robot with an unactuated joint. An underactuated system is a nonlinear system whose number of control inputs is smaller than its number of degrees of freedom, and the robot studied in this thesis is of this type. Its swing and balance control is therefore highly complex and nonlinear. On the other hand, reducing the number of driven joints also reduces cost and energy consumption. The gymnastic robot is a typical example of a nonlinear system, and the approximate dynamic programming (ADP) method can be used to solve its swing and balance control problem. This thesis applies ADP to the swing and balance control of a two-link gymnastic robot composed of one active joint and one passive joint.

Approximate dynamic programming (ADP) is a new control method in the field of optimal control and is suited to finding approximately optimal solutions for nonlinear stochastic systems. ADP produces an evaluation signal from the environment and uses it to approximate the optimal control policy. Because it does not depend on an analytical model of the environment or of the controlled plant, ADP is broadly applicable, can be carried out online, and can deal with time-varying and dynamically changing complex systems. Combined with other theories (including neural networks, adaptive critic designs, reinforcement learning and classical dynamic programming), ADP can overcome the "curse of dimensionality" of the DP method. ADP is also considered an effective method for the optimal control of nonlinear systems because it can obtain a closed-loop feedback control law for such systems.

This thesis mainly studies how to use the ADP principle to keep the gymnastic robot at its unstable upright equilibrium point. First, a critic (evaluation) network and an action (execution) network are constructed. The state variables of the robot system are the inputs of both networks, and the output of the action network is also an input of the critic network. The output of the critic network is used to estimate the cost function. The weights of the critic network are adjusted using the difference between this estimate and the cost function, and the adjusted critic network then drives the weight adjustment of the action network. The training error functions of both networks are based on Bellman's principle of optimality. In the swing-up phase, energy is used as the evaluation index: the torque is adjusted according to the energy, and its direction is kept the same as that of the angular velocity, until the energy is close to the maximum potential energy of the robot, that is, until the robot is near the upright position. During the swing-up, the torque is also made to decrease as the robot's potential energy increases.
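For orientation, the critic and action updates described above can be written in one form that is common in the ADHDP literature. This is a sketch under assumed notation, and the thesis's own error functions may differ: x(t) is the state, u(t) the action-network output, r(t) the reinforcement (utility) signal, J(t) the critic output, α a discount factor, U_c the desired ultimate cost, and l_c, l_a the learning rates.

```latex
% Assumed notation; one common ADHDP formulation, not necessarily the thesis's.
\begin{aligned}
u(t) &= \hat{u}\big(x(t);\, w_a\big), \qquad
J(t) = \hat{J}\big(x(t),\, u(t);\, w_c\big) \\
e_c(t) &= \alpha J(t) - \big[J(t-1) - r(t)\big], \qquad
E_c(t) = \tfrac{1}{2}\, e_c^2(t) \\
e_a(t) &= J(t) - U_c, \qquad
E_a(t) = \tfrac{1}{2}\, e_a^2(t) \\
w_c &\leftarrow w_c - l_c\, \frac{\partial E_c}{\partial w_c}, \qquad
w_a \leftarrow w_a - l_a\, \frac{\partial E_a}{\partial J}\,
\frac{\partial J}{\partial u}\, \frac{\partial u}{\partial w_a}
\end{aligned}
```

The critic error is a temporal-difference form of the Bellman recursion, and the action network is trained by backpropagating through the critic, which matches the description that the adjusted critic network drives the action network's weight update.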

The simulation experiments on swing and balance control showed two things. In the swing-up phase, a suitable torque must be selected: large enough to bring the robot to the upright position, but not excessive. In the balance phase, appropriate parameters must be selected so that the robot stays near the upright position for a certain amount of time.

Keywords: approximate dynamic programming; horizontal-bar gymnastic robot; ADHDP; BP neural network; reinforcement learning

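Chapter 1 Introduction

1.1 Development status of horizontal-bar gymnastic robots

1.2 Development status of approximate dynamic programming

1.3 Reinforcement learning

1.4 Research content

Chapter 2 Approximate dynamic programming methods

2.1 Basic concepts of dynamic programming

2.2 Basic principles of dynamic programming

2.3 Applications of the ADP method

2.4 Overview of ADHDP development

2.5 Basic principles of ADHDP

Chapter 3 Model analysis of the horizontal-bar gymnastic robot

3.1 Horizontal-bar gymnastic robot model

3.2 Two-joint horizontal-bar gymnastic robot model

Chapter 4 Building the critic and the actor with neural networks

4.1 BP neural network

4.2 Basic framework of the ADHDP controller

4.3 Implementation of the ADHDP algorithm

4.4 Training the critic network in ADHDP

4.5 Training the action network in ADHDP

Chapter 5 Simulation results and analysis of ADP-based swing and balance control of the horizontal-bar gymnastic robot

5.1 Simulation study of swing control of the horizontal-bar gymnastic robot

5.2 Simulation study of balance control of the horizontal-bar gymnastic robot

5.3 Simulation analysis

Chapter 6 Conclusion

References

Acknowledgements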

Chapter 1 Introduction

1.1 Development status of horizontal-bar gymnastic robots

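A horizontal-bar gymnastic robot is an underactuated system, that is, a system with fewer control inputs than degrees of freedom. The two-joint robot studied in this thesis is such a system: it has a single control input but two joints that must be controlled^[1].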

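Because an underactuated system has fewer actuators than a fully actuated one, it saves cost, lowers energy consumption and can also make the system more flexible, which is why such systems are widely used in industrial control. Fewer actuators, however, also make control design much harder, which makes research on underactuated systems all the more important. The two-joint robot studied here is difficult to control because its only control input, the control torque, must govern the motion of two joints. An understanding of this class of underactuated systems is therefore needed first.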
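The "one input, two joints" structure can be made concrete with the standard manipulator equation for a two-link robot. The notation below is an assumption for illustration: q_1 and q_2 are the joint angles, τ is the single control torque, and joint 1 (at the bar) is taken to be the passive joint while joint 2 is driven, which is the usual convention for this kind of robot but is not stated in the text above.

```latex
% Assumed notation: M inertia matrix, C Coriolis/centrifugal terms, G gravity terms.
% Assumption: joint 1 (at the bar) is passive, joint 2 is driven by \tau.
M(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q)
  = \begin{bmatrix} 0 \\ \tau \end{bmatrix},
\qquad
q = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix}
```

The zero in the first row of the input vector is what makes the system underactuated: the passive joint can only be influenced indirectly, through the dynamic coupling in M, C and G.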
