
WindyGridWorldQLearning

Published: 2013-04-19  File size: 2 KB
Download points: 1  Downloads: 31

Code description:

  Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
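
  The package's own MATLAB source is not shown on this page, so the following is only a minimal Python sketch of tabular Q-learning on a windy gridworld, not the author's code. The 7x10 grid, the per-column wind strengths, the start and goal cells, and the parameters `alpha`, `gamma`, and `epsilon` are all assumptions taken from the textbook windy gridworld example; the names `step`, `q_learning`, and `greedy_path_length` are hypothetical.

```python
import random

# Assumed layout (textbook windy gridworld); the package's actual grid,
# wind, and parameters are not given on this page.
ROWS, COLS = 7, 10
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]          # upward push per column
START, GOAL = (3, 0), (3, 7)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right


def step(state, action):
    """Apply the move plus the wind of the current column; reward is -1 per step."""
    r, c = state
    dr, dc = ACTIONS[action]
    r = min(max(r + dr - WIND[c], 0), ROWS - 1)  # wind moves the agent toward row 0
    c = min(max(c + dc, 0), COLS - 1)
    nxt = (r, c)
    return nxt, -1.0, nxt == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        s, done = START, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))                        # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])    # exploit
            s2, reward, done = step(s, a)
            # Off-policy target: reward plus the best value of the next state.
            target = reward + (0.0 if done else gamma * max(q[s2]))
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


def greedy_path_length(q, limit=1000):
    """Follow the greedy policy from START; returns `limit` if the goal is not reached."""
    s, n = START, 0
    while s != GOAL and n < limit:
        a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        s, _, _ = step(s, a)
        n += 1
    return n
```

  Calling `greedy_path_length(q_learning())` gives the number of steps the learned greedy policy takes from the start cell. The update inside the loop changes one action-value per step, which is the single-entry case the description refers to.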


  • hht_toolbox_20040808
    Provides the complete HHT toolbox code for reference and study.
    Downloaded: 2009-11-03 20:12:18
    Points: 1
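  • matlabtosolve
    A MATLAB speaker recognition program that trains and tests on speech files using the BP, PNN, SOM, RBF, and LVQ algorithms, with good results. Notes on the bprengong program: the data is split into a training part and a testing part. The program has two parts; the first computes the recognition model. Variable v is the vector produced by MFCC processing. Because the recordings may differ in length, they are truncated together. Each row of p represents one speech sample (15 in total). Variable Pr holds the maximum and minimum of each row. Variable T is the target. There are 15 output neurons. During training, if the input training sample has class label i (the label of the speech data), the expected output of node i is set to 1 and the expected output of every other node is set to 0. During recognition, when a sample of unknown class is applied to the input, the outputs of the output nodes are examined and the sample is assigned to the class of the node with the largest output.
    Downloaded: 2008-04-16 16:07:34
    Points: 1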
  • thesissimCDMAOFDM
    A MATLAB simulation program for OFDM. A very good introductory program and a helpful guide to understanding OFDM systems.
    Downloaded: 2007-01-11 16:28:54
    Points: 1
  • linefuzzycontrol
    Shows how to generate fuzzy control rules and look up control quantities in MATLAB.
    Downloaded: 2010-09-16 10:37:30
    Points: 1
  • 201003tecplot
    Description: Briefly introduces a model of a satellite attitude control system. For that model, it analyzes the characteristics of fault occurrence, the difficulties of fault diagnosis, and the possible fault locations and fault types, and gives fault detection methods.
    Downloaded: 2010-03-23 22:02:43
    Points: 1
  • PLLSim
    Description: MATLAB simulation code for a second-order phase-locked loop. It takes two input signals and a signal-to-noise ratio and outputs the signal after phase lock. It can simulate an initial frequency offset and a frequency ramp.
    Downloaded: 2006-03-21 21:06:29
    Points: 1
  • matlab7ju
    MATLAB code for the seven moments. Confirmed to work; already tested.
    Downloaded: 2010-05-31 17:09:38
    Points: 1
  • mm1
    A program that implements an mm1 queuing system; hopefully it is helpful.
    Downloaded: 2012-06-17 20:56:04
    Points: 1
  • IEEE802.15.4a
    The IR-UWB channel model, implemented as a MATLAB simulation.
    Downloaded: 2014-10-10 20:05:38
    Points: 1
  • garbortexture
    Gabor filtering for texture feature extraction. It works well and is worth trying.
    Downloaded: 2015-03-12 20:48:05
    Points: 1