WindyGridWorldQLearning

Published: 2013-04-19  File size: 2 KB
Download points: 1  Downloads: 31

Code description:

  Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states. This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989). We show that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely. We also sketch extensions to the cases of non-discounted, but absorbing, Markov environments, and where many Q values can be changed each iteration, rather than just one.
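  The package's MATLAB source is not shown on this page, so the following is only a minimal Python sketch of tabular Q-learning on a windy gridworld, not the uploaded code. It assumes the standard layout from Sutton and Barto's textbook example (7x10 grid, per-column upward wind, reward of -1 per step, four moves). The names `step`, `q_learning` and `greedy_path_length`, and the values of `alpha`, `gamma` and `epsilon`, are illustrative choices. The epsilon-greedy action choice is one simple way to keep sampling every action in every state, as the convergence result above requires.

```python
import random

# Assumed layout: the standard windy gridworld (Sutton & Barto, Example 6.5).
# The grid actually used by the package is not shown on this page.
ROWS, COLS = 7, 10
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]          # upward push per column
START, GOAL = (3, 0), (3, 7)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right


def step(state, action):
    """Apply the move plus the wind of the current column; reward is -1 per step."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row = min(max(row + d_row - WIND[col], 0), ROWS - 1)
    new_col = min(max(col + d_col, 0), COLS - 1)
    next_state = (new_row, new_col)
    return next_state, -1.0, next_state == GOAL


def q_learning(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))    # explore
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # One-step update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_path_length(q, max_steps=1000):
    """Follow the greedy policy from START; return the step count, or None if it never reaches GOAL."""
    state, steps, done = START, 0, False
    while not done and steps < max_steps:
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, done = step(state, action)
        steps += 1
    return steps if done else None
```

  Calling `greedy_path_length(q_learning())` reports how many steps the learned greedy policy needs. On this assumed layout the shortest possible route takes 15 steps, so a well-trained table should approach that figure.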
