Exploration is used in Q-learning because an agent that relies on blind exploitation becomes trapped in locally optimal policies. However, excessive exploration degrades the performance of Q-learning and it...
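
The trade-off between exploration and exploitation can be illustrated with a minimal sketch. It assumes an epsilon-greedy rule and a tabular Q-function. The excerpt does not say which exploration scheme the authors use, so the function names `epsilon_greedy` and `q_update` and the parameters `epsilon`, `alpha`, and `gamma` are hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index for one state (illustrative helper, not from the source).

    With probability epsilon the agent explores (uniform random action);
    otherwise it exploits the action with the highest current Q-value.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Apply the standard one-step Q-learning update to the table q[state][action]."""
    target = r + gamma * max(q[s_next])  # bootstrapped target from the next state
    q[s][a] += alpha * (target - q[s][a])
```

With `epsilon = 0` the agent only exploits and may never try an action whose value is currently underestimated. With `epsilon = 1` it only explores and never benefits from what it has learned. Intermediate or decaying values are one common way to balance the two.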