LM101-062: How to Transform a Supervised Learning Machine into a Value Function Reinforcement Learning Machine
Listen now
Description
This 62nd episode of Learning Machines 101 (www.learningmachines101.com)  discusses how to design reinforcement learning machines using your knowledge of how to build supervised learning machines! Specifically, we focus on Value Function Reinforcement Learning Machines which estimate the unobservable total penalty associated with an episode when only the beginning of the episode is observable. This estimated Value Function can then be used by the learning machine to select a particular action in a given situation to minimize the total future penalties that will be received. Applications include: building your own robot, building your own automatic aircraft lander, building your own automated stock market trading system, and building your own self-driving car!!
More Episodes
This 86th episode of Learning Machines 101 discusses the problem of assigning probabilities to a possibly infinite set of observed outcomes in a space-time continuum which corresponds to our physical world. The machine learning algorithm uses information about the frequency of environmental...
Published 07/20/21
Published 07/20/21
This 85th episode of Learning Machines 101 discusses formal convergence guarantees for a broad class of machine learning algorithms designed to minimize smooth non-convex objective functions using batch learning methods. Simple mathematical formulas are presented based upon research from the late...
Published 05/21/21