SPIME Lab Inc.
Reinforcement Learning: An Interactive Tutorial
Discount Factor
Learning Rate
Explore vs. Exploit
Iterations
Solution:
Agent-Environment Interaction Steps:
Randomly pick a State.
Decide whether to Explore or Exploit.
Take an action to receive a Reward.
Update the Quality of action using the reward and past experience.
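The interaction steps above can be sketched as a small tabular Q-learning loop. This is only an illustration under stated assumptions, not the tutorial's own implementation: the function name `q_learning`, the example `REWARDS` matrix, the default parameter values, and the convention that an action means "move to that state" (with `None` marking a disallowed move) are all hypothetical. The parameters mirror the Discount Factor, Learning Rate, Explore vs. Exploit, and Iterations settings.

```python
import random

# Hypothetical 3-state example: REWARDS[s][a] is the reward for moving
# from state s to state a; None marks a move that is not allowed.
REWARDS = [
    [None, 0, None],
    [0, None, 100],
    [None, 0, 100],
]


def q_learning(rewards, discount=0.8, learning_rate=0.5,
               explore_prob=0.3, iterations=1000, seed=0):
    """Return a Q table learned with the four interaction steps."""
    rng = random.Random(seed)
    n = len(rewards)
    q = [[0.0] * n for _ in range(n)]

    def allowed(state):
        return [a for a in range(n) if rewards[state][a] is not None]

    for _ in range(iterations):
        # Step 1: randomly pick a state.
        state = rng.randrange(n)
        actions = allowed(state)
        if not actions:
            continue

        # Step 2: decide whether to explore (random action)
        # or exploit (best action known so far).
        if rng.random() < explore_prob:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q[state][a])

        # Step 3: take the action to receive a reward.
        # Assumption: the action index is also the next state.
        reward = rewards[state][action]
        next_state = action

        # Step 4: update the quality of the action using the reward
        # and past experience (the best Q value seen from next_state).
        best_next = max((q[next_state][a] for a in allowed(next_state)),
                        default=0.0)
        target = reward + discount * best_next
        q[state][action] += learning_rate * (target - q[state][action])

    return q


if __name__ == "__main__":
    for row in q_learning(REWARDS):
        print([round(value, 1) for value in row])
```

In this sketch a higher `explore_prob` makes the agent try random actions more often, while `discount` controls how much future rewards count relative to the immediate one.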
Rewards
Q Table