A study on rollout mechanisms in model-based reinforcement learning
How much should the model be used for rollouts at every step?
In which direction, forward or backward, is it more beneficial to roll out at every step?
With the insights gained from the first two objectives, can we design an automated rollout mechanism that takes full advantage of both properties, rollout amount and rollout direction?
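To make the two properties concrete, the following is a minimal sketch of forward and backward rollouts, not the project's actual implementation. It assumes a hypothetical deterministic tabular model of a 5-state chain; the names `model`, `predecessors`, `forward_rollout`, and `backward_rollout` are illustrative only. The `length` argument stands in for "how much" the model is used, and the choice of function stands in for the direction.

```python
import random
from collections import defaultdict

# Hypothetical deterministic tabular model of a 5-state chain (an assumption):
# model[(state, action)] = (reward, next_state); actions are -1 (left), +1 (right).
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)
model = {}
for s in range(N_STATES):
    for a in ACTIONS:
        s_next = min(max(s + a, 0), N_STATES - 1)  # walls clamp the move
        model[(s, a)] = (1.0 if s_next == GOAL else 0.0, s_next)

# Inverse index used by backward rollouts: next_state -> [(state, action), ...]
predecessors = defaultdict(list)
for (s, a), (_, s_next) in model.items():
    predecessors[s_next].append((s, a))


def forward_rollout(start, length, rng):
    """Simulate `length` model transitions forward from `start` (random policy)."""
    transitions, s = [], start
    for _ in range(length):
        a = rng.choice(ACTIONS)
        r, s_next = model[(s, a)]
        transitions.append((s, a, r, s_next))
        s = s_next
    return transitions


def backward_rollout(end, length, rng):
    """Simulate `length` model transitions leading into `end`, latest first."""
    transitions, s_next = [], end
    for _ in range(length):
        s, a = rng.choice(predecessors[s_next])  # sample a predecessor pair
        r, _ = model[(s, a)]
        transitions.append((s, a, r, s_next))
        s_next = s
    return transitions


rng = random.Random(0)
fw = forward_rollout(start=0, length=3, rng=rng)    # starts at state 0
bw = backward_rollout(end=GOAL, length=3, rng=rng)  # ends at the goal state
```

In this sketch a forward rollout generates experience about what follows a visited state, while a backward rollout generates experience about how a state (here the goal) can be reached. Either set of simulated transitions could then be fed to a value update.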
The set of benchmarks planned for the experiments
Grid world environments with discrete state and action spaces, in which the agent's goal is to reach the goal state. This set of environments is ideal for large-scale experimentation.
A more complex set of locomotion control problems with continuous state and action spaces. These problems will be used to verify the findings from the grid world environments.
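As an illustration of the first benchmark set, the grid world environments, here is a minimal sketch of a discrete-state, discrete-action environment in which the agent must reach a goal state. The class name `GridWorld`, the 4x4 layout, the four-action encoding, and the reward of 1.0 on reaching the goal are all assumptions made for this example, not details of the project's benchmarks.

```python
class GridWorld:
    """Minimal deterministic grid world (hypothetical layout, for illustration)."""

    # Discrete actions mapped to (row, column) offsets: up, down, left, right.
    ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}

    def __init__(self, size=4, start=(0, 0), goal=(3, 3)):
        self.size, self.start, self.goal = size, start, goal
        self.state = start

    def reset(self):
        """Return the agent to the start cell and report that state."""
        self.state = self.start
        return self.state

    def step(self, action):
        """Apply one action; moves into a wall leave the agent in place."""
        dr, dc = self.ACTIONS[action]
        row = min(max(self.state[0] + dr, 0), self.size - 1)
        col = min(max(self.state[1] + dc, 0), self.size - 1)
        self.state = (row, col)
        done = self.state == self.goal
        reward = 1.0 if done else 0.0  # assumed sparse goal reward
        return self.state, reward, done


env = GridWorld()
state = env.reset()
for action in (1, 1, 1, 3, 3, 3):  # three moves down, then three moves right
    state, reward, done = env.step(action)
```

Because the state and action sets are small and transitions are cheap to compute, many runs of an environment like this can be executed quickly, which is what makes this kind of benchmark convenient for large-scale experimentation.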
Supervisor
BEng(CS) final year student