MASTER Assignment
Towards reducing the sample complexity of a model-free reinforcement learning agent controlling a single-segment tendon-driven continuum manipulator
Type: Master MSc
Period: Jan 2019 – Sep 2020
Student: Hendriks, K.J.H. (Kasper, MSc student)
Date final project: September 24, 2020
Supervisors:
Abstract:
This work outlines an end-to-end process for developing a practically viable reinforcement learning controller, based on the soft actor-critic algorithm, by reducing its sample complexity. A tendon-driven continuum manipulator is fabricated and then modelled with a non-linear autoregressive exogenous (NARX) neural network. This model is used both to generate a student policy that imitates expert behaviour and to train a model-free agent in simulation. The simulated agent's policy and the student policy are then used to initialise a model-free learner, with the intent of reducing sample complexity by letting the agent fine-tune an already competent policy rather than learn from scratch. The effectiveness of these methods is evaluated by comparing performance as a function of learning time against that of an agent trained without any prior knowledge. Results indicate that, while the manipulator is able to learn a reaching task, the sparse coverage of the state space in the student policy and the inaccuracies of the model used to train the simulated agent lead to performance that is similar to or worse than the baseline for a given number of training steps.
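
As a rough illustration of the modelling step, the sketch below shows what a minimal NARX-style dynamics model could look like in PyTorch: the next manipulator state is predicted from a short history of past states (the autoregressive terms) and tendon commands (the exogenous inputs). The state and action dimensions, lag depth, and layer sizes are illustrative assumptions, not the configuration used in the thesis.

import torch
import torch.nn as nn

class NARXModel(nn.Module):
    """Minimal NARX-style dynamics model (illustrative sizes):
    predicts the next manipulator state from a short history of
    past states (autoregressive terms) and tendon commands
    (exogenous inputs)."""

    def __init__(self, state_dim=3, action_dim=4, lag=2, hidden=64):
        super().__init__()
        in_dim = lag * (state_dim + action_dim)
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state_hist, action_hist):
        # state_hist: (batch, lag, state_dim); action_hist: (batch, lag, action_dim)
        x = torch.cat([state_hist.flatten(1), action_hist.flatten(1)], dim=1)
        return self.net(x)  # predicted next state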
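
The initialisation idea can be sketched in the same vein: the student policy is fitted to expert state-action pairs by simple regression (behaviour cloning), after which its weights are copied into the actor of the model-free learner so that real-world training starts from a competent policy. The function names, loss choice, and the assumption that the actor's mean network shares the student's architecture are all illustrative, not taken from the thesis.

import torch
import torch.nn as nn

def behaviour_clone(student: nn.Module, states, expert_actions,
                    epochs=200, lr=1e-3):
    # Fit the student policy to expert (state, action) pairs by
    # plain least-squares regression (behaviour cloning).
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(student(states), expert_actions).backward()
        opt.step()
    return student

# Warm-start (hypothetical): copy the cloned weights into the
# model-free agent's actor so subsequent training fine-tunes an
# already competent policy rather than restarting from scratch.
# sac_actor.load_state_dict(student.state_dict())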