CONTINUAL REINFORCEMENT LEARNING FOR DEXTEROUS MANIPULATION

Introduction
Dexterous manipulation is a challenging robotics problem involving multi-fingered hands or end-effectors that must learn to grasp, reorient, and manipulate objects under complex contact dynamics. Reinforcement learning (RL) has shown strong potential for such tasks, but standard RL methods often struggle when new tasks, objects, or conditions are introduced over time. Continual reinforcement learning (CRL) offers a way to acquire new manipulation skills while retaining previously learned ones.
Objectives
· Study CRL for dexterous manipulation tasks.
· Compare conventional RL retraining with CRL approaches for sequential skill acquisition.
· Evaluate forgetting, transfer, and adaptation across multiple manipulation tasks, objects, or task conditions.
· Analyze whether CRL can improve robustness and sample efficiency in dexterous manipulation settings.
Tasks
1. Literature Review: Review the literature on RL, CRL, and dexterous manipulation.
2. Simulation Setup: Select or build a simulated dexterous manipulation environment with a sequence of manipulation tasks, such as grasping, reorientation, object handover, in-hand rotation, or tool use.
3. Baseline Models: Train standard RL policies on the individual tasks, then establish sequential baselines by retraining or fine-tuning a single policy across the task sequence.
4. CRL Methods: Implement and compare two or more CRL strategies (for example, regularization-based, replay-based, or architecture-based methods).
5. Evaluation: Measure task success rate, retention of previous skills, forward transfer to new tasks, adaptation speed, and sample efficiency. Additional analysis may include robustness to new objects, contact conditions, or sensory noise.
6. Optional Extension: Investigate one of the following:
· Language-conditioned manipulation, where the robot learns multiple hand skills from task instructions.
· Vision-language or foundation-model guidance for dexterous skill sequencing or reward design.
· Sim-to-real considerations for transferring continual manipulation policies to physical hardware.
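
The retention and forward-transfer measures named in the Evaluation task can be computed from a matrix of success rates recorded after each training stage. The sketch below is one common way to define them, not a prescribed protocol for this project. It assumes a hypothetical matrix R in which R[i][j] is the success rate on task j after training on tasks 0 through i, and a hypothetical vector baseline holding the success rate of an untrained policy on each task. The function name and the example numbers are illustrative only.

```python
import numpy as np


def continual_metrics(R, baseline):
    """Summarize a continual-learning run from a success-rate matrix.

    R[i, j]     : success rate on task j after training on tasks 0..i
                  (assumed layout, chosen for this sketch).
    baseline[j] : success rate of an untrained policy on task j (assumed).
    """
    R = np.asarray(R, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    num_tasks = R.shape[0]
    if R.shape != (num_tasks, num_tasks) or baseline.shape != (num_tasks,):
        raise ValueError("R must be T x T and baseline must have length T")
    if num_tasks < 2:
        raise ValueError("at least two tasks are needed")

    final = R[-1]  # performance on every task after the last training stage

    # Forgetting on task j: best success rate reached on j between the stage
    # where it was trained and the second-to-last stage, minus the final one.
    forgetting = np.array(
        [R[j:num_tasks - 1, j].max() - final[j] for j in range(num_tasks - 1)]
    )

    # Forward transfer to task j: success on j just before training on it,
    # relative to the untrained baseline.
    forward = np.array(
        [R[j - 1, j] - baseline[j] for j in range(1, num_tasks)]
    )

    return {
        "average_final_success": float(final.mean()),
        "average_forgetting": float(forgetting.mean()),
        "average_forward_transfer": float(forward.mean()),
    }


if __name__ == "__main__":
    # Made-up numbers for three tasks, for illustration only.
    example_R = [
        [0.9, 0.2, 0.1],
        [0.6, 0.8, 0.3],
        [0.5, 0.7, 0.9],
    ]
    example_baseline = [0.0, 0.1, 0.1]
    print(continual_metrics(example_R, example_baseline))
```

Keeping the full matrix, rather than only the final row, lets the same logged data support both the retention and the transfer analysis, and the same function can be applied unchanged to the fine-tuning baseline and to each CRL method for a like-for-like comparison.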
Prerequisites
Proficiency in Python, familiarity with reinforcement learning and machine learning, and an interest in robotics or dexterous manipulation. Experience with simulation environments is a plus.
Work
20% Theory, 60% Programming/Simulations, 20% Writing
Contact
Ali Sabzi Khoshraftar (a.sabzikhoshraftar@utwente.nl)