Efficient Continual Imitation Learning with Online Meta-Adapters

King's College London

Abstract

Continual adaptation is essential for general-purpose autonomous agents. For example, a household robot pretrained with a repertoire of skills must still learn and adapt to unseen tasks specific to each household. However, prior work mainly emphasizes either effective pretraining of decision-making models or single-task adaptation. Recognizing this, and building on parameter-efficient fine-tuning in language models, recent works have explored lightweight adapters for adapting pretrained policies; these preserve the features learned during pretraining and achieve good adaptation performance. However, such approaches treat each task in isolation and overlook the underlying relationships between new and prior tasks, limiting knowledge transfer. In this paper, we propose Online Meta-Adapters (OMA) for continual imitation learning. Rather than applying adapters directly, OMA employs a meta-learning objective to capture transferable priors from prior tasks, thereby accelerating adaptation to new tasks. Extensive experiments in both simulated and real-world environments demonstrate that OMA achieves better adaptation performance than baseline methods.
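The abstract does not spell out the meta-learning objective, so the following is only a minimal sketch of one way a meta-learned adapter initialization could work: a Reptile-style outer loop that, after fine-tuning the adapter on each new task, nudges a shared meta-initialization toward the adapted parameters. All function names and the toy gradient are hypothetical, not OMA's actual implementation.

```python
# Reptile-style meta-update over adapter parameters (illustrative only;
# OMA's actual objective may differ). Adapter params are plain floats here.

def adapt(theta, task_grad_fn, lr=0.1, steps=5):
    """Inner loop: fine-tune adapter params on one task's demonstrations."""
    theta = list(theta)
    for _ in range(steps):
        grad = task_grad_fn(theta)
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

def meta_update(meta_theta, adapted_theta, meta_lr=0.5):
    """Outer loop: move the meta-initialization toward the adapted params."""
    return [m + meta_lr * (a - m) for m, a in zip(meta_theta, adapted_theta)]

# Toy stand-in for a task: its loss pulls the adapter toward a target vector,
# so the gradient of the squared error is 2 * (theta - target).
def make_task(target):
    return lambda theta: [2.0 * (t, c)[0] - 2.0 * c for t, c in zip(theta, target)]

meta = [0.0, 0.0]
for target in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]):  # stream of new tasks
    adapted = adapt(meta, make_task(target))          # per-task adaptation
    meta = meta_update(meta, adapted)                  # accumulate a prior
```

After the three tasks, `meta` sits between the task optima, so the next adaptation starts closer to any of them than a cold-started adapter would.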

Key Results

OMA is evaluated on the LIBERO continual-adaptation suites and on real-robot tasks with 20 demonstrations. In simulation, OMA consistently outperforms adapter-based baselines, including L2M, TAIL, and a multi-task adapter baseline. The experiments also show that OMA remains effective across different demonstration counts, LoRA ranks, and policy architectures.

- +17% average simulation improvement over prior adapter-based baselines.
- 49.5% LIBERO-10 success rate in the ablation study, above the random task- and support-selection variants.
- +19% average real-robot improvement over TAIL across the tested demonstration settings.
[Figure] Continual adaptation results on LIBERO-OBJECT, LIBERO-SPATIAL, LIBERO-GOAL, and LIBERO-10.
[Figure] OMA consistently improves over TAIL when varying the number of demonstrations and the LoRA adapter rank.
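Both TAIL and OMA adapt the frozen pretrained policy through low-rank (LoRA) adapters, whose rank is one of the factors varied above. As a reference point, here is a minimal sketch of a rank-r LoRA update on a frozen weight matrix; the helper names are hypothetical and the exact parameterization used by TAIL/OMA may differ.

```python
# Minimal LoRA-style adapter on a frozen weight matrix (illustrative sketch).

def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_forward(x, W, A, B, alpha=1.0):
    """y = x @ (W + (alpha / r) * A @ B): frozen W plus a rank-r trainable update."""
    r = len(A[0])                       # rank = inner dimension of the factors
    delta = matmul(A, B)                # (d_in x r) @ (r x d_out) -> (d_in x d_out)
    scale = alpha / r
    W_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, delta)]
    return matmul(x, W_eff)

# Toy check: with B zero-initialized (standard LoRA practice), the adapter
# is a no-op at the start of adaptation, so the pretrained policy is preserved.
W = [[1.0, 0.0], [0.0, 1.0]]            # frozen 2x2 pretrained weight
A = [[0.5], [0.5]]                      # rank-1 factor, shape (2 x 1)
B = [[0.0, 0.0]]                        # rank-1 factor, shape (1 x 2), zero-init
x = [[2.0, 3.0]]
y0 = lora_forward(x, W, A, B)           # equals x @ W while B is zero
```

During adaptation only A and B (2r * d parameters instead of d^2) are trained, which is what makes per-task adapters cheap enough to maintain continually.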

Real-Robot Evaluation

The real-world experiments use a Kinova robotic arm with two RealSense RGB-D camera views. The policy is pretrained on five tasks and then continually adapted to five new tasks, testing whether OMA can transfer reusable manipulation knowledge under controlled distribution shifts.

[Figure] Real-world setup and objects used for the continual adaptation experiments.

Rollout Samples

[Figure] Example real-robot rollouts from policies adapted with OMA.
Real-robot success rates by number of demonstrations:

Method   20 demos   40 demos
OMA       38.0%      70.0%
TAIL      32.0%      58.0%