Principal Investigators
Inclusive Dates 
09/02/2024 to 01/02/2025

Background

Recent years have seen rapid advancements in robotic hardware and accelerated computing, enabling robot capability to approach more general purpose applications. Traditional approaches for programming industrial robots tend to struggle when faced with applications that require high dexterity and generalizability. In this work, SwRI explored a novel policy learning approach using diffusion models to generate dexterous robot behaviors learned from human demonstrations. This approach represents a fundamental shift in how we will program robots in the future and provides an opportunity for highly dexterous and adaptable robot capability with reduced development effort.

Approach

The goal of this effort was to investigate the feasibility of deploying diffusion policy in industry. We trained, deployed, and evaluated learned robot polices on two tasks: 1) cup arrangement, and 2) industrial assembly. The cup arrangement task included simple manipulations with limited range of variability, but served as a testbed to understand practical limitations and underlying learning mechanisms of the approach. The assembly task required more fine-grained manipulations and was designed to stress the policy capability. The underlying policy learning framework used a diffusion model to generate robot trajectories directly from input camera observations, deployed online to enable closed-loop behavior. Training data consisted of trajectories demonstrated by a human using a matched-embodiment data collection rig (Fig. 1). We deployed the trained policies and evaluated success rates on each task with real robot hardware.

Four photos laid out in a grid with left photo showing example data collection and right photos showing robotic setup

Figure 1: Example data collection (left) and robotic setup (right) for both the cup arrangement (top) and assembly (bottom) tasks.

Accomplishments

Overall, the policy was extremely successful on the cup arrangement task, with some success on the industrial assembly task. Key strengths of the deployed policies included adaptability to visual and spatial task variations, recovery from failure, robustness to disturbances, and ability to ignore distractor objects.

This work allowed us to sufficiently explore this approach and identify the key limitations regarding practical implementation. We believe there is a path to resolving these limitations, and, if realized, has the potential to greatly increase the dexterity and generalizability of our current robotic technology offerings, while simultaneously reducing development effort.