RAI Institute

Published: December 01, 2022

Full-time · 3+ yrs · Joined as employee #20, helped grow to 275+ · Cambridge, Massachusetts, United States

Sep 2023 – Present · On-site

Foundation Models & Infrastructure

Founded two research teams: Foundation Models (10+ researchers) and Capture (30 people)
Built a 320-GPU H100 cluster for large-scale training of video prediction and multimodal models
Created Theia, a vision foundation model distilling multiple pretrained models into a compact representation (CoRL 2024)
Built video prediction system using internet-scale video as prior (early 2023)
Designed multimodal architectures combining diverse sensor modalities with internet-scale pretrained priors

Data Quality & Scale

Lead the largest data collection effort at the institute with a roadmap for 100,000+ demonstrations
Built handheld force-based data collection system capturing force, vision, and proprioception
Created task definition frameworks and benchmark protocols for consistent, high-quality demonstrations
Established research partnerships with Google, Columbia, ETH Zurich, and Agile Robots

Novel RL & Learning Methods

Developed a gradient-free RL method enabling online learning with non-differentiable semantic reward functions (U.S. Patent pending, May 2025)
Created task definition framework that makes human demonstrations more learnable
Built ROS 2 interfaces and controllers for custom grippers leveraging novel force/torque sensors

Research & Publications

Published at CoRL 2024, IJCAI 2024, ICML 2023
Co-authoring “Elephants Don’t Write Sonnets” (MIT Press 2026) with Stefanie Tellex
Co-organized New England Manipulation Symposium (NEMS) 2025 at MIT

Dec 2022 – Aug 2023 · 9 mos

Joined as one of the first employees to build foundation models research capability from scratch
Developed early video prediction approach leveraging internet-scale video data
Pitched and launched Foundation Models team
Created benchmark protocol for multiple teams based on state-of-the-art manipulation datasets