RAI Institute
Full-time · 3+ yrs
Cambridge, Massachusetts, United States
Joined as employee #20; helped grow the institute to 275+ employees
Research Lead
Sep 2023 – Present · On-site
Foundation Models & Infrastructure
- Founded two research teams: Foundation Models (10+ researchers) and Capture (30 people)
- Built 320-GPU H100 cluster for large-scale training of video prediction and multimodal models
- Created Theia, a vision foundation model distilling multiple pretrained models into a compact representation (CoRL 2024)
- Built video prediction system using internet-scale video as prior (early 2023)
- Designed multimodal architectures combining diverse sensor modalities with internet-scale pretrained priors
Data Quality & Scale
- Lead the largest data collection effort at the institute with a roadmap for 100,000+ demonstrations
- Built handheld force-based data collection system capturing force, vision, and proprioception
- Created task definition frameworks and benchmark protocols for consistent, high-quality demonstrations
- Established research partnerships with Google, Columbia, ETH Zurich, Agile Robots
Novel RL & Learning Methods
- Developed gradient-free RL enabling online learning with non-differentiable semantic reward functions (U.S. Patent pending, May 2025)
- Created task definition framework that makes human demonstrations more learnable
- Built ROS 2 interfaces and controllers for custom grippers leveraging novel force/torque sensors
Research & Publications
- Published at CoRL 2024, IJCAI 2024, ICML 2023
- Co-authoring “Elephants Don’t Write Sonnets” (MIT Press 2026) with Stefanie Tellex
- Co-organized New England Manipulation Symposium (NEMS) 2025 at MIT
Research Scientist
Dec 2022 – Aug 2023 · 9 mos
- Joined as one of the first employees to build foundation models research capability from scratch
- Developed early video prediction approach leveraging internet-scale video data
- Pitched and launched Foundation Models team
- Created benchmark protocol for multiple teams based on state-of-the-art manipulation datasets
