RAI Institute

Full-time · 3+ yrs · Joined as employee #20, helped grow the institute to 275+ employees · Cambridge, Massachusetts, United States

Research Lead

Sep 2023 – Present · On-site

Foundation Models & Infrastructure

  • Founded two research teams: Foundation Models (10+ researchers) and Capture (30 people)
  • Built 320-GPU H100 cluster for large-scale training of video prediction and multimodal models
  • Created Theia, a vision foundation model distilling multiple pretrained models into a compact representation (CoRL 2024)
  • Built video prediction system using internet-scale video as prior (early 2023)
  • Designed multimodal architectures combining diverse sensor modalities with internet-scale pretrained priors

Data Quality & Scale

  • Lead the largest data collection effort at the institute with a roadmap for 100,000+ demonstrations
  • Built handheld force-based data collection system capturing force, vision, and proprioception
  • Created task definition frameworks and benchmark protocols for consistent, high-quality demonstrations
  • Established research partnerships with Google, Columbia, ETH Zurich, Agile Robots

Novel RL & Learning Methods

  • Developed gradient-free RL enabling online learning with non-differentiable semantic reward functions (U.S. Patent pending, May 2025)
  • Created task definition framework that makes human demonstrations more learnable
  • Built ROS 2 interfaces and controllers for custom grippers leveraging novel force/torque sensors

Research & Publications

  • Published at CoRL 2024, IJCAI 2024, ICML 2023
  • Co-authoring “Elephants Don’t Write Sonnets” (MIT Press 2026) with Stefanie Tellex
  • Co-organized New England Manipulation Symposium (NEMS) 2025 at MIT

Research Scientist

Dec 2022 – Aug 2023 · 9 mos

  • Joined as one of the first employees to build foundation models research capability from scratch
  • Developed early video prediction approach leveraging internet-scale video data
  • Pitched and launched Foundation Models team
  • Created benchmark protocol for multiple teams based on state-of-the-art manipulation datasets