Description
Large foundation models (LFMs), spanning language, vision, and multimodal systems, are revolutionizing interactive robot learning. These models enable robots to learn from human interactions, environments, and other robots, advancing capabilities such as path planning, grasping, walking, swarm coordination, and manipulation. LFMs hold the potential to enhance robot autonomy in dynamic and unstructured environments, making significant strides in applications such as autonomous vehicles, AI-driven manufacturing, healthcare robotics, and advanced air mobility. This workshop will focus on cutting-edge research that explores the integration of LFMs into robotic systems, addressing both their potential and the challenges they pose.
Plans to Encourage Participation
To encourage participation, we will organize interactive breakout sessions in which senior researchers and early-career participants collaborate on specific subtopics, bringing diverse perspectives to the use of LFMs in robot learning. The workshop will also feature a mentorship program pairing early-career researchers with senior experts, fostering collaboration and knowledge exchange. We will offer online participation options to ensure accessibility for those unable to attend in person. Special efforts will be made to promote diversity, with targeted outreach to underrepresented groups through women-in-robotics organizations and other diversity-focused research networks.
Discussion Topics
- Enhancing robot learning from multimodal data using LFMs.
- LFM applications in path planning, manipulation, and swarm robotics.
- Multi-robot coordination and advanced teaming strategies.
- LFMs in real-world applications (self-driving, manufacturing, and healthcare).
- Integration of LFMs with robotic hardware for perception and control.
- Ethical and safety concerns in deploying LFMs in critical environments.
Online Participation
The event will include hybrid participation, with live streaming and interactive Q&A sessions for virtual participants. A Slack/Discord channel will also be set up for continued discussion before, during, and after the workshop.
Important Dates
Submission Deadline: May 18, 2025
Notification: May 25, 2025
Submission Contents and Format
This workshop welcomes research reports on preliminary results, recently submitted drafts, and recently published work. The page limit is two pages, including references. Submissions must follow the standard IEEE conference format. Reviews will be conducted by the organizing committee together with invited domain scholars. Please email your draft to [email protected] with the subject line "submission for RSS workshop". Authors will be notified of decisions by the notification date.
Invited Speakers
Sergey Levine
UC Berkeley
https://rail.eecs.berkeley.edu/
Talk: "Robotic Foundation Models"
Sergey Levine received a BS and MS in Computer Science from Stanford University in 2009, and a Ph.D. in Computer Science from Stanford in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as other decision-making domains. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more.
Yilun Du
Harvard & Google DeepMind
https://yilundu.github.io/
Talk: "Learning Compositional Models of the World"
Yilun Du is an Assistant Professor at Harvard University in the Kempner Institute and the Department of Computer Science. He received his Ph.D. from MIT EECS and was previously a senior research scientist at Google DeepMind, with earlier research experience at OpenAI and FAIR. His research focuses on generative models, decision-making, robot learning, and embodied agents, with a particular interest in developing intelligent embodied agents that operate in the physical world.
Brian Ichter
Physical Intelligence
https://www.physicalintelligence.company/
Talk: "Large Language Models for Learning"
Brian Ichter is a co-founder of Physical Intelligence (Pi), a company focused on integrating general-purpose AI into the physical world. His interests lie in leveraging machine learning and large-scale models to enable robots to perform general tasks in real-world environments.
Chuchu Fan
MIT
https://aeroastro.mit.edu/realm/
"Robot Planning with Natural Language Inputs and LLMs"
Dr. Chuchu Fan is an Associate Professor in the Department of Aeronautics and Astronautics at the Massachusetts Institute of Technology (MIT), where she leads the Reliable Autonomous Systems Lab (REALM). Her research integrates formal methods, control theory, and machine learning to ensure the safety and reliability of autonomous systems. Dr. Fan has explored the application of Large Language Models (LLMs) in complex planning tasks, enhancing the decision-making capabilities of autonomous agents.
Amanda Prorok
Cambridge
https://proroklab.org/
"LLM for multi-robot coordination"
Dr. Amanda Prorok is a Professor of Collective Intelligence and Robotics in the Department of Computer Science and Technology at the University of Cambridge, where she leads the Prorok Lab. Her research focuses on developing coordination strategies for multi-agent and multi-robot systems, integrating machine learning, planning, and control methodologies. Dr. Prorok has made significant contributions to the field, including pioneering methods for differentiable communication between learning agents and advancing algorithms for cooperative perception and coordinated path planning. Her work has broad applications in automated transport, logistics, and environmental monitoring.
Deepak Pathak
CMU
https://www.ri.cmu.edu/robotics-groups/pathak-research-group/
"Large-Scale Pretraing and Continual Adaptation in Robot Learning"
Dr. Deepak Pathak is the Raj Reddy Assistant Professor in the School of Computer Science at Carnegie Mellon University. He is a member of the Robotics Institute and affiliated with the Machine Learning Department. Dr. Pathak's research spans artificial intelligence at the intersection of computer vision, machine learning, and robotics, focusing on large-scale pretraining and continual adaptation.
Jason Ma
UPenn
https://jasonma2016.github.io/
Talk: "Foundation Model Supervision for Robot Learning"
Jason Ma is a final-year Ph.D. student at the University of Pennsylvania's GRASP Laboratory, advised by Professors Dinesh Jayaraman and Osbert Bastani. His research focuses on reinforcement learning and robot learning, with a particular interest in training and deploying foundation models for robotics and embodied agents.
Fei Miao
UConn
http://feimiao.org/
"Multi-agent reinforcement learning and LLM for Embodied AI"
Dr. Fei Miao is the Pratt & Whitney Associate Professor in the School of Computing at the University of Connecticut, with a courtesy appointment in Electrical & Computer Engineering. She leads research at the intersection of control theory, machine learning, and game theory, focusing on the safety, efficiency, and security of cyber-physical systems, particularly in autonomous vehicles and intelligent transportation.
Wen Sun
Cornell
https://wensun.github.io/
Talk: "Imitation Learning and Reinforcement Learning via Diffusion Models"
Dr. Wen Sun is an Assistant Professor in the Computer Science Department at Cornell University, leading the Reinforcement Learning group. His research focuses on developing novel reinforcement learning algorithms with applications in real-world problems. Dr. Sun completed his Ph.D. at Carnegie Mellon University's Robotics Institute and was a postdoctoral researcher at Microsoft Research NYC.
Shuran Song
Stanford
https://real.stanford.edu/
Talk: "Scalable Data Collection for Robot Foundation Models"
Dr. Shuran Song is an Assistant Professor of Electrical Engineering at Stanford University, where she leads the Robotics and Embodied AI Lab (REAL). Her research lies at the intersection of computer vision and robotics, aiming to develop algorithms that enable intelligent systems to learn from interactions with the physical world.
Yunzhu Li
Columbia
https://yunzhuli.github.io/
"Foundation Models for Robotic Manipulation: Opportunities & Challenges"
Dr. Yunzhu Li is an Assistant Professor of Computer Science at Columbia University. His research focuses on incorporating foundation models into robotic manipulation, aiming to enhance task specification and scene modeling in robotics. Dr. Li completed his Ph.D. at MIT and was a postdoctoral researcher at Stanford's Vision and Learning Lab.
Jesse Thomason
USC
https://glamor-usc.github.io/
"LLM for robot learning"
Dr. Jesse Thomason is an Assistant Professor at the University of Southern California, leading the GLAMOR (Grounding Language in Actions, Multimodal Observations, and Robots) Lab. His research integrates natural language processing and robotics to connect language to the world, focusing on language grounding and lifelong learning through interaction.
Beomjoon Kim
Korea Advanced Institute of Science and Technology (KAIST)
https://hugelab.org/
"Hierarchical and modular neural network for manipulation skill discovery"
Dr. Beomjoon Kim is an Associate Professor in the Graduate School of AI at the Korea Advanced Institute of Science and Technology (KAIST). He directs the Humanoid Generalization (HuGe) Lab, focusing on creating general-purpose humanoids capable of efficient decision-making in complex environments.
Lawson Wong
Northeastern
https://www.ccs.neu.edu/home/lsw/grail.html
"Sense-Plan-Act with Foundation Models"
Dr. Lawson L.S. Wong is an Assistant Professor in the Khoury College of Computer Sciences at Northeastern University, based in Boston. He leads the Generalizable Robotics and Artificial Intelligence Laboratory (GRAIL), focusing on learning, representing, and estimating knowledge about the world that autonomous robots can utilize.
Roberto Martín-Martín
UT Austin
https://robin-lab.cs.utexas.edu/
"Mixed-Initiative LLM-powered Dialogue for Collaborative Human-Robot Mobile Manipulation Tasks"
Dr. Roberto Martín-Martín is an Assistant Professor of Computer Science at the University of Texas at Austin. His research integrates robotics, computer vision, and machine learning, focusing on enabling robots to operate autonomously in human-centric, unstructured environments such as homes and offices.
Tentative Schedule
8:30 - 9:00 Opening Remarks
9:00 - 10:00 Group 1: LLM for Robot Reasoning
10:00 - 10:30 Coffee break
10:30 - 11:30 Group 2: LLM for Human Engagement
11:30 - 12:30 Group 3: LLM for Scalable Collaboration
12:30 - 14:00 Lunch break
14:00 - 15:00 Group 4: LLM for Interaction
15:30 - 16:00 Coffee break
16:00 - 17:00 Panel Discussion: Bridging Theory and Practice
17:00 - 17:30 Spotlight Session
17:30 - 18:30 Open Discussion
18:30 Latest end time
Organizing Committee
Rui Liu, Ph.D.
Assistant Professor, College of Aeronautics and Engineering, Kent State University, Ohio, USA
[email protected]
https://ruiliurobotics.weebly.com/
Carlo Pinciroli, Ph.D.
Associate Professor, Department of Robotics Engineering, Worcester Polytechnic Institute
[email protected]
https://carlo.pinciroli.net/
Changjoo Nam, Ph.D.
Associate Professor, Department of Electronic Engineering, Sogang University, Seoul, South Korea
[email protected]
https://sites.google.com/site/changjoonam/
Wenhao Luo, Ph.D.
Assistant Professor, Department of Computer Science, University of Illinois Chicago, IL, USA
[email protected]
https://www.cs.uic.edu/~wenhao/
Xiaoli Zhang, Ph.D.
Associate Professor, Department of Mechanical Engineering, Colorado School of Mines, Colorado, USA
[email protected]
https://xzhanglab.mines.edu/
Jiaoyang Li, Ph.D.
Assistant Professor, Robotics Institute, Carnegie Mellon University, PA, USA
[email protected]
https://jiaoyangli.me/
Anqi Li, Ph.D.
Research Scientist, NVIDIA, California, USA
[email protected]
https://anqili.github.io/