SOLE-R1: Video-Language Reasoning as the Sole Reward for On-Robot RL

Code, Models, & Data

We are continuing to add more examples to this site. The example videos below show SOLE-R1's frame-level reasoning and progress predictions in the zero-shot online RL experiments.
To illustrate the outputs of our video and reasoning synthesis approach, which we use to generate SOLE-R1 training data, we also provide video demonstrations at: https://sole-r1.github.io/synthesized_video_reasoning_examples/