Example of synthesized video and reasoning data for training SOLE-R1.
The videos are grouped into levels according to how much of the task is completed by the end of the video, and the amount of non-expert behavior increases as the level decreases. For each task, the highest-level trajectories show full task completion with near-expert behavior, while level 1 trajectories do not achieve any part of the task.
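To make the level scheme concrete, the following is a minimal sketch of one way such a grouping could be implemented. It assumes that task progress is measured as a count of completed subgoals and that each additional subgoal raises the level by one; the source does not state how progress is measured or how many levels exist. The names `Trajectory`, `assign_level`, and `group_by_level`, and the task name `stack_blocks`, are hypothetical and are not taken from SOLE-R1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trajectory:
    """Hypothetical record for one synthesized video (not from the paper)."""
    task: str
    completed_subgoals: int  # subgoals achieved by the end of the video
    total_subgoals: int      # subgoals required for full task completion


def assign_level(traj: Trajectory) -> int:
    """Map task progress to a level, ASSUMING one level per completed subgoal.

    Level 1 means no part of the task was achieved; the highest level
    (total_subgoals + 1) means the task was fully completed.
    """
    if not 0 <= traj.completed_subgoals <= traj.total_subgoals:
        raise ValueError("completed_subgoals must lie in [0, total_subgoals]")
    return 1 + traj.completed_subgoals


def group_by_level(trajs):
    """Group trajectories by level, returned in ascending level order."""
    levels = {}
    for traj in trajs:
        levels.setdefault(assign_level(traj), []).append(traj)
    return dict(sorted(levels.items()))


if __name__ == "__main__":
    demo = [
        Trajectory("stack_blocks", 3, 3),  # full completion
        Trajectory("stack_blocks", 0, 3),  # no progress
        Trajectory("stack_blocks", 1, 3),  # partial progress
    ]
    for level, items in group_by_level(demo).items():
        print(level, [t.completed_subgoals for t in items])
```

Under this assumption a task with three subgoals yields four levels, which matches the two endpoints described above: level 1 for no progress and the top level for full completion. A scheme based on a continuous completion fraction would need explicit thresholds instead.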