Data Collection
The complete recording workflow for the LinkerBot O6. Leader-follower teleoperation, LeRobot dataset format, episode quality checklist, and links to the broader data collection pipeline.
Recording Workflow for the O6
The complete process, from hardware setup to your first recorded episode. Follow the steps in order.
Verify hardware & CAN interface
Confirm the O6 is mounted, powered on, and the CAN interface is up. Run candump can0 and verify motor heartbeat packets appear before continuing.
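As a rough aid for this check, the sketch below counts frames per CAN ID in captured candump output. It assumes candump's default line format (interface, hex ID, [length], data bytes). The sample IDs are made up, and the source does not say which IDs carry the O6 heartbeat, so the function only reports what it sees.

```python
import re
from collections import Counter

# Assumed default candump line format:
#   "  can0  101   [8]  00 01 02 03 04 05 06 07"
_FRAME_RE = re.compile(r"^\s*(\S+)\s+([0-9A-Fa-f]+)\s+\[(\d+)\]")


def count_frames_by_id(lines, interface="can0"):
    """Return a Counter mapping CAN ID (int) to the number of frames seen."""
    counts = Counter()
    for line in lines:
        match = _FRAME_RE.match(line)
        if match and match.group(1) == interface:
            counts[int(match.group(2), 16)] += 1
    return counts


# Hypothetical capture; real IDs depend on the O6 firmware.
sample = [
    "  can0  101   [8]  00 01 02 03 04 05 06 07",
    "  can0  102   [8]  00 00 00 00 00 00 00 00",
    "  can0  101   [8]  00 01 02 03 04 05 06 08",
    "not a frame line",
]
counts = count_frames_by_id(sample)
print({hex(can_id): n for can_id, n in counts.items()})
```

To use it on real traffic, save a few seconds of candump output to a file and pass the file's lines to count_frames_by_id. An empty result means no frames were parsed, which is a reason to stop and recheck the interface.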
Configure task and camera layout
Define the task description, set up the camera(s) at the correct angles, and place task objects in the workspace. A consistent scene setup across episodes is critical for policy generalization.
Start the recording session
Launch the LeRobot control script in record mode. This arms the system for episode recording; the session waits for your start trigger before capturing.
python -m lerobot.scripts.control_robot \
  --robot.type=linkerbot_o6 \
  --control.type=record \
  --control.fps=30 \
  --control.repo_id=your-username/o6-task-name \
  --control.num_episodes=50 \
  --control.single_task="Pick up the blue block"
Teleoperate and record episodes
Use a leader arm (or your keyboard for simple tests) to teleoperate the O6. Press the start/stop key to bracket each episode. Reset the scene between episodes for consistency.
Review and filter episodes
Use the LeRobot replay tool to review each episode visually. Discard any that fail the quality checklist below. Quality over quantity: 30 excellent episodes beat 100 mediocre ones.
python -m lerobot.scripts.control_robot \
  --robot.type=linkerbot_o6 \
  --control.type=replay \
  --control.repo_id=your-username/o6-task-name \
  --control.episode=0
Upload to HuggingFace Hub
Push your filtered dataset to the HuggingFace Hub for sharing and training. The dataset is immediately available for policy training in LeRobot.
huggingface-cli login
python -m lerobot.scripts.push_dataset_to_hub \
--repo_id=your-username/o6-task-name
LeRobot Dataset Format for O6
Each recorded episode is stored in the standard LeRobot HuggingFace dataset format. This format is directly compatible with ACT, Diffusion Policy, and all other LeRobot-supported training algorithms.
Episode structure
dataset/
  data/
    episode_000000/
      observation.state.npy              # [T, 12]: 6 joint positions + 6 velocities
      action.npy                         # [T, 6]: 6 target joint positions
      observation.images.wrist_cam/
        frame_000000.png                 # 640x480 @ 30 fps
        ...
      observation.images.overhead_cam/
        frame_000000.png
        ...
      episode.json                       # {task, success, duration_s, num_frames}
  meta_data/
    info.json                            # dataset schema version, robot type, fps
    stats.json                           # per-channel mean, std, min, max
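To make the stats.json entry concrete, here is a minimal sketch of per-channel statistics over a [T, D] table. It uses only the standard library. The function name channel_stats is hypothetical, and the population standard deviation is an assumption: the exact formula LeRobot uses is not stated here.

```python
import math


def channel_stats(rows):
    """Per-column mean, std, min, max for a [T, D] table given as a list of rows."""
    if not rows:
        raise ValueError("need at least one row")
    stats = {"mean": [], "std": [], "min": [], "max": []}
    for col in zip(*rows):  # iterate over channels (columns)
        mean = sum(col) / len(col)
        # Population variance (divide by T); an assumption, not confirmed by the source.
        var = sum((x - mean) ** 2 for x in col) / len(col)
        stats["mean"].append(mean)
        stats["std"].append(math.sqrt(var))
        stats["min"].append(min(col))
        stats["max"].append(max(col))
    return stats


# Toy table with T=3 timesteps and D=2 channels.
rows = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0]]
stats = channel_stats(rows)
print(stats["mean"], stats["min"], stats["max"])
```

Statistics like these are typically used to normalize states and actions before training, which is why they are stored per channel rather than per episode.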
State and action dimensions
# observation.state: [T, 12]
# Columns: [j0_pos, j1_pos, j2_pos, j3_pos, j4_pos, j5_pos,
# j0_vel, j1_vel, j2_vel, j3_vel, j4_vel, j5_vel]
# Units: radians and radians/second
# action: [T, 6]
# Columns: [j0_target, j1_target, j2_target, j3_target, j4_target, j5_target]
# Units: radians
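The column layout above can be checked with a short sketch like the following. It splits each 12-value state row into positions and velocities and verifies that state and action have matching lengths. The helper names split_state and check_episode_shapes are hypothetical, and the code works on plain row sequences so it runs without extra dependencies.

```python
NUM_JOINTS = 6
STATE_DIM = 12   # 6 joint positions followed by 6 joint velocities
ACTION_DIM = 6   # 6 target joint positions


def split_state(state_rows):
    """Split [T, 12] state rows into ([T, 6] positions, [T, 6] velocities)."""
    positions, velocities = [], []
    for t, row in enumerate(state_rows):
        if len(row) != STATE_DIM:
            raise ValueError(f"row {t}: expected {STATE_DIM} values, got {len(row)}")
        positions.append(list(row[:NUM_JOINTS]))   # radians
        velocities.append(list(row[NUM_JOINTS:]))  # radians/second
    return positions, velocities


def check_episode_shapes(state_rows, action_rows):
    """Raise ValueError unless state is [T, 12] and action is [T, 6] with equal T."""
    if len(state_rows) != len(action_rows):
        raise ValueError("state and action have different numbers of timesteps")
    split_state(state_rows)  # validates every state row
    for t, row in enumerate(action_rows):
        if len(row) != ACTION_DIM:
            raise ValueError(f"action row {t}: expected {ACTION_DIM} values, got {len(row)}")


# Synthetic episode with T=3 timesteps.
state = [[float(i) for i in range(12)] for _ in range(3)]
action = [[0.0] * 6 for _ in range(3)]
check_episode_shapes(state, action)
pos, vel = split_state(state)
print(pos[0], vel[0])
```

Rows of an array loaded with NumPy can be passed in the same way, since the helpers only call len() and slice each row.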
Train a policy from your O6 dataset
python -m lerobot.scripts.train \
  --dataset_repo_id=your-username/o6-task-name \
  --policy.type=act \
  --output_dir=./checkpoints/o6-act-v1 \
  --training.num_epochs=100
Episode Quality Checklist
Apply this checklist to every episode before including it in your training dataset. Bad data is worse than less data.
- ✓ Task completed successfully: the arm reached the goal state without human intervention. No partial completions.
- ✓ Motion is smooth and deliberate: no jerky corrections, overshoots, or sudden direction changes. Smooth demonstrations train smoother policies.
- ✓ All camera frames present: no dropped frames, no occlusions of the task-relevant workspace region.
- ✓ Joint states are continuous: no timestep gaps greater than 40 ms in the state log (the frame period at 30 fps is about 33 ms).
- ✓ Episode duration is consistent: episodes shorter than 3 s or longer than 30 s are usually outliers. Review them before including.
- ✓ Scene was reset identically: task objects were returned to the same starting position before the episode began.
- ✓ No CAN errors during recording: check the candump can0 logs for error frames during the session.
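The numeric items in this checklist can be pre-screened automatically before visual review. The sketch below is one way to do that, under stated assumptions: meta is a dict shaped like episode.json (success, duration_s), and timestamps is a hypothetical list of per-sample times in seconds taken from the state log. Smoothness, occlusions, and scene resets still need a human look.

```python
MAX_GAP_S = 0.040       # checklist: no timestep gaps greater than 40 ms
MIN_DURATION_S = 3.0    # checklist: shorter episodes are usually outliers
MAX_DURATION_S = 30.0   # checklist: longer episodes are usually outliers


def check_episode(meta, timestamps):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []
    if not meta.get("success", False):
        problems.append("task not marked successful")
    duration = meta.get("duration_s", 0.0)
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        problems.append(f"duration {duration:.1f} s is outside 3-30 s; review manually")
    # Largest spacing between consecutive samples in the state log.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    worst = max(gaps, default=0.0)
    if worst > MAX_GAP_S:
        problems.append(f"timestep gap of {worst * 1000:.0f} ms exceeds 40 ms")
    return problems


# A clean 5 s episode sampled at 30 fps, and a short failed one with a dropout.
good = check_episode({"success": True, "duration_s": 5.0},
                     [i / 30 for i in range(150)])
bad = check_episode({"success": False, "duration_s": 1.0},
                    [0.0, 0.03, 0.13])
print(good)
print(bad)
```

Returning a list of problems rather than a single boolean keeps the reason for each rejection visible, which helps when deciding whether an outlier episode is still worth keeping.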