Paper, Hugging Face. Published: 2026-06-09. HF ↑4

World Model Self-Distillation: Training World Models to Solve General Tasks

Authors: Sebastian Stapf, Pablo Acuaviva Huertos, Aram Davtyan, Paolo Favaro

Abstract

Pretrained video generators are promising visual world models that exhibit emergent task-solving abilities; however, their reliance on detailed textual descriptions limits their direct use for planning and decision-making. Existing approaches either outsource this reasoning to language or vision-lan…

#rl #multimodal #robotics #benchmark #diffusion
