SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning
Abstract
Spatial reasoning, the ability to determine where objects are, how they relate, and how they move in 3D, remains a fundamental challenge for vision-language models (VLMs). Tool-augmented agents attempt to address this by equipping VLMs with specialist perception modules, yet their effectiveness is …