RoboPanoptes: The All-Seeing Robot with Whole-body Dexterity

Xiaomeng Xu1    Dominik Bauer2    Shuran Song1,2

1Stanford University      2Columbia University

Paper · Video (YouTube) · Code (Coming Soon) · Hardware (Coming Soon)

RoboPanoptes is a robot system that utilizes all of its body parts to sense and interact with the environment, enabling novel manipulation capabilities such as sweeping multiple or oversized objects, unboxing in narrow spaces, and precise multi-step stowing in cluttered environments.


Technical Summary Video


System Overview

RoboPanoptes achieves whole-body dexterity through whole-body vision. Whole-body dexterity lets the robot use its entire body surface for manipulation, for example leveraging multiple contact points or maneuvering through constrained spaces. Whole-body vision, in turn, uses a camera system distributed over the robot's surface to provide comprehensive, multi-view observations of the environment and visual feedback on the robot's own state.
At its core, RoboPanoptes uses a whole-body visuomotor policy that learns complex manipulation skills directly from human demonstrations, efficiently aggregating information from the camera system while maintaining resilience to sensor failures.
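
The exact policy architecture is not reproduced on this page. As a rough illustration of the idea, the sketch below aggregates features from many body-mounted cameras via learned attention pooling, adds a per-camera pose encoding so the policy knows where each view sits on the robot's body, and accepts a camera mask so dropped frames can be ignored. All module names, dimensions, and the pose parameterization are illustrative assumptions, not the released implementation.

# Illustrative sketch (not the released RoboPanoptes code): fuse N body-mounted
# camera views into a single observation embedding for a visuomotor policy.
import torch
import torch.nn as nn

class WholeBodyEncoder(nn.Module):
    def __init__(self, num_cams=21, feat_dim=256, pose_dim=7):
        super().__init__()
        # Tiny stand-in for a real image backbone (e.g., a small ResNet).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # Hypothetical camera pose encoding: position + quaternion per camera.
        self.pose_enc = nn.Linear(pose_dim, feat_dim)
        # Attention pooling weights views by relevance and can skip masked ones.
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.query = nn.Parameter(0.02 * torch.randn(1, 1, feat_dim))

    def forward(self, images, cam_poses, cam_mask=None):
        # images: (B, N, 3, H, W); cam_poses: (B, N, pose_dim)
        # cam_mask: (B, N) bool, True where a camera frame was dropped.
        B, N = images.shape[:2]
        feats = self.backbone(images.flatten(0, 1)).view(B, N, -1)
        feats = feats + self.pose_enc(cam_poses)  # tag features with body location
        q = self.query.expand(B, -1, -1)
        pooled, _ = self.attn(q, feats, feats, key_padding_mask=cam_mask)
        return pooled.squeeze(1)  # fixed-size embedding for the policy head

In this sketch, pooling over views (rather than concatenating them) keeps the embedding a fixed size no matter how many cameras are currently reporting, which is one way to make resilience to sensor failures tractable.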


Experiments

(1) Unboxing 📦

Task The robot is tasked with opening a lidded box: it must locate the small hole on the box's side, enter the box through the hole, drag the box closer, extend its body inside, lift the lid, and slide the lid aside to fully open the box.
Comparisons [Top-down Camera] policy struggles to determine the correct reaching height. [Head Camera] policy moves in the wrong direction. [w/o Camera Pose Encoding] policy struggles to open the lid and to locate and enter the hole. [w/o Blink Training] policy underperforms when cameras drop out.
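
This page does not spell out the blink training procedure, but the ablation above suggests the policy is trained with camera streams deliberately dropped so it degrades gracefully under real frame drops at test time. A minimal, hypothetical sketch of that idea (the function name and drop_prob are assumptions):

# Hypothetical "blink training" augmentation: randomly mask camera streams
# each training step so the policy learns to tolerate dropped frames.
import torch

def blink_mask(images, drop_prob=0.1):
    # images: (B, N, 3, H, W). Returns blacked-out images and a (B, N) bool
    # mask (True = camera dropped this step), mimicking sensor failure.
    B, N = images.shape[:2]
    dropped = torch.rand(B, N, device=images.device) < drop_prob
    images = images.masked_fill(dropped[:, :, None, None, None], 0.0)
    return images, dropped

# During training, the mask could be passed to the encoder's attention pooling
# so dropped views are ignored rather than read as black frames:
#   imgs, mask = blink_mask(batch_images)
#   obs = encoder(imgs, cam_poses, cam_mask=mask)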

[Policy rollout videos] Ours (×4) · Top-down Camera · Head Camera · w/o Camera Pose Encoding · w/o Blink Training
All Policy Rollouts

(2) Sweeping 🧱

Task The robot is tasked with sweeping all objects (small or large, randomly placed on the table or under the shelf) into a target region around its base.
Comparisons [Top-down Camera] policy fails to detect objects under the shelf and often knocks down tall objects. [Neck Cameras] policy struggles with objects located behind the robot due to self-occlusion.

[Policy rollout videos] Ours (×3) · Top-down Camera (×2) · Neck Cameras
Efficiency Comparisons With whole-body sweeping, the robot clears object piles much more efficiently by leveraging multiple contact points (2 s/block in teleoperation, 3.2 s/block in policy rollouts) than when using only the end-effector to move objects one by one (9.3 s/block in teleoperation).
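
For concreteness, the reported per-block times imply roughly a 4.7× speedup for whole-body teleoperation over end-effector-only teleoperation, and about 2.9× for autonomous whole-body rollouts:

# Speedups implied by the reported sweeping times (seconds per block).
ee_teleop, wb_teleop, wb_rollout = 9.3, 2.0, 3.2
print(f"whole-body teleop vs EE-only teleop:  {ee_teleop / wb_teleop:.1f}x")   # ~4.7x
print(f"whole-body rollout vs EE-only teleop: {ee_teleop / wb_rollout:.1f}x")  # ~2.9x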

[Efficiency videos] Teleop EE Only · Teleop Whole-body · Rollout Whole-body
All Policy Rollouts

(3) Stowing 👟

Task The robot performs a sequence of actions to stow shoes in a drawer: hook and pull the drawer handle to open it, pick up a pair of shoes (one by one), place the shoes inside the drawer, and push the drawer to close it.
Comparisons [Top-down Camera] policy fails to locate the handle. [w/o Camera Pose Encoding] policy's actions are less precise, leading to errors like missing the shoe or misaligning the drawer.

[Policy rollout videos] Ours · Top-down Camera · Ours · w/o Camera Pose Encoding
All Policy Rollouts

Camera View Visualizations

We visualize views from all 21 cameras during policy rollouts. Note that in the sweeping (small) task, RoboPanoptes is robust to frame drops in the top view.

[Camera view videos] Unboxing · Sweeping (Small) · Sweeping (Large) · Stowing

Citation


@article{xu2025robopanoptes,
  title={RoboPanoptes: The All-seeing Robot with Whole-body Dexterity},
  author={Xu, Xiaomeng and Bauer, Dominik and Song, Shuran},
  journal={arXiv preprint arXiv:2501.05420},
  year={2025}
}

Contact

If you have any questions, please feel free to contact Xiaomeng Xu.

Acknowledgement

The authors would like to thank Han Zhang, Ken Wang, Yifan Hou, Haoyu Xiong, Haochen Shi, Mengda Xu, Cheng Chi, Zhenjia Xu, and Adrian Wong for their advice on the design of the robot and the implementation of the hardware. We thank Huy Ha for his help with visualizations and video editing. In addition, we would like to thank all REALab members: Mandi Zhao, Zeyi Liu, Max Du, Hojung Choi, Austin Patel, Shuang Li, Chuer Pan, Yihuai Gao, John So, and Eric Liang for fruitful discussions. This work was supported in part by NSF Awards #2143601, #2037101, and #2132519. Dominik Bauer is partially supported by the Austrian Science Fund (FWF) under project #J4683. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the sponsors.