Hierarchical Semantic Conditioning for Pose-aware Object Manipulation

Sep 1, 2025 ·
Chongyang Xu
,
Cheng Shen
,
Haipeng Li
,
Haoqiang Fan
,
Ziliang Feng
,
Shuaicheng Liu
Abstract
By lifting DINOv2 and Stable Diffusion features into hierarchical 3D semantic fields, this work explicitly models interactions between object parts to enable fine-grained understanding in diffusion-based policies, overcoming the semantic limitations of purely geometric representations for pose-aware manipulation.
Type
Publication
Submitted to ICRA 2026
Chongyang Xu
Authors
🎓 Ph.D. Student @ Sichuan University
🔭 Embodied AI Intern @ Tongyi Robotics

Hello! 👋

I’m Chongyang, a researcher who’s into physical AI & robotics, equally passionate about sports, music, humanities, and sociology. I’m doing multimodal learning and reinforcement learning in the grandest simulator of all: life, one episode at a time, learning what’s worth the strife.

I’m openly seeking collaborations. If you have any research ideas or projects, feel free to reach out!

Education 🎓

I’ve been studying at Sichuan University for 7 years and have fallen deeply in love with Chengdu. I received my B.Eng. in Software Engineering and am now pursuing my Ph.D. in Computer Science.