Hierarchical Semantic Conditioning for Pose-aware Object Manipulation
Sep 1, 2025
Chongyang Xu
Cheng Shen
Haipeng Li
Haoqiang Fan
Ziliang Feng
Shuaicheng Liu
Abstract
By lifting DINOv2 and Stable Diffusion features into hierarchical 3D semantic fields, this work explicitly models interactions between object parts to enable fine-grained understanding in diffusion-based policies, overcoming the semantic limitations of purely geometric representations for pose-aware manipulation.
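To make "lifting" 2D features into 3D concrete, here is a minimal sketch, not the authors' implementation. It assumes a pinhole camera, points already expressed in the camera frame, and feature maps stored at image resolution as nested lists. The function name `lift_features` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration. Each 3D point is projected to a pixel and receives the concatenated features of every map at that pixel (for example one map from DINOv2 and one from Stable Diffusion). The hierarchical structure and the part-interaction modeling described in the abstract are not shown.

```python
from typing import List, Optional, Sequence, Tuple

# A feature map is indexed as feat_map[row][col] -> feature vector.
FeatureMap = Sequence[Sequence[Sequence[float]]]


def lift_features(
    points_cam: Sequence[Tuple[float, float, float]],
    feat_maps: Sequence[FeatureMap],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Optional[List[float]]]:
    """Attach per-pixel 2D features to 3D points (camera frame).

    Assumption: all feature maps share the image resolution, so a
    projected pixel indexes each map directly (nearest-pixel lookup).
    Points behind the camera or outside the image get None.
    """
    height = len(feat_maps[0])
    width = len(feat_maps[0][0])
    lifted: List[Optional[List[float]]] = []
    for x, y, z in points_cam:
        if z <= 0:  # behind the camera: no valid projection
            lifted.append(None)
            continue
        # Pinhole projection to pixel coordinates (u = column, v = row).
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            feature: List[float] = []
            for fmap in feat_maps:  # concatenate features from every source
                feature.extend(fmap[v][u])
            lifted.append(feature)
        else:
            lifted.append(None)
    return lifted
```

In practice the feature maps would come from the pretrained backbones and would usually be sampled with bilinear interpolation over several views. The nearest-pixel lookup from a single view is used here only to keep the sketch short.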
Type
Publication
Submitted to ICRA 2026

Authors
Ph.D. Student @ Sichuan University
Embodied AI Intern @ Tongyi Robotics
Hello!
I’m Chongyang, a researcher who’s into physical AI & robotics, equally passionate about sports, music, humanities, and sociology. I’m doing multimodal learning and reinforcement learning in the grandest simulator of all: life, one episode at a time, learning what’s worth the strife.
I’m openly seeking collaborations. If you have any research ideas or projects, feel free to reach out!
Education
I’ve been studying at Sichuan University for 7 years and have fallen deeply in love with Chengdu. I received my B.Eng. in Software Engineering and am now pursuing my Ph.D. in Computer Science.