Weihang Li

Hi, I'm a PhD student at TUM CAMP, PRS, and MCML, supervised by Prof. Benjamin Busam and Prof. Nassir Navab. During my studies, I conducted research at CAMP, at PRS with Prof. Olaf Wysocki, at HKUST-GZ with Prof. Haoang Li, and at CVG with Prof. Daniel Cremers.

My research interests lie at the intersection of 3D computer vision and robotics, with a focus on camera/object localization, 3D/4D reconstruction, depth estimation, neural scene representations, and robot grasping. I am also broadly interested in large language models (LLMs), multi-modal learning that combines vision and language, and embodied AI.

Email / Google Scholar / GitHub / LinkedIn

News

[10-2025]    We are co-organizing the workshop and challenge on Category-Level Object Pose Estimation in the Wild and the Transparent & Reflective objects In the wild Challenges at ICCV 2025.
[07-2025]    Our paper HouseCat-TRICKY has been accepted to ICCVW 2025.
[04-2025]    Our paper Texture2LoD3 has been accepted to CVPRW 2025.
[02-2025]    Our paper GCE-Pose has been accepted to CVPR 2025.
[10-2024]    I successfully defended my Master's Thesis (GCE-Pose) at CAMP with the highest grade, 1.0.
[09-2024]    Our paper SCRREAM has been accepted to NeurIPS 2024.
[07-2024]    Our paper kb-PbD has been accepted to IROS 2024.
[06-2024]    Our team received an Honorable Mention Award in the S23DR Challenge at CVPR 2024.

Research

	Texture2LoD3: Enabling LoD3 Building Reconstruction With Panoramic Images
	Wenzhao Tang, Weihang Li, Xiucheng Liang, Olaf Wysocki, Filip Biljecki, Christoph Holst, Boris Jutzi
	Computer Vision and Pattern Recognition Conference Workshop on Urban Scene Modeling (CVPRW), 2025
	arXiv / Project Page / Code
	Texture2LoD3 proposes leveraging ubiquitous street-level images and low-level building models for accurate ortho-texturing (left), enabling accurate semantic segmentation (center) and facade-rich LoD3 reconstruction (right).

	GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation
	Weihang Li, Hongli Xu, Junwen Huang, HyunJun Jung, Peter KT Yu, Nassir Navab, Benjamin Busam
	Computer Vision and Pattern Recognition Conference (CVPR), 2025
	arXiv / Project Page
	A semantic shape reconstruction module that recovers complete object geometry from partial observations, combined with a global context-enhanced feature fusion mechanism that leverages category-level semantic and shape priors for robust pose prediction.

	DynSUP: Dynamic Gaussian Splatting from An Unposed Image Pair
	Weihang Li, Weirong Chen, Shenhan Qian, Benjamin Busam, Daniel Cremers, Haoang Li
	arXiv, 2024
	arXiv / Project Page / Code
	A novel method for Gaussian splatting from an unposed image pair in dynamic environments.

	SCRREAM: SCan, Register, REnder And Map: A Framework for Annotating Accurate and Dense 3D Indoor Scenes with a Benchmark
	HyunJun Jung, Weihang Li, Shun-Cheng Wu, William Bittner, Nikolas Brasch, Jifei Song, Eduardo Pérez-Pellitero, Zhensong Zhang, Arthur Moreau, Nassir Navab, Benjamin Busam
	In Proceedings of the Neural Information Processing Systems (NeurIPS), 2024
	arXiv / Project Page / Code
	A framework for annotating accurate and dense 3D indoor scenes, with a benchmark on novel view synthesis and SLAM.

	Knowledge-based Programming by Demonstration using semantic action models for industrial assembly
	Junsheng Ding, Haifan Zhang, Weihang Li, Liangwei Zhou, Alexander Perzylo
	International Conference on Intelligent Robots and Systems (IROS), 2024
	Paper / Project Page / Code / Video
	A knowledge-based Programming by Demonstration (kb-PbD) paradigm to facilitate robot programming in small and medium-sized enterprises (SMEs).

Experiences

TUM Photogrammetry & Remote Sensing	Mentor: Olaf Wysocki	04/2024 - 03/2025
fortiss Robotics Lab	Mentors: Junsheng Ding, Alexander Perzylo	10/2022 - 09/2023

Teaching

Teaching Assistant

Academic Services

Conference Reviewer: CVPR, ICCV, IROS, NeurIPS

Last updated: July 2025