Xun Jiang

I am a final-year Ph.D student in the Successive Postgraduate and Doctoral Program at the CFM Lab of the University of Electronic Science and Technology of China (UESTC), supervised by Prof. Heng Tao Shen and co-supervised by Prof. Fumin Shen and Prof. Xing Xu. Before that, I earned my bachelor's degree in Software Engineering from UESTC in 2020, where I was recognized as an Honor Graduate. I am currently a visiting Ph.D student at the MReal Lab at Nanyang Technological University, under the supervision of Prof. Hanwang Zhang.

My research interests include, but are not limited to, Multimedia Retrieval, Multimodal Learning, Video Content Understanding, and VLM/LLM Applications. I currently serve as a reviewer for IEEE TPAMI, IEEE TIP, IEEE TMM, IEEE TCSVT, ACM TOIS, ACM TOMM, PR, CVPR, ICCV, ACM MM, AAAI, WWW, ICASSP, etc.

Feel free to contact me if you are interested in discussing my research topics!

Email: xun_jiang@outlook.com

Google Scholar / GitHub

Education

Dec. 2024 - Present	Nanyang Technological University (NTU), Singapore; College of Computing and Data Science; Visiting Ph.D Student in Computer Science and Technology; Awarded CSC Scholarship
Dec. 2020 - Present	University of Electronic Science and Technology of China (UESTC), China; School of Computer Science and Engineering; Ph.D Student in Computer Science and Technology (Successive Postgraduate and Doctoral Program)
Sep. 2016 - Jun. 2020	University of Electronic Science and Technology of China (UESTC), China; School of Software Engineering; Bachelor's Degree in Software Engineering; Awarded Honor Graduates of UESTC

News

[2025/08] 2 papers about multimodal learning and egocentric video understanding accepted by AAAI 2026! 🎉🎉 Congratulations!
[2025/08] We won the ACM MM 2025 Grand Challenge ERR@HRI 2.0! 🎉🎉 Our technical report has been accepted for publication in the proceedings.
[2025/07] 1 paper about multimodal learning accepted by ACM MM 2025! 🎉🎉 Congratulations to Disen!
[2025/04] 2 papers about multimodal learning and video event retrieval accepted by ICMR 2025! 🎉🎉 Congratulations!
[2025/03] 4 papers about egocentric video analysis and multimodal learning accepted by ICME 2025! 🎉🎉 Congratulations!
[2025/02] 1 paper about egocentric procedural video verification accepted by CVPR 2025!
[2025/01] Awarded the CAST inaugural Doctoral Student Special Plan of the Young Elite Scientists Sponsorship Program! 🥳

Publications

Currently, I'm working on multimodal learning, egocentric video understanding, and applications of vision-language models.

Geometric Gradient Divergence Modulation for Imbalanced Multimodal Learning
Disen Hu, Xun Jiang, Zhe Sun, Hao Yang, Chong Peng, Peng Yan, Heng Tao Shen, Xing Xu
ACM International Conference on Multimedia, ACM MM 2025
[Paperlink], [Code]
Key Words: Multimodal Learning; Imbalanced Multimodal Learning; Hyperspace Polyhedron

Procedural Heterogeneous Graph Completion for Natural Language Task Verification in Egocentric Videos
Xun Jiang, Zhiyi Huang, Xing Xu, Jingkuan Song, Fumin Shen, Heng Tao Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025
[Paperlink], [Code]
Key Words: Natural Language-based Egocentric Task Verification; Heterogeneous Graph Completion; Multimodal Learning; Procedural Task Understanding

Counterfactually Augmented Event Matching for De-biased Temporal Sentence Grounding
Xun Jiang, Zhuoyuan Wei, Shenshen Li, Xing Xu, Jingkuan Song, Heng Tao Shen
ACM International Conference on Multimedia, ACM MM 2024
[Paperlink], [Code]
Key Words: Video Content Understanding; De-biased Video Grounding; Counterfactual Reasoning; Multimodal Learning

Zero-Shot Video Moment Retrieval with Angular Reconstructive Text Embeddings
Xun Jiang, Xing Xu, Zailei Zhou, Yang Yang, Fumin Shen, Heng Tao Shen
IEEE Transactions on Multimedia, TMM 2024
[Paperlink], [Code]
Key Words: Video Content Understanding; Weakly-Supervised Learning; CLIP

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion
Zixian Gao*, Xun Jiang* (* equal contribution), Xing Xu, Fumin Shen, Yujie Li, Heng Tao Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
[Paperlink], [Code]
Key Words: Multimodal Learning; Model Robustness; Uncertainty in Deep Learning

Joint Searching and Grounding: Multi-Granularity Video Content Retrieval
Zhiguo Chen*, Xun Jiang* (* equal contribution), Xing Xu, Zuo Cao, Yijun Mo, Heng Tao Shen
ACM International Conference on Multimedia, ACM MM 2023
[Paperlink], [Code]
Key Words: Multimedia Retrieval; Video Content Understanding; Multimodal Learning

Multi-Grained Attention Network with Mutual Exclusion for Composed Query-Based Image Retrieval
Shenshen Li, Xing Xu, Xun Jiang, Fumin Shen, Xin Liu, Heng Tao Shen
IEEE Transactions on Circuits and Systems for Video Technology, TCSVT 2023
[Paperlink], [Code]
Key Words: Cross-modal Retrieval; Composed Query-Based Image Retrieval; Multimodal Learning

Faster Video Moment Retrieval with Point-Level Supervision
Xun Jiang, Zailei Zhou, Xing Xu, Yang Yang, Guoqing Wang, Heng Tao Shen
ACM International Conference on Multimedia, ACM MM 2023
[Paperlink], [Code]
Key Words: Video Content Retrieval; Point-level Supervision; Retrieval Efficiency

SDN: Semantic Decoupling Network for Temporal Language Grounding
Xun Jiang, Xing Xu, Jingran Zhang, Fumin Shen, Zuo Cao, Heng Tao Shen
IEEE Transactions on Neural Networks and Learning Systems, TNNLS 2022
[Paperlink], [Code]
Key Words: Video Content Understanding; Vision-Language; Multimodal Learning

DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing
Xun Jiang, Xing Xu, Zhiguo Chen, Jingran Zhang, Jingkuan Song, Fumin Shen, Huimin Lu, Heng Tao Shen
ACM International Conference on Multimedia, ACM MM 2022
[Paperlink], [Code]
Key Words: Video Content Understanding; Action Localization; Audio-Visual Learning

Semi-Supervised Video Paragraph Grounding With Contrastive Encoder
Xun Jiang, Xing Xu, Jingran Zhang, Fumin Shen, Zuo Cao, Heng Tao Shen
IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
[Paperlink]
Key Words: Video Content Understanding; Semi-Supervised Learning; Multimodal Learning

Awards

First Place at ACM MM 2025 Grand Challenge Track, ERR@HRI 2.0 Sub-Challenge 2, 2025.08
CAST inaugural Doctoral Student Special Plan of the Young Elite Scientists Sponsorship Program, 2025.01
China Scholarship Council (CSC) Scholarship, 2024.07
Doctoral National Scholarship, 2023.10, 2024.10
"Academic Newcomer" Ph.D Student Honor Award of UESTC, 2023.04
ICME Best Student Paper, 2022.07
"Academic Youth" Postgraduate Student Honor Award of UESTC, 2022.04
Honor Graduates of UESTC, 2020.06

This template is borrowed from Lei Wang's website source code. Many thanks to Lei Wang.