Zijie Xin 辛梓杰

AI/CS Ph.D. student at Renmin University of China, Beijing

email: xinzijie@ruc.edu.cn

I am a first-year Ph.D. student in the AI & Media Computing Lab at Renmin University of China, advised by Prof. Xirong Li.

I obtained my Bachelor’s degree with honors in the Top-notch Program (a class of 15 elite students selected from 400+) from Sichuan University in 2024, under the supervision of Prof. Qijun Zhao. I’ve interned at Tencent and Kuaishou.

My research primarily revolves around multimedia learning, video understanding, cross-modal retrieval, and open-set recognition, complemented by a broad curiosity about generative models, RAG, RL, and LLMs.

News

Jul 7, 2025 I joined Tencent as a research intern working on video understanding.
Jun 26, 2025 Our two papers on Music Grounding by Short Video and Sketch Animation have been accepted to ICCV 2025! I’m proud to be the first author of MGSV. 🎉
Mar 21, 2025 Our paper on Text-based Person Search has been accepted to ICME 2025! Congratulations to Yuchuan! 🎉
Sep 7, 2024 I officially started my PhD at RUC under the supervision of Professor Xirong Li in the AIMC Lab. 👨‍🎓
Jun 28, 2024 I graduated with my bachelor’s degree from SCU and was honored as an Outstanding Graduate of Sichuan University! 🎉
Feb 27, 2024 Our paper on a Multi-Grained Teaching Strategy for Efficient Text-to-Video Retrieval has been accepted to CVPR 2024! 🎉
Nov 29, 2023 I joined Kuaishou as a research intern working on video-music retrieval.
Oct 8, 2023 After heading to Beijing and joining the AI & Media Computing Lab, I unofficially started my PhD adventure! 🤓
Apr 20, 2023 I joined GeWu-Lab as a short-term intern.

Publications

* Equal Contribution | † Corresponding Author

  1. Music Grounding by Short Video
    Zijie Xin, Minquan Wang, Jingyu Liu, Ye Ma, Quan Chen, Peng Jiang, and Xirong Li†
    In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025
  2. Multi-Object Sketch Animation by Scene Decomposition and Motion Planning
    Jingyu Liu, Zijie Xin, Yuhan Fu, Ruixiang Zhao, Bangxiang Lan, and Xirong Li†
    In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025
  3. Holistic Features are almost Sufficient for Text-to-Video Retrieval
    Kaibin Tian*, Ruixiang Zhao*, Zijie Xin, Bangxiang Lan, and Xirong Li†
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
  4. DAPL: Integration of Positive and Negative Descriptions in Text-Based Person Search
    Yuchuan Deng, Zhanpeng Hu, Zijie Xin, Chuang Deng, and Qijun Zhao†
    In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), 2025
  5. Learning Partially-Decorrelated Common Spaces for Ad-hoc Video Search
    Fan Hu, Zijie Xin, and Xirong Li†
    2025