About Me
I am a master's student in Computer Science at Tongji University, expected to graduate in March 2026. My research primarily focuses on Multimodal Intelligence, including Multimodal Retrieval and Generation, Multimodal Large Language Models, and Multi-Agent Interaction.
My work aims to advance the ability of AI systems to understand and reason across multiple modalities while addressing challenges such as privacy preservation, trustworthiness, and model efficiency. Recently, my work has centered on the following areas:
- Retrieval-Augmented Memory Agents
- World Models for Agent Reasoning
- Post-training of Multimodal Large Language Models
Actively Seeking 2026 Fall PhD Positions
I am actively seeking PhD opportunities starting Fall 2026. I would be thrilled to work with prospective advisors and research groups. Please feel free to contact me.
News
- [01/2026] Our paper DMM has been accepted by ICASSP 2026!
- [01/2026] Our paper AMID has been accepted by WWW 2026!
- [11/2025] Our paper ReBrain has been accepted by WACV 2026!
- [08/2025] Our paper COGO has been accepted by PRCV 2025!
- [07/2025] Our paper HM-RAG has been accepted by ACM MM 2025!
- [07/2025] Our paper VaLiK has been accepted by ICCV 2025!
- [01/2025] Joined Shanghai Artificial Intelligence Laboratory as an Intern!
Selected Publications

Junming Liu, S Meng, Y Gao, S Mao, P Cai, G Yan, Y Chen, Z Bian, D Wang, B Shi

P Liu, X Liu, R Yao, Junming Liu, S Meng, D Wang, J Ma.
For a complete list of publications, please refer to my publications page.
Internships
- 2023.06 - 2023.09, Embedded Engineer, Shanghai NIO Automobile Co., Ltd.
  - Controlled steering and braking systems under tire blowout scenarios using image and radar data to prevent rollover and loss of control.
- 2025.01 - Present, Research Scientist, Shanghai Artificial Intelligence Laboratory
  - Conducted research on multimodal large language models.
Professional Service
I serve as a reviewer for ICME, SMC, and AAAI, contributing to the peer-review process in the fields of computer vision, multimodal AI, and machine learning.
Skills & Tools
- Programming & Frameworks: C, C++, Python, Java, Go, SQL, Rust, MPI, NCCL, DeepSpeed, DDP, FSDP
- Languages: Chinese (native), English (IELTS 7.0), Japanese, German
