Publications
* indicates equal contribution
|
|
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
Sule Bai, Mingxing Li, Yong Liu, Jing Tang, Haoji Zhang, Lei Sun, Xiangxiang Chu, Yansong Tang
arXiv Preprint, 2025
[Paper]
[Code]
[Project Page]
|
|
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
Sule Bai*, Yong Liu*, Yifei Han, Haoji Zhang, Yansong Tang, Jie Zhou, Jiwen Lu
IEEE Transactions on Image Processing (TIP) (CCF-A, IF=13.7), 2025
[Paper]
[Code]
|
|
Narrative Action Evaluation with Prompt-Guided Multimodal Interaction
Sule Bai*, Shiyi Zhang*, Guangyi Chen, Lei Chen, Jiwen Lu, Junle Wang, Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper]
[Code]
|
|
Open-Vocabulary Segmentation with Semantic-Assisted Calibration
Yong Liu*, Sule Bai*, Guanbin Li, Yitong Wang, Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper]
[Code]
|
|
Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation
Yong Liu*, Songli Wu*, Sule Bai*, Jiahao Wang, Yansong Tang
IEEE/CVF International Conference on Computer Vision (ICCV), 2025
[Paper]
|
|
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
Haochen Wang, Xiangtai Li, Zilong Huang, Anran Wang, Jiacong Wang, Tao Zhang, Jiani Zheng, Sule Bai, Zijian Kang, Jiashi Feng, Zhuochen Wang, Zhaoxiang Zhang
International Conference on Learning Representations (ICLR), 2026
[Paper]
[Code]
|
|
Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
Haoji Zhang, Xin Gu, Jiawen Li, Chixiang Ma, Sule Bai, Chubin Zhang, Bowen Zhang, Zhichao Zhou, Dongliang He, Yansong Tang
arXiv Preprint, 2025
[Paper]
[Code]