Publications
* indicates equal contribution
|
|
UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
Sule Bai, Mingxing Li, Yong Liu, Jing Tang, Haoji Zhang, Lei Sun, Xiangxiang Chu, Yansong Tang
arXiv Preprint, 2025
[Paper]
[Code]
[Project Page]
|
|
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
Sule Bai*, Yong Liu*, Yifei Han, Haoji Zhang, Yansong Tang
arXiv Preprint, 2024
[Paper]
[Code]
|
|
Open-Vocabulary Segmentation with Semantic-Assisted Calibration
Yong Liu*, Sule Bai*, Guanbin Li, Yitong Wang, Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper]
[Code]
|
|
Narrative Action Evaluation with Prompt-Guided Multimodal Interaction
Sule Bai*, Shiyi Zhang*, Guangyi Chen, Lei Chen, Jiwen Lu, Junle Wang, Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper]
[Code]
|
|
Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation
Yong Liu*, Songli Wu*, Sule Bai*, Jiahao Wang, Yansong Tang
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
[Paper]
|
|
Thinking With Videos: Multimodal Tool-Augmented Reinforcement Learning for Long Video Reasoning
Haoji Zhang, Xin Gu, Jiawen Li, Chixiang Ma, Sule Bai, Chubin Zhang, Bowen Zhang, Zhichao Zhou, Dongliang He, Yansong Tang
arXiv Preprint, 2025
[Paper]
[Code]
|
|
Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology
Haochen Wang, Xiangtai Li, Zilong Huang, Anran Wang, Jiacong Wang, Tao Zhang, Jiani Zheng, Sule Bai, Zijian Kang, Jiashi Feng, Zhuochen Wang, Zhaoxiang Zhang
arXiv Preprint, 2025
[Paper]
[Code]
|
|