
Agent-Based Control Prompt Tuning for Video-Text Retrieval

Large-scale image-text pre-trained models have shown promising transferability to various downstream tasks. Video-text retrieval benefits from this by transferring pre-trained CLIP to the video-text domain. Although these pre-trained models achieve impressive performance, full fine-tuning becomes prohibitively expensive as their size grows. To address this, parameter-efficient tuning methods have been proposed, and prompt tuning is one of the most promising directions. However, existing prompt tuning methods fall short in performance due to the lack of cross-modal interaction and of prompt reliability assurance. To address these issues, we propose an effective and efficient Agent-based Control Prompt Tuning method (AbC-PT) for parameter-efficient video-text retrieval. The proposed AbC-PT enjoys several merits. First, we design a parameter-efficient agent decoder with a carefully designed consistent attention mechanism to effectively capture video temporal information, mine contextual texts, and perform cross-modal interaction between them. Second, we introduce two sets of prompts: the vanilla prompt prepended to the input tokens, and the concept prompt serving as the agent of the agent decoder. To ensure cross-modal semantic consistency of the concept prompt, we design a semantic consistency constraint loss. Third, we devise a parameter-free prompt controller that adaptively calibrates each vanilla prompt based on its semantics in a data-driven way. Extensive experiments on five challenging benchmarks demonstrate that our method not only outperforms state-of-the-art parameter-efficient tuning methods but even surpasses full fine-tuning with only 0.46% parameter overhead.
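The abstract mentions two mechanisms that can be illustrated generically: vanilla prompts prepended to the input token sequence, and a semantic consistency constraint between prompts on the two modalities. The paper's actual formulation is not given here, so the following is only a minimal NumPy sketch under assumed shapes (CLIP-like 512-d embeddings, 8 hypothetical vanilla prompts) and an assumed cosine-similarity form of the consistency loss:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512  # embedding dimension (CLIP-like; illustrative only)

def prepend_prompts(prompts, tokens):
    """Concatenate learnable prompt vectors in front of the token embeddings,
    the standard 'prepend to input tokens' form of prompt tuning."""
    return np.concatenate([prompts, tokens], axis=0)

vanilla_prompts = rng.standard_normal((8, d)) * 0.02  # 8 hypothetical learnable prompts
text_tokens = rng.standard_normal((77, d))            # e.g. a CLIP text token sequence
seq = prepend_prompts(vanilla_prompts, text_tokens)
print(seq.shape)  # (85, 512)

def cosine_consistency_loss(a, b, eps=1e-8):
    """One plausible semantic consistency constraint: 1 minus the mean cosine
    similarity between paired prompt vectors from the two modalities."""
    a_n = a / (np.linalg.norm(a, axis=-1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=-1, keepdims=True) + eps)
    return 1.0 - float(np.mean(np.sum(a_n * b_n, axis=-1)))

# Identical video-side and text-side concept prompts give (near-)zero loss.
video_concept = rng.standard_normal((4, d))
text_concept = video_concept.copy()
print(round(cosine_consistency_loss(video_concept, text_concept), 6))  # 0.0
```

In a real implementation the prompt and concept vectors would be trainable parameters optimized jointly with the retrieval objective; here they are fixed arrays purely to show the shapes and the loss behavior.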

Keywords: prompt tuning; video-text retrieval; agent

Journal Title: IEEE Transactions on Circuits and Systems for Video Technology
Year Published: 2025
