ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning

Published in 2024 at "International Journal of Computer Vision"

DOI: 10.1007/s11263-025-02440-4

Abstract: Recent advancements in multimodal fusion have witnessed the remarkable success of vision-language (VL) models, which excel in various multimodal applications such as image captioning and visual question answering. However, building VL models requires substantial hardware…

Keywords: efficient vision; fusion; vision; language ...
