Published in 2022 at "IEEE Transactions on Image Processing"
DOI: 10.1109/tip.2022.3142526
Abstract: Due to the rich spatio-temporal visual content and complex multimodal relations, Video Question Answering (VideoQA) has become a challenging task and attracted increasing attention. Current methods usually leverage visual attention, linguistic attention, or self-attention to…
Keywords:
video question;
temporal semantic;
spatio temporal;
attention ...