LAUSR: text audio

Distinguishing apathy and depression in older adults with mild cognitive impairment using text, audio, and video based on multiclass classification and shapely additive explanations

Published in 2022 at "International Journal of Geriatric Psychiatry"

DOI: 10.1002/gps.5827

Abstract: This study aimed to develop a classification model to detect and distinguish apathy and depression based on text, audio, and video features and to make use of the shapely additive explanations (SHAP) toolkit to increase…

Keywords: audio video; classification; text audio; additive explanations

TA2V: Text-Audio Guided Video Generation

Published in 2024 at "IEEE Transactions on Multimedia"

DOI: 10.1109/tmm.2024.3362149

Abstract: Recent conditional and unconditional video generation tasks have been accomplished mainly based on generative adversarial network (GAN), diffusion, and autoregressive models. However, in some circumstances, using only one modality cannot provide enough semantic information. Therefore,…

Keywords: text audio; video generation; video

Towards Weakly Supervised Text-to-Audio Grounding

Published in 2024 at "IEEE Transactions on Multimedia"

DOI: 10.1109/tmm.2024.3443614

Abstract: Text-to-audio grounding (TAG) task aims to predict the onsets and offsets of sound events described by natural language. This task can facilitate applications such as multimodal information retrieval. This paper focuses on weakly-supervised text-to-audio grounding…

Keywords: weakly supervised; text audio; level; supervised text
