Articles with "image audio" as a keyword



Multimodal diffusion framework for collaborative text image audio generation and applications

Published in 2025 at "Scientific Reports"

DOI: 10.1038/s41598-025-05794-4

Abstract: This paper presents a novel framework for collaborative generation across text, image, and audio modalities using an enhanced diffusion model architecture. We introduce a Hierarchical Cross-modal Alignment Network that establishes unified representations while preserving modality-specific…

Keywords: diffusion; generation; text image; framework collaborative ...

Multimodal Fusion Remote Sensing Image–Audio Retrieval

Published in 2022 at "IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing"

DOI: 10.1109/jstars.2022.3194076

Abstract: Remote sensing image–audio retrieval (RSIAR) has been an emerging research topic in recent years, and many different methods have been proposed for this topic. These RSIAR methods have achieved good retrieval results, but two problems…

Keywords: fusion; remote sensing; image audio; retrieval ...

Fine Aligned Discriminative Hashing for Remote Sensing Image-Audio Retrieval

Published in 2023 at "IEEE Transactions on Geoscience and Remote Sensing"

DOI: 10.1109/tgrs.2023.3269300

Abstract: For cross-modal remote sensing image-audio (RSIA) retrieval task, hashing technology has attracted much attention in recent works. Most of them focus on mapping RS images and audios into a Hamming space, whilst neglecting discriminative information…

Keywords: information; remote sensing; image audio; retrieval ...