Published in 2024 at "International Journal of Computer Vision"
DOI: 10.1007/s11263-025-02440-4
Abstract: Recent advancements in multimodal fusion have witnessed the remarkable success of vision-language (VL) models, which excel in various multimodal applications such as image captioning and visual question answering. However, building VL models requires substantial hardware…
Keywords:
efficient vision;
fusion;
vision;
language ...
Published in 2025 at "Scientific Reports"
DOI: 10.1038/s41598-025-25199-7
Abstract: Identifying emotional states in animals is a key challenge in behavioural science and a prerequisite for developing reliable welfare assessments, ethical frameworks, and robust human–animal communication models. Recently, large vision-language models (LVLMs) such as GPT-4o,…
Keywords:
emotion;
large vision;
language;
vision language ...
Published in 2025 at "Advanced Robotics"
DOI: 10.1080/01691864.2025.2487608
Abstract: Various conditions exist in individual daily life environments. It is important for a daily life support robot to observe states in the daily life environment and perform tasks depending on the living environment. Today, pre-trained…
Keywords:
pre trained;
environment;
daily life;
language ...
Published in 2025 at "IEEE Access"
DOI: 10.1109/access.2025.3535837
Abstract: The fast advancement of Large Vision-Language Models (LVLMs) has shown immense potential. These models are increasingly capable of tackling abstract visual tasks. Geometric structures, particularly graphs with their inherent flexibility and complexity, serve as an…
Keywords:
benchmark generator;
large vision;
language;
vision language ...
Published in 2022 at "IEEE Journal of Biomedical and Health Informatics"
DOI: 10.1109/jbhi.2022.3163751
Abstract: Pathology visual question answering (PathVQA) attempts to answer a medical question posed by pathology images. Despite its great potential in healthcare, it is not widely adopted because it requires interactions on both the image (vision)…
Keywords:
vision language;
question;
language;
pathology ...
Published in 2024 at "IEEE Journal of Biomedical and Health Informatics"
DOI: 10.1109/jbhi.2024.3462653
Abstract: Retinopathy is a group of retinal disabilities that causes severe visual impairments or complete blindness. Due to the capability of optical coherence tomography to reveal early retinal abnormalities, many researchers have utilized it to develop…
Keywords:
language correlation;
proposed framework;
framework;
language ...
Published in 2025 at "IEEE Journal of Biomedical and Health Informatics"
DOI: 10.1109/jbhi.2025.3631270
Abstract: Vision-Language Models (VLMs) have demonstrated impressive capabilities across various medical tasks, including report generation and visual question answering (VQA). However, pixel-level tasks such as image segmentation remain relatively underexplored, despite their critical importance for clinical…
Keywords:
medical vision;
semantic interaction;
language;
vision language ...
Published in 2025 at "IEEE Internet of Things Journal"
DOI: 10.1109/jiot.2025.3624038
Abstract: Autonomous vehicles face challenges in complex environments due to the computational inefficiency of large language models (LLMs) and the lack of multiagent collaboration in existing decision-making approaches. This article proposes a small vision–language model (VLM)-based…
Keywords:
small vision;
decision making;
language;
vision language ...
Published in 2025 at "IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing"
DOI: 10.1109/jstars.2025.3617915
Abstract: Remote sensing image classification models face significant challenges when adapting to new domains due to variations in image acquisition conditions, sensor types, and scene categories. Conventional domain adaptation methods rely on multistage adaptation pipelines with…
Keywords:
adaptation;
image;
remote sensing;
language ...
Published in 2024 at "IEEE Robotics and Automation Letters"
DOI: 10.1109/lra.2024.3483042
Abstract: Vision-and-Language Navigation (VLN) has garnered widespread attention and research interest due to its potential applications in real-world scenarios. Despite significant progress in the VLN field in recent years, limitations persist. Many agents struggle to make…
Keywords:
navigation;
history;
knowledge;
language ...
Published in 2026 at "IEEE Robotics and Automation Letters"
DOI: 10.1109/lra.2025.3629984
Abstract: Vision-language-action models (VLAs) use an end-to-end learning architecture, which can realize the integration of visual perception, semantic understanding and motion control. However, when tackling dynamic or long-horizon tasks, VLAs have poor robustness and…
Keywords:
task;
language;
vision language;
reinforcement learning ...