Various conditions exist in individual daily-life environments. It is important for a daily-life support robot to observe the state of its environment and perform tasks that depend on it. Recently, pre-trained vision-language models have been developed that excel at the general interpretation of images. Against this background, we propose a method that uses a pre-trained vision-language model to classify situations in real daily-life environments for situation-aware task execution. Our classifier requires no additional training and is robust to minor pose changes of objects and of the robot. In our experiments, we successfully clustered a variety of situations, ranging from object configurations to human actions, and executed situation-dependent tasks by mapping cluster results to tasks.
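The core idea described above — zero-shot situation classification with a pre-trained vision-language model, followed by a mapping from recognized situations to tasks — can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the toy embedding vectors, prompt strings, and the `situation_to_task` mapping are all assumptions; in practice the embeddings would come from a pre-trained vision-language model such as CLIP, which places images and text prompts in a shared space without additional training.

```python
import numpy as np

def normalize(v):
    """L2-normalize a vector so that a dot product equals cosine similarity."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Toy stand-ins for text-prompt embeddings of candidate situations.
# (In practice: embeddings of prompts produced by a pre-trained VLM.)
situation_prompts = {
    "table is cluttered": np.array([0.9, 0.1, 0.0]),
    "person is eating":   np.array([0.1, 0.9, 0.1]),
    "room is empty":      np.array([0.0, 0.1, 0.9]),
}

# Hypothetical situation-to-task mapping; task names are illustrative.
situation_to_task = {
    "table is cluttered": "tidy_table",
    "person is eating":   "wait_and_observe",
    "room is empty":      "patrol",
}

def classify_situation(image_embedding):
    """Zero-shot classification: choose the situation whose prompt
    embedding has the highest cosine similarity to the image embedding,
    then look up the task mapped to that situation."""
    img = normalize(image_embedding)
    best = max(situation_prompts,
               key=lambda s: float(img @ normalize(situation_prompts[s])))
    return best, situation_to_task[best]

# Toy image embedding close to the "cluttered table" prompt.
situation, task = classify_situation(np.array([0.85, 0.2, 0.05]))
```

Because the decision is made by cosine similarity in the shared embedding space, small perturbations of the image embedding (e.g. from minor pose changes) leave the nearest prompt, and hence the selected task, unchanged.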