Abstract. Our paper presents a real-time system for the detection, classification, and position estimation of objects in outdoor environments, providing visually impaired individuals with voice-output-based scene perception. The system is low-cost, lightweight, simple, and easily wearable. An Odroid board integrated with a USB camera and a USB laser is used for this purpose. To reduce usability problems, a user-centered design approach was adopted: feedback was obtained from several individuals to understand their problems and requirements, and the insights gained were used to tailor the system to the users' needs. The object detection framework employs a multimodal feature-fusion deep learning architecture that combines edge, multiscale, and optical flow information. Fusing edge information with the raw data is motivated by the observation that stronger edge regions activate more neurons, inducing better feature representations. Learning deep features at multiple scales and incorporating motion dynamics at the feature level yield more semantic and discriminative representations, making the detection framework more robust. Experimental results are presented on the PASCAL VOC 2007 dataset, the Caltech dataset, and real-time data captured with the system.
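The following is a minimal PyTorch sketch, not the authors' implementation, of the two fusion ideas the abstract describes: an edge map computed from the raw image is concatenated with the RGB channels before the convolutional backbone (early edge/raw fusion), and feature maps from several scales are upsampled and concatenated (multiscale fusion). The Sobel-based edge extractor, layer sizes, and channel counts are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SobelEdge(nn.Module):
    """Fixed Sobel filters that turn an RGB image into one edge-magnitude channel.

    Assumed stand-in for the paper's edge extractor, which is not specified
    in the abstract.
    """
    def __init__(self):
        super().__init__()
        gx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        self.register_buffer("kx", gx.view(1, 1, 3, 3))
        self.register_buffer("ky", gx.t().contiguous().view(1, 1, 3, 3))

    def forward(self, rgb):                       # rgb: (N, 3, H, W)
        gray = rgb.mean(dim=1, keepdim=True)      # simple luminance proxy
        ex = F.conv2d(gray, self.kx, padding=1)
        ey = F.conv2d(gray, self.ky, padding=1)
        return torch.sqrt(ex ** 2 + ey ** 2 + 1e-8)   # (N, 1, H, W)


class EdgeFusionBackbone(nn.Module):
    """Fuses raw RGB with its edge map, then extracts multiscale features."""
    def __init__(self):
        super().__init__()
        self.edge = SobelEdge()
        # Input has 3 RGB channels + 1 edge channel (early fusion).
        self.stem = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.block1 = nn.Sequential(
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        self.block2 = nn.Sequential(
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True))

    def forward(self, rgb):
        x = torch.cat([rgb, self.edge(rgb)], dim=1)   # edge/raw early fusion
        f1 = self.stem(x)                             # finest scale
        f2 = self.block1(f1)                          # mid scale
        f3 = self.block2(f2)                          # coarsest scale
        # Multiscale fusion: upsample coarse maps to the finest resolution and
        # concatenate, so a detection head sees both detail and context.
        up2 = F.interpolate(f2, size=f1.shape[-2:], mode="bilinear",
                            align_corners=False)
        up3 = F.interpolate(f3, size=f1.shape[-2:], mode="bilinear",
                            align_corners=False)
        return torch.cat([f1, up2, up3], dim=1)       # (N, 224, H/2, W/2)


if __name__ == "__main__":
    feats = EdgeFusionBackbone()(torch.rand(1, 3, 224, 224))
    print(feats.shape)   # torch.Size([1, 224, 112, 112])
```

The optical-flow branch the abstract mentions could be fused the same way, by concatenating flow-derived feature maps alongside the multiscale features before the detection head.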