MOFISSLAM: A Multi-Object Semantic SLAM System With Front-View, Inertial, and Surround-View Sensors for Indoor Parking

The semantic SLAM (Simultaneous Localization And Mapping) system is a crucial module for autonomous indoor parking. Visual cameras (monocular/binocular) and an IMU (Inertial Measurement Unit) constitute the basic sensor configuration for building such a system. The performance of existing SLAM systems typically deteriorates in the presence of dynamically movable objects or objects with little texture. By contrast, semantic objects on the ground embody the most salient and stable features in the indoor parking environment. Because they cannot perceive such features on the ground, existing SLAM systems are prone to tracking inconsistency during navigation. In this paper, we present MOFISSLAM, a novel tightly-coupled Multi-Object semantic SLAM system integrating Front-view, Inertial, and Surround-view sensors for autonomous indoor parking. The proposed system moves beyond existing semantic SLAM systems by complementing the sensor configuration with a surround-view system that captures images from a top-down viewpoint. In MOFISSLAM, apart from low-level visual features and inertial motion data, typical semantic objects (parking-slots, parking-slot IDs, and speed bumps) detected in surround-view images are also incorporated into the optimization, forming robust surround-view constraints. Specifically, each surround-view feature imposes a surround-view constraint that can be split into a contact term and a registration term. The former constrains the position of each individual surround-view feature according to whether it is in semantic contact with other surround-view features; three contact modes, defined as complementary, adjacent, and coincident, are identified to guarantee a unified form for all contact terms. The latter imposes a further constraint by registering each surround-view observation against the feature's position in the world coordinate system. In parallel, to enable objective evaluation of SLAM studies for autonomous indoor parking, we collect a large-scale dataset with ground-truth trajectories, the first of its kind. The ground-truth trajectories, which are commonly unavailable, are obtained by tracking artificial features scattered throughout the indoor parking environment, whose 3D coordinates are measured with an ETS (Electronic Total Station). The collected dataset has been made publicly available at https://shaoxuan92.github.io/MOFIS.
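
As a rough sketch (not taken from the paper itself), the tightly-coupled optimization described above can be pictured as minimizing a joint cost over front-view visual residuals, inertial residuals, and the surround-view constraints, where each surround-view constraint contributes a contact term and a registration term; the residual functions $r_{(\cdot)}$ and information matrices $\Lambda_{(\cdot)}$ below are generic placeholders, not the authors' notation:

$$\min_{\mathcal{X}} \; \sum_{i}\big\|r^{(i)}_{\mathrm{vis}}(\mathcal{X})\big\|^{2}_{\Lambda_{\mathrm{vis}}} \;+\; \sum_{j}\big\|r^{(j)}_{\mathrm{imu}}(\mathcal{X})\big\|^{2}_{\Lambda_{\mathrm{imu}}} \;+\; \sum_{k}\Big(\big\|r^{(k)}_{\mathrm{contact}}(\mathcal{X})\big\|^{2}_{\Lambda_{\mathrm{c}}} + \big\|r^{(k)}_{\mathrm{reg}}(\mathcal{X})\big\|^{2}_{\Lambda_{\mathrm{r}}}\Big)$$

Here $\mathcal{X}$ collects the vehicle states and surround-view feature positions, $r^{(k)}_{\mathrm{contact}}$ encodes one of the three contact modes (complementary, adjacent, or coincident) for the $k$-th surround-view feature, and $r^{(k)}_{\mathrm{reg}}$ measures the discrepancy between the $k$-th surround-view observation and that feature's position in the world coordinate system.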

Keywords: view; slam; surround view

Journal Title: IEEE Transactions on Circuits and Systems for Video Technology
Year Published: 2022
