Detailed Information

Cited 4 times in Web of Science; cited 5 times in Scopus

Scale-Aware Visual-Inertial Depth Estimation and Odometry Using Monocular Self-Supervised Learning (Open Access)

Authors
Lee, Chungkeun; Kim, Changhyeon; Kim, Pyojin; Lee, Hyeonbeom; Kim, H. Jin
Issue Date
Mar-2023
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Keywords
Odometry; Deep learning; Loss measurement; Depth measurement; Cameras; Self-supervised learning; Coordinate measuring machines; monocular depth estimation; self-supervised learning; visual-inertial odometry
Citation
IEEE Access, v.11, pp. 24087-24102
Pages
16
Journal Title
IEEE Access
Volume
11
Start Page
24087
End Page
24102
URI
https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/152012
DOI
10.1109/ACCESS.2023.3252884
ISSN
2169-3536
Abstract
For real-world applications with a single monocular camera, scale ambiguity is an important issue. Self-supervised data-driven approaches that require no additional data containing scale information cannot avoid this ambiguity, so state-of-the-art deep-learning-based methods address it by learning the scale from additional sensor measurements. In that regard, the inertial measurement unit (IMU) is a popular sensor for various mobile platforms because it is lightweight and inexpensive. However, unlike supervised learning, which can learn the scale from ground-truth information, learning the scale from an IMU is challenging in a self-supervised setting. We propose a scale-aware monocular visual-inertial depth estimation and odometry method with end-to-end training. To learn the scale from IMU measurements with end-to-end training in the monocular self-supervised setup, we propose a new loss function, the preintegration loss, which trains scale-aware ego-motion by comparing the ego-motion integrated from IMU measurements with the predicted ego-motion. Since gravity and the sensor bias must be compensated to obtain the ego-motion by integrating IMU measurements, we design a network that predicts the gravity and the bias in addition to the ego-motion and the depth map. The overall performance of the proposed method is compared with state-of-the-art methods on the popular outdoor driving dataset (KITTI) and an author-collected indoor driving dataset. On KITTI, the proposed method shows competitive performance against state-of-the-art monocular depth estimation and odometry methods: a root-mean-square error of 5.435 m on the KITTI Eigen split, and an absolute trajectory error of 22.46 m and 0.2975 degrees on the KITTI odometry 09 sequence. Unlike other up-to-scale monocular methods, the proposed method can estimate metric-scaled depth and camera poses. Additional experiments on the author-collected indoor driving dataset qualitatively confirm the accuracy of the metric depth and metric pose estimates.
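Note on the method (illustrative only): the preintegration loss described in the abstract compares the ego-motion obtained by integrating bias- and gravity-compensated IMU measurements against the ego-motion predicted by the network; the translation term of such a comparison is what anchors metric scale in an otherwise up-to-scale monocular pipeline. The PyTorch sketch below shows one plausible form of this idea under standard IMU preintegration assumptions. All names here (so3_exp, preintegrate_imu, preintegration_loss) and the loss weights are our own illustrative assumptions, not the authors' implementation.

import torch

def so3_exp(phi):
    """Rodrigues' formula: axis-angle vector (3,) -> 3x3 rotation matrix."""
    theta = phi.norm().clamp(min=1e-8)
    k = phi / theta
    zero = torch.zeros((), dtype=phi.dtype)
    K = torch.stack([
        torch.stack([zero, -k[2], k[1]]),
        torch.stack([k[2], zero, -k[0]]),
        torch.stack([-k[1], k[0], zero]),
    ])
    I = torch.eye(3, dtype=phi.dtype)
    return I + torch.sin(theta) * K + (1.0 - torch.cos(theta)) * (K @ K)

def preintegrate_imu(accel, gyro, dt, gravity, accel_bias, gyro_bias):
    """Integrate raw IMU samples (accel, gyro: (N, 3)) taken between two
    camera frames into a relative rotation R and translation p.
    gravity, accel_bias, gyro_bias are the quantities the paper's network
    predicts so they can be compensated before integration."""
    R = torch.eye(3, dtype=accel.dtype)
    v = torch.zeros(3, dtype=accel.dtype)
    p = torch.zeros(3, dtype=accel.dtype)
    for a, w in zip(accel, gyro):
        a_world = R @ (a - accel_bias) - gravity  # bias/gravity compensation
        p = p + v * dt + 0.5 * a_world * dt ** 2
        v = v + a_world * dt
        R = R @ so3_exp((w - gyro_bias) * dt)     # first-order attitude update
    return R, p

def preintegration_loss(pred_R, pred_t, accel, gyro, dt,
                        gravity, accel_bias, gyro_bias,
                        w_rot=1.0, w_trans=1.0):
    """Illustrative sketch, not the authors' released code: penalize the
    disagreement between the predicted ego-motion (pred_R, pred_t) and the
    IMU-integrated ego-motion. The metric translation term is what supplies
    the scale signal to the self-supervised depth/pose networks."""
    imu_R, imu_t = preintegrate_imu(accel, gyro, dt,
                                    gravity, accel_bias, gyro_bias)
    rot_err = (pred_R.transpose(0, 1) @ imu_R - torch.eye(3)).norm()
    trans_err = (pred_t - imu_t).norm()
    return w_rot * rot_err + w_trans * trans_err

In training, a loss of this kind would be added to the usual self-supervised photometric reconstruction loss, so that the depth and pose predictions are jointly pushed toward metric scale.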
Appears in Collections
Engineering > Department of Mechanical Systems Engineering > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
