Detailed Information

CNN-based Fast Split Mode Decision Algorithm for Versatile Video Coding (VVC) Inter Prediction

Authors
Woon-Ha Yeo; Byung-Gyu Kim
Issue Date
Sep-2021
Publisher
Korea Multimedia Society (한국멀티미디어학회)
Keywords
Versatile Video Coding (VVC); Inter Prediction; Fast algorithm; Convolutional Neural Network (CNN); Deep learning.
Citation
Journal of Multimedia Information System, vol. 8, no. 3, pp. 147-158
Pages
12
Journal Title
Journal of Multimedia Information System
Volume
8
Number
3
Start Page
147
End Page
158
URI
https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/146156
DOI
10.33851/JMIS.2021.8.3.147
ISSN
2383-7632
Abstract
Versatile Video Coding (VVC) is the latest video coding standard, developed by the Joint Video Experts Team (JVET). VVC adopts the quadtree plus multi-type tree (QT+MTT) structure for coding unit (CU) partitioning, and its computational complexity is considerably high because of the brute-force recursive search used for rate-distortion (RD) optimization. In this paper, we aim to reduce the time complexity of inter-picture prediction, since inter prediction accounts for a large portion of the total encoding time. The problem can be defined as classifying the split mode of each CU. To classify the split mode effectively, a novel convolutional neural network (CNN) architecture called multi-level tree CNN (MLT-CNN) is introduced. To boost classification performance, we utilize additional information, including inter-picture information, while training the CNN. The overall algorithm, including the MLT-CNN inference process, is implemented on VVC Test Model (VTM) 11.0. CUs of size 128×128 can be used as inputs to the CNN. The sequences are encoded in the random access (RA) configuration with five quantization parameter (QP) values {22, 27, 32, 37, 42}. The experimental results show that the proposed algorithm reduces the computational complexity by 11.53% on average and by up to 26.14%, with an average increase of 1.01% in Bjøntegaard delta bit rate (BDBR). In particular, the proposed method performs better on the class A and B sequences, reducing encoding time by 9.81% to 26.14% with a BDBR increase of 0.95% to 3.28%.
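The abstract frames fast partitioning as classifying the split mode of each CU. The Python sketch below illustrates that framing only; it is not the authors' MLT-CNN or their decision rule. The six split modes are the standard QT+MTT options in VVC. The probability-threshold pruning, the function names `modes_to_search` and `time_saving`, the default threshold, and the time-saving formula are illustrative assumptions that do not come from the paper.

```python
from enum import Enum


class SplitMode(Enum):
    """The six CU split options of the VVC QT+MTT partitioning structure."""
    NO_SPLIT = 0
    QT = 1      # quadtree split
    BT_H = 2    # binary-tree split, horizontal
    BT_V = 3    # binary-tree split, vertical
    TT_H = 4    # ternary-tree split, horizontal
    TT_V = 5    # ternary-tree split, vertical


def modes_to_search(probs, threshold=0.1):
    """Return the split modes an encoder would still evaluate with RD search.

    `probs` holds one classifier score per SplitMode, in enum order.
    Hypothetical rule: keep every mode scoring at least `threshold`,
    and always keep the top-scoring mode so the list is never empty.
    """
    if len(probs) != len(SplitMode):
        raise ValueError("expected one probability per split mode")
    best = max(range(len(probs)), key=probs.__getitem__)
    return [SplitMode(i) for i, p in enumerate(probs)
            if p >= threshold or i == best]


def time_saving(t_anchor, t_proposed):
    """Encoding-time reduction in percent relative to the anchor encoder.

    Assumed definition: 100 * (T_anchor - T_proposed) / T_anchor.
    """
    return 100.0 * (t_anchor - t_proposed) / t_anchor
```

With hypothetical scores `[0.05, 0.6, 0.2, 0.05, 0.05, 0.05]`, only the quadtree and horizontal binary splits would be passed to the RD search, which is where a scheme of this kind would save encoding time.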
Related Researcher

Kim, Byung Gyu
College of Engineering (Division of Artificial Intelligence Engineering)