Deep Transformer Based Video Inpainting Using Fast Fourier Tokenization
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Taewan | - |
dc.contributor.author | Kim, Jinwoo | - |
dc.contributor.author | Oh, Heeseok | - |
dc.contributor.author | Kang, Jiwoo | - |
dc.date.accessioned | 2024-04-09T02:30:31Z | - |
dc.date.available | 2024-04-09T02:30:31Z | - |
dc.date.issued | 2024-02 | - |
dc.identifier.issn | 2169-3536 | - |
dc.identifier.uri | https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/159847 | - |
dc.description.abstract | Bridging distant space-time interactions is important for high-quality video inpainting with large moving masks. Most existing techniques exploit patch similarities within the frames or leverage large-scale training data to fill the hole along the spatial and temporal dimensions. Recent works introduce the promising Transformer architecture into deep video inpainting to escape the dominance of nearby interactions and achieve superior performance over their baselines. However, such methods still struggle to complete larger holes containing complicated scenes. To alleviate this issue, we first employ fast Fourier convolutions, which cover the frame-wide receptive field, for token representation. Then, the tokens pass through a separated spatio-temporal transformer to explicitly model the long-range context relations and simultaneously complete the missing regions in all input frames. By formulating video inpainting as a directionless sequence-to-sequence prediction task, our model fills in visually consistent content, even under conditions such as large missing areas or complex geometries. Furthermore, our spatio-temporal transformer iteratively fills the hole from the boundary, enabling it to exploit rich contextual information. We validate the superiority of the proposed model using standard stationary masks and more realistic moving object masks. Both qualitative and quantitative results show that our model compares favorably against state-of-the-art algorithms. | - |
dc.format.extent | 14 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | Deep Transformer Based Video Inpainting Using Fast Fourier Tokenization | - |
dc.type | Article | - |
dc.publisher.location | United States | - |
dc.identifier.doi | 10.1109/ACCESS.2024.3361283 | - |
dc.identifier.scopusid | 2-s2.0-85184326166 | - |
dc.identifier.wosid | 001163607000001 | - |
dc.identifier.bibliographicCitation | IEEE ACCESS, v.12, pp 21723 - 21736 | - |
dc.citation.title | IEEE ACCESS | - |
dc.citation.volume | 12 | - |
dc.citation.startPage | 21723 | - |
dc.citation.endPage | 21736 | - |
dc.type.docType | Article | - |
dc.description.isOpenAccess | Y | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Telecommunications | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Telecommunications | - |
dc.subject.keywordAuthor | video completion | - |
dc.subject.keywordAuthor | free-form inpainting | - |
dc.subject.keywordAuthor | object removal | - |
dc.subject.keywordAuthor | adversarial learning | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/10418237/ | - |
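The abstract's key building block is the fast Fourier convolution used for token representation: a pointwise transform applied in the frequency domain, which gives every output location a frame-wide receptive field in a single layer. The following is a minimal numpy sketch of that spectral-transform idea only, not the authors' implementation; the function name `fourier_unit` and the channel-mixing weight shapes are illustrative assumptions.

```python
import numpy as np

def fourier_unit(x, w_real, w_imag):
    """Pointwise linear map applied in the frequency domain.

    x       : (C_in, H, W) feature map.
    w_real,
    w_imag  : (C_out, C_in) real/imaginary parts of a complex
              1x1 channel-mixing weight (illustrative shapes).

    Because every spatial location contributes to every frequency
    bin, one such layer already has a frame-wide receptive field --
    the property the paper's fast Fourier tokenization relies on.
    """
    C, H, W = x.shape
    spec = np.fft.rfft2(x, axes=(-2, -1))  # (C_in, H, W//2 + 1), complex
    # Complex matrix-vector product over the channel axis:
    # (wr + i*wi) @ (sr + i*si) = (wr@sr - wi@si) + i*(wr@si + wi@sr)
    real = np.tensordot(w_real, spec.real, axes=1) - np.tensordot(w_imag, spec.imag, axes=1)
    imag = np.tensordot(w_real, spec.imag, axes=1) + np.tensordot(w_imag, spec.real, axes=1)
    # Back to the spatial domain with the original resolution.
    return np.fft.irfft2(real + 1j * imag, s=(H, W), axes=(-2, -1))
```

With an identity weight (`w_real = I`, `w_imag = 0`) the layer reduces to an FFT round trip and returns the input unchanged, which is a convenient sanity check; learned weights would instead mix channels globally across the whole frame.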