DeepFake detection algorithm based on improved vision transformer
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Heo, Young-Jin | - |
dc.contributor.author | Yeo, Woon-Ha | - |
dc.contributor.author | Kim, Byung-Gyu | - |
dc.date.accessioned | 2023-11-08T06:45:51Z | - |
dc.date.available | 2023-11-08T06:45:51Z | - |
dc.date.issued | 2023-04-01 | - |
dc.identifier.issn | 0924-669X | - |
dc.identifier.issn | 1573-7497 | - |
dc.identifier.uri | https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/151897 | - |
dc.description.abstract | A DeepFake is a manipulated video made with generative deep learning technologies, such as generative adversarial networks or autoencoders, that anyone can use. With the increase in DeepFakes, classifiers based on convolutional neural networks (CNNs) that can distinguish them have been actively developed. However, CNNs suffer from overfitting and cannot model the relations between local regions as a global feature of the image, which results in misclassification. In this paper, we propose an efficient vision transformer model for DeepFake detection that extracts both local and global features. We combine a vector-concatenated CNN feature with patch-based positional embedding so that all positions interact to localize the artifact region. For the distillation token, the logit is trained with binary cross entropy through the sigmoid function. This distillation generalizes the proposed model and improves its performance. In experiments, the proposed model outperforms the SOTA model by 0.006 AUC and 0.013 F1 score on the DFDC test dataset. Of 2,500 fake videos, the proposed model correctly predicts 2,313 as fake, whereas the SOTA model predicts at most 2,276. With an ensemble method, the proposed model outperforms the SOTA model by 0.01 AUC. On the Celeb-DF (v2) dataset, the proposed model achieves 0.993 AUC and a 0.978 F1 score. | - |
dc.format.extent | 16 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | SPRINGER | - |
dc.title | DeepFake detection algorithm based on improved vision transformer | - |
dc.type | Article | - |
dc.publisher.location | Netherlands | - |
dc.identifier.doi | 10.1007/s10489-022-03867-9 | - |
dc.identifier.scopusid | 2-s2.0-85134994902 | - |
dc.identifier.wosid | 000830340500005 | - |
dc.identifier.bibliographicCitation | APPLIED INTELLIGENCE, v.53, no.7, pp. 7512-7527 | - |
dc.citation.title | APPLIED INTELLIGENCE | - |
dc.citation.volume | 53 | - |
dc.citation.number | 7 | - |
dc.citation.startPage | 7512 | - |
dc.citation.endPage | 7527 | - |
dc.type.docType | Article; Early Access | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Artificial Intelligence | - |
dc.subject.keywordAuthor | Deep learning | - |
dc.subject.keywordAuthor | Deepfake detection | - |
dc.subject.keywordAuthor | Distillation | - |
dc.subject.keywordAuthor | Generative adversarial network | - |
dc.subject.keywordAuthor | Vision transformer | - |
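The abstract describes a token layout (a CNN feature vector concatenated with patch embeddings and a distillation token) and a distillation logit trained with binary cross entropy through a sigmoid. A minimal NumPy sketch of those two ideas follows; all dimensions, token ordering, and the random stand-in features are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions): 14x14 patches of a 224x224 face crop.
num_patches, embed_dim = 196, 768

# Stand-ins for real backbone outputs: ViT patch tokens and one global
# CNN feature vector to be concatenated with them.
patch_tokens = rng.standard_normal((num_patches, embed_dim))
cnn_feature = rng.standard_normal((1, embed_dim))

# Special tokens: a class token and the distillation token from the abstract.
cls_token = np.zeros((1, embed_dim))
dist_token = np.zeros((1, embed_dim))

# Concatenate all tokens and add positional embeddings so every position
# (including the CNN feature) can interact through self-attention.
tokens = np.concatenate([cls_token, dist_token, cnn_feature, patch_tokens], axis=0)
pos_embed = rng.standard_normal(tokens.shape) * 0.02
tokens = tokens + pos_embed

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(logit, label):
    # Binary cross entropy on a sigmoid probability, as the abstract
    # describes for the distillation-token logit.
    p = sigmoid(logit)
    eps = 1e-12
    return -(label * np.log(p + eps) + (1 - label) * np.log(1 - p + eps))

# Hypothetical distillation head: project the distillation token to a logit.
logit = float(tokens[1] @ rng.standard_normal(embed_dim)) / np.sqrt(embed_dim)
loss = bce_loss(logit, label=1.0)  # label 1 = fake (an assumption)

print(tokens.shape)  # (199, 768): cls + dist + CNN feature + 196 patches
```

The sequence length is 199 because the class token, distillation token, and CNN feature each occupy one position ahead of the 196 patch tokens; in the real model, a transformer encoder would attend over this sequence before the heads are applied.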
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.
Certain data included herein are derived from the © Web of Science of Clarivate Analytics. All rights reserved.
You may not copy or re-distribute this material in whole or in part without the prior written consent of Clarivate Analytics.