Skip-Gram-KR: Korean Word Embedding for Semantic Clustering
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ihm, Sun-Young | - |
dc.contributor.author | Lee, Ji-Hye | - |
dc.contributor.author | Park, Young-Ho | - |
dc.date.available | 2021-02-22T06:45:46Z | - |
dc.date.issued | 2019-03 | - |
dc.identifier.issn | 2169-3536 | - |
dc.identifier.uri | https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/3739 | - |
dc.description.abstract | Deep learning algorithms are used in applications such as pattern recognition, natural language processing, and speech recognition. Recent neural network-based natural language processing techniques rely on fixed-length word embeddings. Word embedding maps a word at a specific position to a low-dimensional, fixed-length dense vector while preserving the similarity of the distribution of its surrounding words. Word embedding methods designed for other languages are currently applied to Korean words; however, these methods were originally developed for English, so they do not reflect the order and structure of Korean words. In this paper, we propose a word embedding method for Korean, called Skip-gram-KR, and a Korean affix tokenizer. Skip-gram-KR creates similar-word training data through backward mapping and a two-word skipping method. The experimental results show that the proposed method achieved the most accurate performance. | - |
dc.format.extent | 14 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | Skip-Gram-KR: Korean Word Embedding for Semantic Clustering | - |
dc.type | Article | - |
dc.publisher.location | United States | - |
dc.identifier.doi | 10.1109/ACCESS.2019.2905252 | - |
dc.identifier.scopusid | 2-s2.0-85065244586 | - |
dc.identifier.wosid | 000463942200001 | - |
dc.identifier.bibliographicCitation | IEEE ACCESS, v.7, pp 39948 - 39961 | - |
dc.citation.title | IEEE ACCESS | - |
dc.citation.volume | 7 | - |
dc.citation.startPage | 39948 | - |
dc.citation.endPage | 39961 | - |
dc.type.docType | Article | - |
dc.description.isOpenAccess | Y | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalResearchArea | Engineering | - |
dc.relation.journalResearchArea | Telecommunications | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Engineering, Electrical & Electronic | - |
dc.relation.journalWebOfScienceCategory | Telecommunications | - |
dc.subject.keywordAuthor | Word embedding | - |
dc.subject.keywordAuthor | natural language processing | - |
dc.subject.keywordAuthor | Korean word embedding | - |
dc.subject.keywordAuthor | text mining | - |
dc.subject.keywordAuthor | deep learning | - |
dc.subject.keywordAuthor | semantic clustering | - |
dc.subject.keywordAuthor | machine learning | - |
dc.identifier.url | https://ieeexplore.ieee.org/document/8667838 | - |
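
The abstract describes word embedding as preserving the distribution of a word's surrounding words. As a rough illustration of that idea only, the sketch below builds (center, context) training pairs in the style of the standard skip-gram model. It is not the Skip-gram-KR procedure: the backward mapping, two-word skipping, and affix tokenizer named in the abstract are not specified in this record and are not reproduced here. The function name `skipgram_pairs`, the `window` parameter, and the toy whitespace-split sentence are all assumptions made for illustration.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs for standard skip-gram training.

    Assumption: `tokens` is an already tokenized sentence. The paper's
    Korean affix tokenizer is not modeled here.
    """
    pairs: List[Tuple[str, str]] = []
    for i, center in enumerate(tokens):
        # Clamp the context window to the sentence boundaries.
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical toy sentence, split on whitespace only.
tokens = ["나는", "학교에", "간다"]
pairs = skipgram_pairs(tokens, window=1)
for center, context in pairs:
    print(center, "->", context)
```

Pairs like these would typically be fed to a shallow neural network that learns one dense vector per word; how Skip-gram-KR alters the pair generation for Korean word order is described in the paper itself, not in this record.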