Detail View
- 이채민
- 곽수영
- 조선영
Abstract
Multimodal Emotion Recognition (MER) achieves precise emotion prediction by leveraging diverse modalities such as audio, visual, and textual cues. However, in real-world scenarios, missing modalities frequently occur due to sensor malfunctions or privacy concerns, leading to significant degradation in model performance. Although various approaches have attempted to address this issue by generating missing data or reconstructing latent features from available modalities, most rely on simple statistical mappings or numerical approximations. Consequently, they often fail to capture linguistic contexts or complex semantic interactions between modalities. In this paper, we propose a feature reconstruction framework that leverages the powerful reasoning capabilities of Large Language Models (LLMs). The proposed model utilizes an LLM as a feature-space encoder to semantically complement missing modalities. Furthermore, it employs a parallel cross-attention mechanism to effectively fuse information across different modalities. Extensive experiments demonstrate the validity and effectiveness of our proposed method under incomplete data conditions.
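Of the two mechanisms the abstract names, the parallel cross-attention fusion can be illustrated with a short sketch. The PyTorch code below is a minimal, hypothetical rendering, not the authors' published implementation: each modality stream queries every other modality in parallel, and the per-pair contexts are averaged into a residual update. All names and hyperparameters (`ParallelCrossAttention`, `d_model`, `n_heads`) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ParallelCrossAttention(nn.Module):
    """Hypothetical parallel cross-attention fusion: each modality
    queries every other modality simultaneously; the resulting
    contexts are averaged and added back as a residual update."""

    def __init__(self, d_model: int = 256, n_heads: int = 4, n_modalities: int = 3):
        super().__init__()
        self.n = n_modalities
        # One cross-attention block per ordered (query, key/value) modality pair.
        self.attn = nn.ModuleDict({
            f"{q}_{kv}": nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            for q in range(n_modalities)
            for kv in range(n_modalities)
            if q != kv
        })

    def forward(self, feats: list[torch.Tensor]) -> list[torch.Tensor]:
        # feats: one (batch, seq_len, d_model) tensor per modality,
        # e.g. [audio, visual, text]; a reconstructed feature would
        # stand in for any missing modality before fusion.
        fused = []
        for q in range(self.n):
            contexts = [
                self.attn[f"{q}_{kv}"](feats[q], feats[kv], feats[kv])[0]
                for kv in range(self.n) if kv != q
            ]
            fused.append(feats[q] + torch.stack(contexts).mean(dim=0))
        return fused

# Usage with random stand-in features for audio, visual, and text streams.
audio, visual, text = (torch.randn(8, 20, 256) for _ in range(3))
fused_audio, fused_visual, fused_text = ParallelCrossAttention()([audio, visual, text])
```

In the paper's setting, a missing stream's features would presumably be produced by the LLM-based reconstruction stage before entering this fusion step; that stage is not sketched here because the abstract does not specify its form.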
Keywords
- Title (Korean): 결측 모달리티를 가진 멀티모달 감정인식을 위한 LLM 기반 의미적 특징 복원
- Title (English): LLM-Guided Semantic Feature Reconstruction for Multimodal Emotion Recognition with Missing Modalities
- Authors: 이채민; 곽수영; 조선영
- Publication Date: 2026-03
- Type: Y
- Journal: 전기전자학회논문지
- Volume: 30
- Issue: 1
- Pages: 114-125