Log2Plan: An Adaptive GUI Automation Framework Integrated with Task Mining Approach
  • Lee, Seoyoung
  • Yoon, Seobin
  • Lee, Seongbeen
  • Kim, Hyesoo
  • Sim, Joo-yong

Abstract

GUI task automation streamlines repetitive tasks, but existing LLM- or VLM-based planner-executor agents suffer from brittle generalization, high latency, and limited long-horizon coherence. Their reliance on single-shot reasoning or static plans makes them fragile under UI changes or complex tasks. Log2Plan addresses these limitations by combining a structured two-level planning framework with task mining over user behavior logs, enabling robust and adaptable GUI automation. It constructs high-level plans by mapping user commands to a structured task dictionary, which keeps automation consistent and generalizable. To support personalization and reuse, it mines user behavior logs to identify user-specific patterns. The high-level plans are then grounded into low-level action sequences by interpreting the real-time GUI context, ensuring robust execution across varying interfaces. We evaluated Log2Plan on 200 real-world tasks, demonstrating significant improvements in task success rate and execution time. Notably, it maintains a success rate above 60.0% even on long-horizon task sequences, highlighting its robustness in complex, multi-step workflows. © 2025 Copyright is held by the owner/author(s). Publication rights licensed to ACM.
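
The two-level idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the task dictionary contents, the keyword lookup, the `Action` type, and the function names `plan_high_level` and `ground` are all hypothetical, chosen only to show a command being mapped to a high-level plan and then grounded against the current GUI context.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical task dictionary: task name -> ordered high-level steps.
# Each step is written as "kind:target" purely for illustration.
TASK_DICTIONARY: Dict[str, List[str]] = {
    "send_email": ["click:New Message", "type:Body", "click:Send"],
    "save_file": ["click:File", "click:Save As", "type:Filename", "click:Save"],
}

# Hypothetical keyword index mapping words in a command to a task name.
KEYWORDS: Dict[str, str] = {"email": "send_email", "save": "save_file"}


@dataclass
class Action:
    kind: str                  # e.g. "click" or "type"
    target: str                # label named by the high-level step
    element_id: Optional[str]  # GUI element found at run time, if any


def plan_high_level(command: str) -> List[str]:
    """Map a user command to high-level steps via the task dictionary."""
    for word in command.lower().split():
        task = KEYWORDS.get(word)
        if task is not None:
            return list(TASK_DICTIONARY[task])
    return []  # no matching task in the dictionary


def ground(steps: List[str], gui_context: Dict[str, str]) -> List[Action]:
    """Ground each high-level step in the elements currently on screen."""
    actions = []
    for step in steps:
        kind, target = step.split(":", 1)
        # element_id stays None when the target is not visible right now.
        actions.append(Action(kind, target, gui_context.get(target)))
    return actions


# Assumed snapshot of the visible GUI: label -> element identifier.
gui_context = {"File": "menu-1", "Save As": "item-7", "Save": "btn-3"}
steps = plan_high_level("Please save this report")
actions = ground(steps, gui_context)
```

Keeping the high-level steps free of element identifiers is what lets the same plan be reused when the interface changes; only the grounding step depends on the live GUI state.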

Keywords

GUI automation; Large Language Models; Task Mining; Two-level Planning
Title
Log2Plan: An Adaptive GUI Automation Framework Integrated with Task Mining Approach
Authors
Lee, Seoyoung; Yoon, Seobin; Lee, Seongbeen; Kim, Hyesoo; Sim, Joo-yong
DOI
10.1145/3746059.3747663
Publication date
2025-09
Type
Conference paper
Journal
UIST 2025 - Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology