정보 검색 기술을 이용한 대규모 이질적인 XML 문서에 대한 효율적인 선형 경로 질의 처리
Efficient Linear Path Query Processing using Information Retrieval Techniques for Large-Scale Heterogeneous XML Documents

Abstract

We propose XIR-Linear, a novel method for processing partial match queries on large-scale heterogeneous XML documents using information retrieval (IR) techniques. XPath queries are written as path expressions over a tree structure representing an XML document. In its most common form, an XPath query is a partial match query. The objective of XIR-Linear is to efficiently support this type of query for large-scale documents with heterogeneous schemas. XIR-Linear builds on the schema-level methods that use relational tables and drastically improves their efficiency and scalability by using an inverted index technique. The method indexes the labels in label paths as if they were keywords in text, which allows it to find the label paths that match a query far more efficiently than the string matching used in conventional methods. We demonstrate the efficiency and scalability of XIR-Linear by comparing it with XRel and XParent using XML documents crawled from the Internet. The results show that, as the number of XML documents increases, XIR-Linear is more efficient than both XRel and XParent by several orders of magnitude for linear path expressions.
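
The idea of indexing labels in label paths as keywords can be illustrated with a minimal Python sketch. It is not the authors' implementation (the abstract says XIR-Linear is built on relational tables). The function names `build_index`, `parse_query`, and `match`, the sample label paths, and the rule that a query must match through to the last label of a path are all assumptions made for illustration.

```python
from collections import defaultdict
import re


def build_index(label_paths):
    """Inverted index: label -> {path_id: [1-based positions in that path]}."""
    index = defaultdict(dict)
    for pid, path in enumerate(label_paths):
        for pos, label in enumerate(path.strip("/").split("/"), start=1):
            index[label].setdefault(pid, []).append(pos)
    return index


def parse_query(query):
    """Split '/a//b/c' into [('/', 'a'), ('//', 'b'), ('/', 'c')]."""
    return re.findall(r"(//|/)([^/]+)", query)


def match(index, label_paths, query):
    """Return the label paths matched by a linear path expression."""
    steps = parse_query(query)
    if not steps:
        return []
    # Keyword-style filtering: a candidate path must contain every query label.
    postings = [index.get(label, {}) for _, label in steps]
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)

    results = []
    for pid in sorted(candidates):
        # Positions at which the first query step can match.
        axis, label = steps[0]
        current = {p for p in index[label][pid] if axis == "//" or p == 1}
        # Check order and adjacency of the remaining steps via positions.
        for axis, label in steps[1:]:
            nxt = set()
            for p in index[label][pid]:
                if axis == "/" and (p - 1) in current:
                    nxt.add(p)
                elif axis == "//" and any(q < p for q in current):
                    nxt.add(p)
            current = nxt
        # Assumption: the last query label must be the last label of the path.
        depth = len(label_paths[pid].strip("/").split("/"))
        if depth in current:
            results.append(label_paths[pid])
    return results


# Hypothetical label paths, as might be collected from heterogeneous documents.
paths = [
    "/site/regions/asia/item",
    "/site/regions/asia/item/name",
    "/site/people/person/name",
    "/catalog/item/name",
]
idx = build_index(paths)
print(match(idx, paths, "//item/name"))
```

In this sketch, intersecting the posting lists narrows the search to paths that contain every query label, so only those few candidates need a positional check; a string-matching approach would instead compare the query pattern against every stored label path.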

Title
정보 검색 기술을 이용한 대규모 이질적인 XML 문서에 대한 효율적인 선형 경로 질의 처리
Title (other language)
Efficient Linear Path Query Processing using Information Retrieval Techniques for Large-Scale Heterogeneous XML Documents
Authors
한욱신, 황규영, 박영호
Publication date
2004-10
Journal
정보과학회논문지 : 데이타베이스
Volume 31
Issue 5
Pages
540 ~ 552