- 한욱신;
- 황규영;
- 박영호
Abstract
We propose XIR-Linear, a novel method for processing partial match queries on large-scale heterogeneous XML documents using information retrieval (IR) techniques. XPath queries are written as path expressions over a tree structure representing an XML document, and the major form of an XPath query is a partial match query. The objective of XIR-Linear is to support this type of query efficiently for large collections of documents with heterogeneous schemas. XIR-Linear is based on the schema-level methods that use relational tables, and it drastically improves their efficiency and scalability by using an inverted index technique. The method indexes the labels in label paths in the same way as keywords in text, which allows it to find the label paths that match a query far more efficiently than the string matching used in conventional methods. We demonstrate the efficiency and scalability of XIR-Linear by comparing it with XRel and XParent on XML documents crawled from the Internet. The results show that, for linear path expressions, XIR-Linear is more efficient than both XRel and XParent by several orders of magnitude as the number of XML documents increases.
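The idea of indexing labels in label paths as keywords can be illustrated with a minimal Python sketch. This is not the XIR-Linear implementation from the paper: the function names (`build_index`, `parse_query`, `match_paths`), the in-memory dictionaries, and the sample paths are all assumptions made for illustration, and the paper's relational-table storage is not modeled. The sketch assumes a linear path expression using only the child (`/`) and descendant (`//`) axes, and assumes a label path matches when the last query label is also the last label of the path.

```python
import re
from collections import defaultdict


def build_index(label_paths):
    """Build an inverted index: label -> {path_id: [positions in that path]}."""
    index = defaultdict(lambda: defaultdict(list))
    for path_id, path in enumerate(label_paths):
        for pos, label in enumerate(path.strip("/").split("/")):
            index[label][path_id].append(pos)
    return index


def parse_query(query):
    """Split '//a/b//c' into [('//', 'a'), ('/', 'b'), ('//', 'c')]."""
    return re.findall(r"(//|/)([^/]+)", query)


def match_paths(index, label_paths, query):
    """Return the label paths matched by a linear path expression (hypothetical helper)."""
    steps = parse_query(query)
    if not steps:
        return []
    # candidates: path_id -> positions where the query prefix seen so far can end
    candidates = None
    for axis, label in steps:
        postings = index.get(label, {})
        survivors = {}
        for path_id, positions in postings.items():
            if candidates is None:
                # First step: '/' anchors at the root, '//' may start anywhere.
                ok = {p for p in positions if axis == "//" or p == 0}
            else:
                prev = candidates.get(path_id)
                if not prev:
                    continue
                if axis == "/":
                    # Child axis: label must directly follow a previous match.
                    ok = {p for p in positions if p - 1 in prev}
                else:
                    # Descendant axis: label must occur after some previous match.
                    first = min(prev)
                    ok = {p for p in positions if p > first}
            if ok:
                survivors[path_id] = ok
        candidates = survivors
    # Keep only paths whose final label is matched by the final query step.
    matched = []
    for path_id, ends in candidates.items():
        depth = len(label_paths[path_id].strip("/").split("/"))
        if depth - 1 in ends:
            matched.append(label_paths[path_id])
    return sorted(matched)


if __name__ == "__main__":
    paths = [
        "/book/chapter/section/title",
        "/book/title",
        "/article/section/title",
        "/book/chapter/title/footnote",
    ]
    idx = build_index(paths)
    print(match_paths(idx, paths, "//section/title"))
```

Each query step only touches the posting lists of its own label, so label paths that do not contain a query label are never examined. A string-match approach would instead compare the query pattern against every stored path string.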
- Title
- 정보 검색 기술을 이용한 대규모 이질적인 XML 문서에 대한 효율적인 선형 경로 질의 처리
- Title (other language)
- Efficient Linear Path Query Processing using Information Retrieval Techniques for Large-Scale Heterogeneous XML Documents
- Authors
- 한욱신; 황규영; 박영호
- Publication date
- 2004-10
- Journal
- 정보과학회논문지 : 데이타베이스
- Volume
- 31
- Issue
- 5
- Pages
- 540 ~ 552