Efficient Evaluation of Partial Match Queries for XML Documents Using Information Retrieval Techniques
DC Field | Value | Language |
---|---|---|
dc.contributor.author | 박영호 | - |
dc.contributor.author | Kyu-Young Whang | - |
dc.contributor.author | Byung Suk Lee | - |
dc.contributor.author | Wook-Shin Han | - |
dc.date.accessioned | 2022-04-19T11:43:19Z | - |
dc.date.available | 2022-04-19T11:43:19Z | - |
dc.date.issued | 2005-04 | - |
dc.identifier.issn | 0302-9743 | - |
dc.identifier.issn | 1611-3349 | - |
dc.identifier.uri | https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/148791 | - |
dc.description.abstract | We propose XIR, a novel method for processing partial match queries on heterogeneous XML documents using information retrieval (IR) techniques. A partial match query is defined as one having the descendant-or-self axis "//" in its path expression. In its general form, a partial match query has branch predicates forming branching paths. The objective of XIR is to efficiently support this type of query for large-scale documents with heterogeneous schemas. XIR builds on the conventional schema-level methods using relational tables and significantly improves their efficiency using two techniques: an inverted index technique and a novel prefix match join. The former indexes the labels in label paths as keywords in texts, and allows for finding the label paths matching the queries more efficiently than the string matching used in the conventional methods. The latter supports branching path expressions, and allows for finding the result nodes more efficiently than the containment joins used in the conventional methods. We compare the efficiency of XIR with that of XRel and XParent using XML documents crawled from the Internet. The results show that XIR is more efficient than both XRel and XParent by several orders of magnitude for linear path expressions, and by several factors for branching path expressions. | - |
dc.format.extent | 18 | - |
dc.language | English | - |
dc.language.iso | ENG | - |
dc.publisher | Springer Verlag | - |
dc.title | Efficient Evaluation of Partial Match Queries for XML Documents Using Information Retrieval Techniques | - |
dc.type | Article | - |
dc.publisher.location | United States | - |
dc.identifier.doi | 10.1007/11408079_11 | - |
dc.identifier.scopusid | 2-s2.0-24644434504 | - |
dc.identifier.wosid | 000229213600008 | - |
dc.identifier.bibliographicCitation | Lecture Notes in Computer Science, v.3453, pp 95 - 112 | - |
dc.citation.title | Lecture Notes in Computer Science | - |
dc.citation.volume | 3453 | - |
dc.citation.startPage | 95 | - |
dc.citation.endPage | 112 | - |
dc.description.isOpenAccess | N | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Computer Science | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Information Systems | - |
dc.relation.journalWebOfScienceCategory | Computer Science, Theory & Methods | - |
dc.subject.keywordAuthor | Query Processing | - |
dc.subject.keywordAuthor | Inverted Index | - |
dc.subject.keywordAuthor | Query Pattern | - |
dc.subject.keywordAuthor | Path Expression | - |
dc.subject.keywordAuthor | Very Large Data Base | - |
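
The abstract describes indexing the labels of label paths as keywords so that paths matching a partial match query can be found without string matching. The following is a minimal sketch of that general idea, not the paper's implementation: the function names `build_index` and `match_partial`, the posting layout `(path id, position, path length)`, and the sample label paths are all assumptions made for illustration. It handles only linear queries in which every step uses "//", and it does not cover branching predicates or the prefix match join.

```python
from collections import defaultdict


def build_index(label_paths):
    """Map each label to postings of (path id, position, path length)."""
    index = defaultdict(list)
    for pid, path in enumerate(label_paths):
        labels = path.strip("/").split("/")
        for pos, label in enumerate(labels):
            index[label].append((pid, pos, len(labels)))
    return index


def match_partial(index, query_labels):
    """Return ids of label paths matching //l1//l2//...//lk.

    A path matches when the query labels occur in order and the last
    query label is the last label of the path.
    """
    if not query_labels:
        return []
    current = None  # path id -> earliest position of the last matched label
    for i, label in enumerate(query_labels):
        is_last = i == len(query_labels) - 1
        matched = {}
        for pid, pos, length in index.get(label, []):
            if is_last and pos != length - 1:
                continue  # the final label must end the path
            if current is not None and not (pid in current and current[pid] < pos):
                continue  # the previous label must occur earlier in the same path
            matched[pid] = min(pos, matched.get(pid, pos))
        current = matched
    return sorted(current)


# Hypothetical label paths, as might be extracted from heterogeneous documents.
paths = [
    "/book/chapter/section/title",
    "/book/title",
    "/article/chapter/title/note",
    "/book/chapter/title",
]
index = build_index(paths)
print(match_partial(index, ["chapter", "title"]))  # query //chapter//title
```

Keeping only the earliest matching position per path is enough here, because an in-order match exists whenever one exists that picks each label as early as possible.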