An Extraction Method of Bibliographic Information from the US Patents: Using an HTML Parsing Technique (미국 특허 서지정보 추출 방법에 대한 연구: HTML 파싱 기법의 활용을 중심으로)
- Other Titles
- An Extraction Method of Bibliographic Information from the US Patents: Using an HTML Parsing Technique
- Authors
- 한유진; 오승우
- Issue Date
- Jun-2010
- Publisher
- Korean Society for Information Management (한국정보관리학회)
- Keywords
- US patents; bibliographic information; extraction; HTML parsing; 미국 특허; 서지정보; 추출; HTML 파싱
- Citation
- Journal of the Korean Society for Information Management (정보관리학회지), v.27, no.2, pp. 7-20
- Pages
- 14
- Journal Title
- Journal of the Korean Society for Information Management (정보관리학회지)
- Volume
- 27
- Number
- 2
- Start Page
- 7
- End Page
- 20
- URI
- https://scholarworks.sookmyung.ac.kr/handle/2020.sw.sookmyung/13505
- DOI
- 10.3743/KOSIM.2010.27.2.007
- ISSN
- 1013-0799; 2586-2073
- Abstract
- This study presents a method for extracting the most recent information on US patent documents. It adopts an HTML parsing technique that connects directly to the US Patent and Trademark Office (USPTO) Web page. After obtaining a list of 50 documents through a keyword search, the study proposes an algorithm, based on HTML parsing, that extracts the patent number, the applicant, and the US patent class information. It also presents an algorithm that extracts both patents and their subsequent patents by exploiting the close links between them, a distinctive characteristic of US patent documents. Although the proposed method has several limitations, it can effectively supplement existing databases in terms of timeliness and comprehensiveness.
- Appears in Collections
- School of Global Service (글로벌서비스학부) > Entrepreneurship Major (앙트러프러너십전공) > 1. Journal Articles
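The extraction step described in the abstract (pulling a patent number, an applicant, and the US patent class out of a patent page by parsing its HTML) can be illustrated with a minimal sketch. This is not the authors' algorithm, which this record does not reproduce. It assumes a hypothetical page layout in which each field sits in a two-cell table row, and the row labels, the sample markup, and the names `RowCollector`, `FIELD_LABELS`, `extract_biblio`, and `SAMPLE_HTML` are all invented for illustration. It uses only Python's standard `html.parser` module and works on an in-memory string instead of a live USPTO page.

```python
from html.parser import HTMLParser


class RowCollector(HTMLParser):
    """Collect every table row as a list of whitespace-normalized cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None outside <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a table cell is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            text = " ".join("".join(self._cell).split())
            self._row.append(text)
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical row labels; a real patent page may word them differently.
FIELD_LABELS = {
    "United States Patent": "patent_number",
    "Assignee:": "applicant",
    "Current U.S. Class:": "us_class",
}


def extract_biblio(html):
    """Return a dict with the fields found in the (assumed) label/value rows."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    record = {}
    for row in parser.rows:
        if len(row) >= 2 and row[0] in FIELD_LABELS:
            record[FIELD_LABELS[row[0]]] = row[1]
    # A patent can carry several classes; split them into a list.
    if "us_class" in record:
        parts = record["us_class"].split(";")
        record["us_class"] = [p.strip() for p in parts if p.strip()]
    return record


# Invented sample markup standing in for a downloaded patent page.
SAMPLE_HTML = """
<html><body>
<table>
  <tr><th>United States Patent</th><td>7,000,000</td></tr>
  <tr><th>Assignee:</th><td>Example Corp.</td></tr>
  <tr><th>Current U.S. Class:</th><td>123/456; 123/789</td></tr>
</table>
</body></html>
"""

if __name__ == "__main__":
    print(extract_biblio(SAMPLE_HTML))
```

Under the same assumptions, the function would be applied to each of the documents returned by the keyword search, and following the links between a patent and its subsequent patents would reuse the same row-collecting parser on each linked page.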
Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.