Automatic patent document summarization for collaborative knowledge systems and services

Amy J.C. Trappey, Charles V. Trappey, Chun Yi Wu

Research output: Contribution to journal › Article › peer-review

40 Scopus citations

Abstract

Engineering and research teams often develop new products and technologies by referring to inventions described in patent databases. Efficient patent analysis builds R&D knowledge, reduces new product development time, increases market success, and reduces potential patent infringement. Thus, it is beneficial to automatically and systematically extract information from patent documents in order to improve knowledge sharing and collaboration among R&D team members. In this research, patents are summarized using a combined ontology-based and TF-IDF concept clustering approach. The ontology captures the general knowledge and core meaning of patents in a given domain. The proposed methodology then extracts, clusters, and integrates the content of a patent to derive a summary and a cluster tree diagram of key terms. Patents from the International Patent Classification (IPC) codes B25C, B25D, and B25F (categories for power hand tools) and B24B, C09G, and H01L (categories for chemical mechanical polishing) are used as case studies. The evaluation uses statistics to measure the compression ratio of the generated summaries, the retention ratio of the ontology-based keyword extraction, and the classification accuracy of the summaries. The results show that the ontology-based approach yields about the same compression ratio as previous non-ontology-based research but yields, on average, an 11% improvement in retention ratio and a 14% improvement in classification accuracy.
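The abstract names TF-IDF weighting and two of its evaluation measures without defining them. The sketch below is only an illustration of what such quantities commonly look like; it is not the authors' implementation. The function names (`tf_idf`, `compression_ratio`, `retention_ratio`), the crude tokenizer, and both ratio definitions (summary length over original length, and the fraction of given key terms found in the summary) are assumptions. The ontology and clustering steps are omitted entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Uses the textbook form tf * idf, with tf = count / document length
    and idf = log(N / document frequency). Other variants exist.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return weights


def compression_ratio(summary, original):
    """ASSUMED definition: summary length divided by original length, in tokens."""
    return len(tokenize(summary)) / len(tokenize(original))


def retention_ratio(summary, key_terms):
    """ASSUMED definition: fraction of single-word key terms present in the summary."""
    present = set(tokenize(summary))
    keys = {k.lower() for k in key_terms}
    return len(keys & present) / len(keys)


if __name__ == "__main__":
    docs = ["motor drives the drill", "slurry polishes the wafer"]
    print(tf_idf(docs)[0])
    print(compression_ratio("motor drives drill", "the motor drives the drill quickly"))
    print(retention_ratio("motor drives drill", ["motor", "drill", "trigger", "housing"]))
```

Under these assumed definitions, a term that occurs in every document (such as "the" above) receives a weight of zero, which is why TF-IDF tends to favor domain-specific terms when selecting key phrases.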

Original language: English
Pages (from-to): 71-94
Number of pages: 24
Journal: Journal of Systems Science and Systems Engineering
Volume: 18
Issue number: 1
State: Published - 1 Mar 2009

Keywords

  • Document summarization
  • Key phrase extraction
  • Patent document analysis
  • Semantic knowledge service
  • Text mining
