Saturday, 19 April 2014

Resource record:

Source:

Original link in INFORMATION PROCESSING & MANAGEMENT 47 (3): 309-322 MAY 2011
Chen, YL; Chiu, YT

Last updated:

Thursday, 26 May 2011

Entry in the observatory:

Thursday, 26 May 2011

Language:

English

Filed under:


An IPC-based vector space model for patent retrieval

Determining requirements when searching for and retrieving relevant information suited to a user's needs has become increasingly important and difficult, partly due to the explosive growth of electronic documents. The vector space model (VSM) is a popular method in retrieval procedures. However, the weakness of the traditional VSM is that the indexing vocabulary changes whenever the document set, the indexing vocabulary selection algorithm, or the parameters of that algorithm change, or when wording evolves. The major objective of this research is to design a method that solves these problems for patent retrieval. The proposed method uses a special characteristic of patent documents, the International Patent Classification (IPC) codes, to generate the indexing vocabulary for representing all the patent documents. The advantage of the generated indexing vocabulary is that it remains unchanged even if the document sets, selection algorithms, and parameters are changed, or if wording evolution occurs. A comparison of the proposed method with two traditional methods (entropy and chi-square) in manual and automatic evaluations is presented to verify its feasibility and validity. The results also indicate that the IPC-based indexing vocabulary selection method achieves higher accuracy and is more satisfactory. (C) 2010 Elsevier Ltd. All rights reserved.
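
The property the abstract stresses, an indexing vocabulary that stays fixed regardless of the document set, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say how the vocabulary is derived from the IPC codes, so the list `IPC_VOCABULARY` below is a hypothetical, hand-written stand-in with toy terms, and the names `vectorize`, `cosine`, and `rank` are illustrative. Raw term counts and cosine similarity are assumed for simplicity.

```python
import math
from collections import Counter

# Hypothetical fixed indexing vocabulary. In this sketch it is a toy,
# hand-written list; it is NOT real IPC data and not the paper's vocabulary.
IPC_VOCABULARY = ["engine", "combustion", "valve", "antenna", "signal", "circuit"]


def vectorize(text, vocabulary=IPC_VOCABULARY):
    """Project a text onto the fixed vocabulary using raw term counts."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocabulary]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, patents):
    """Return patent ids ordered by decreasing similarity to the query."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(text)), pid) for pid, text in patents.items()]
    # Sort by score descending, then by id so ties are deterministic.
    return [pid for _, pid in sorted(scored, key=lambda item: (-item[0], item[1]))]


# Toy patent texts, invented for illustration only.
patents = {
    "P1": "combustion engine with improved valve timing",
    "P2": "antenna circuit for signal reception",
}
ranking = rank("engine valve", patents)
```

Because every vector is built against the same predefined list, adding or removing patents changes neither the dimensions nor the existing vectors, which is the stability the abstract contrasts with vocabularies selected from the corpus itself (for example by entropy or chi-square).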