Abstract I present an analysis of feature selection for automatic patent categorization. For a corpus of 7,309 patent applications from the World Patent Information (WPI) Test Collection (Lupu, 2019), I… Click to show full abstract
Abstract I present an analysis of feature selection for automatic patent categorization. For a corpus of 7,309 patent applications from the World Patent Information (WPI) Test Collection (Lupu, 2019), I assign International Patent Classification (IPC) section codes using a modified Naive Bayes classifier. I compare precision, recall, and f-measure for a variety of meta-parameter settings including data smoothing and acceptance threshold. Finally, I apply the optimized model to IPC class and group codes and compare the results of patent categorization to academic literature.
               
Click one of the above tabs to view related content.