
Abstract

A main factor in getting an article published is how well its content fits the target journal. However, many authors are still unsure which topic their article belongs to. There is therefore a need for a text mining classification system to simplify the classification process. This system uses the Vector Space Model approach, which represents terms as dimensions of a vector space and has the advantage of working efficiently. The Cosine Similarity method is used to calculate the similarity between two documents. The advantage of this method is that it is not affected by the length of the documents but depends on the values of each document's terms, and that it has a low error rate. The method was tested with K-Fold Cross Validation, using 6 folds for each of 6 random shufflings of the data, on 126 economics journal articles. The average accuracy was 57.79%, precision was 57.79%, and recall was 62.96%. It can be concluded that cosine similarity can classify articles based on their titles and abstracts.
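The idea of comparing documents as term vectors can be illustrated with a minimal Python sketch. It is not the authors' implementation: the whitespace tokenizer, the raw term-frequency weighting, the rule of assigning the label of the single most similar document, and the toy data are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (assumed; real preprocessing may differ)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def classify(text, labeled_docs):
    """Return the label of the most similar labeled document (assumed rule)."""
    query = Counter(tokenize(text))
    best_label, best_score = None, -1.0
    for doc_text, label in labeled_docs:
        score = cosine_similarity(query, Counter(tokenize(doc_text)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the study.
training = [
    ("inflation and monetary policy in emerging markets", "macroeconomics"),
    ("consumer demand and pricing in retail firms", "microeconomics"),
]

print(classify("monetary policy response to inflation", training))
```

Because each vector is divided by its norm, repeating a document's text changes its length but not its direction, which is why the measure is insensitive to document length.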
