A Different Text Mining Process for Classifying Journal Databases using Machine Learning Algorithms

Thailambal, G. and Ananthi, Sheshasaayee (2019) A Different Text Mining Process for Classifying Journal Databases using Machine Learning Algorithms. International Journal of Recent Technology and Engineering, 8 (2S11). pp. 239-243. ISSN 2277-3878


Abstract

Google is a widely used search engine for information retrieval and serves as an information repository for much of the world. Web page access is increasing rapidly every day. The biggest challenge is identifying user interest and returning the most relevant information. Researchers search for journal documents daily, yet classifying content as papers, slides, or theses is difficult because the words used in these documents are not semantically checked. Most researchers use data mining to extract the correct content from web pages, and text mining is one of its applications. In a nutshell, text mining is the extraction of useful information from unstructured data. The proposed model, Author Keyword Weightage in Journal Ranking (AKWJR), retrieves relevant journals to help researchers pick out relevant documents from a pool of irrelevant ones. Many keyword ranking applications, such as RAKE and TextRank, compare author-annotated keywords and use them for ranking. Authors assign keywords to their articles in differing forms and from differing perspectives. Although authors do not choose keywords from a controlled vocabulary, the keywords describe the content of their own articles. Two algorithms arrange the keywords by topic, and each keyword is scored according to its presence in the various fields of the article. Journals are then ranked by score so that the researcher can decide whether to open an article. This is achieved through Latent Dirichlet Allocation, RankSVM, and TF-IDF algorithms.
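
The field-based keyword scoring mentioned in the abstract can be illustrated with a minimal sketch. This is not the AKWJR implementation: the field names, the weights in FIELD_WEIGHTS, the sample records in DOCS, and the plain TF-IDF scoring are all assumptions made for illustration, and the LDA and RankSVM stages are omitted.

```python
import math
from collections import Counter

# Hypothetical field weights (not taken from the paper): a keyword match in
# the title counts more than one in the author keywords or the abstract.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "abstract": 1.0}

# Hypothetical sample records used only to exercise the functions below.
DOCS = [
    {"id": "A", "title": "text mining for journal ranking",
     "keywords": "text mining ranking",
     "abstract": "we rank journals using author keywords"},
    {"id": "B", "title": "image segmentation",
     "keywords": "vision",
     "abstract": "text appears once in this abstract about images"},
    {"id": "C", "title": "graph databases",
     "keywords": "graphs storage",
     "abstract": "storage engines for graph data"},
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def idf(term, docs):
    """Smoothed inverse document frequency across the weighted fields."""
    df = sum(
        1 for d in docs
        if any(term in tokenize(d.get(f, "")) for f in FIELD_WEIGHTS)
    )
    return math.log((1 + len(docs)) / (1 + df)) + 1.0


def score(doc, query_terms, docs):
    """Sum of field weight * term frequency * IDF over fields and terms."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        counts = Counter(tokenize(doc.get(field, "")))
        length = sum(counts.values()) or 1  # avoid division by zero
        for term in query_terms:
            total += weight * (counts[term] / length) * idf(term, docs)
    return total


def rank(docs, query):
    """Return the documents ordered from highest to lowest score."""
    terms = tokenize(query)
    return sorted(docs, key=lambda d: score(d, terms, docs), reverse=True)


if __name__ == "__main__":
    for doc in rank(DOCS, "text mining"):
        print(doc["id"], round(score(doc, ["text", "mining"], DOCS), 3))
```

Weighting by field is one simple way to let a match in a prominent field outrank an incidental mention in the body; the actual weights would need to be tuned or learned, for example by a ranking model such as RankSVM.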

DOI: 10.35940/ijrte.B1039.0982S1119
Full text: https://www.ijrte.org/wp-content/uploads/papers/v8i2S11/B10390982S1119.pdf

Item Type: Article
Subjects: Computer Science Engineering > Machine Learning
Divisions: Computer Science
Depositing User: Mr IR Admin
Date Deposited: 10 Oct 2024 09:02
Last Modified: 10 Oct 2024 09:02
URI: https://ir.vistas.ac.in/id/eprint/9681
