PARALLEL DECISION TREE ALGORITHM FOR MULTITEXT CLASSIFICATION BASED ON SPARK

P. TAMILSELVAN; DR. S. M. JAGATHEESAN

doi:10.5281/ijrset.v7i9.441

Published Sep 25, 2020

P. TAMILSELVAN

SCHOLAR, PG & RESEARCH DEPARTMENT OF COMPUTER SCIENCE, GOBI ARTS AND SCIENCE COLLEGE (AUTONOMOUS), GOBICHETTIPALAYAM, Erode Dt., TAMILNADU 638453.

DR. S. M. JAGATHEESAN

ASSOCIATE PROFESSOR, PG & RESEARCH DEPARTMENT OF COMPUTER SCIENCE, GOBI ARTS AND SCIENCE COLLEGE (AUTONOMOUS), GOBICHETTIPALAYAM, Erode Dt., TAMILNADU 638453.

Abstract

One of the most challenging issues in big data research is the inability to process a large volume of information in a reasonable time. Hadoop and Spark are frameworks for distributed data processing. Hadoop is a well-known, standard platform for big data processing. Because of its in-memory programming model, Spark, an open-source framework, is well suited to iterative algorithms. With the rapid growth of data volume and feature-space dimensionality in the big data setting, parallelizing traditional multitext classification algorithms will significantly improve their running efficiency. In this paper, the Hadoop and Spark frameworks, both big data distributed processing platforms, are evaluated and compared in terms of precision, accuracy, and recall. To this end, a parallel J48 pruned decision tree classification algorithm is implemented in Spark and run on datasets of different sizes. The results show that the parallel J48 pruned decision tree classification algorithm runs faster on Spark than on Hadoop. The evaluations also show that Hadoop uses more resources, such as CPU and network. It is concluded that Spark is more effective than Hadoop.
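The abstract names precision, accuracy, and recall as the evaluation measures but does not say how they are averaged across classes. As a minimal sketch, assuming macro-averaging over classes (an assumption, not stated in the paper), the three measures could be computed from true and predicted labels as follows. The function name `evaluate` and the sample labels are hypothetical and serve only as illustration.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for a multiclass task.

    Macro-averaging is an assumption made for this sketch; the source does not
    specify which averaging scheme was used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # number of documents per actual class
    pred_count = Counter(y_pred)   # number of documents per predicted class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    def ratio(numerator, denominator):
        # A class that was never predicted (or never occurs) contributes 0.
        return numerator / denominator if denominator else 0.0

    accuracy = sum(true_pos.values()) / len(y_true)
    precision = sum(ratio(true_pos[c], pred_count[c]) for c in labels) / len(labels)
    recall = sum(ratio(true_pos[c], true_count[c]) for c in labels) / len(labels)
    return accuracy, precision, recall


# Hypothetical labels for six documents in three text categories.
y_true = ["sports", "sports", "politics", "tech", "tech", "tech"]
y_pred = ["sports", "politics", "politics", "tech", "tech", "sports"]
accuracy, precision, recall = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

On a cluster, the same counts would typically be gathered per partition and then summed before the ratios are taken; the sketch keeps everything on one machine for clarity.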

Issue

Vol 7 No 9 (2020): Volume 7 Issue 9

Section

Articles
