Support Vector Machine versus k-Nearest Neighbor for Arabic Text Classification

Eman Al-Thwaib; Waseem Al-Romimah


Volume 3 - Jun 2014, pp. 1-5

Abstract

Text classification (TC), or text categorization, is the task of assigning text documents to predefined classes or categories. The need for automatic text classification arises from the large volume of electronic documents on the web. Classification accuracy is affected by the content of the documents and by the classification technique used. In this research, Support Vector Machine (SVM) and k-Nearest Neighbor (kNN) classifiers are developed and compared on the task of classifying 800 Arabic documents into four categories (sport, politics, religion, and economy). The experimental results are reported in terms of precision, recall, and F1-measure.
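
As an illustration of the evaluation measures named in the abstract, the following is a minimal Python sketch of per-category precision, recall, and F1-measure. It is not the authors' implementation: the function name `per_class_scores` and the toy label lists are assumptions made for this example, and the labels are invented, not drawn from the 800-document corpus.

```python
from collections import Counter

# The four categories named in the abstract.
CATEGORIES = ["sport", "politics", "religion", "economy"]

def per_class_scores(y_true, y_pred, labels=CATEGORIES):
    """Return {label: (precision, recall, f1)} from two parallel label lists.

    Hypothetical helper for illustration; not taken from the paper.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correctly assigned to its category
        else:
            fp[guess] += 1   # wrongly assigned to the predicted category
            fn[truth] += 1   # a document of the true category was missed
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]   # documents assigned to this category
        actual = tp[label] + fn[label]      # documents truly in this category
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores

# Toy, made-up labels for illustration only (not the paper's data).
y_true = ["sport", "sport", "politics", "religion", "economy", "economy"]
y_pred = ["sport", "politics", "politics", "religion", "economy", "sport"]
scores = per_class_scores(y_true, y_pred)

for label in CATEGORIES:
    p, r, f = scores[label]
    print(f"{label:10s} precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Running the same function on the predictions of each classifier over a shared test set would give one way to compare SVM and kNN category by category.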

Keywords

Text Classification, Machine Learning, Support Vector Machine, k-Nearest Neighbor


Cite this Article:
Eman Al-Thwaib, Support Vector Machine versus k-Nearest Neighbor for Arabic Text Classification, International Journal of Sciences 06(2014):1-5

International Journal of Sciences is an Open Access journal.
This article is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.
Author(s) retain the copyright of this article; publication rights are held by Alkhaer Publications.
