Support Vector Machine versus k-Nearest Neighbor for Arabic Text Classification

Author(s): Eman Al-Thwaib, Waseem Al-Romimah

Volume 3, June 2014, pp. 1-5

Abstract

Text Classification (TC), or text categorization, is the task of assigning text documents to predefined classes or categories. The need for automatic text classification arises from the large number of electronic documents on the web. Classification accuracy is affected by the documents' content and by the classification technique used. In this research, automatic Support Vector Machine (SVM) and k-Nearest Neighbor (kNN) classifiers are developed and compared on the task of classifying 800 Arabic documents into four categories (sport, politics, religion, and economy). The experimental results are presented in terms of F1-measure, precision, and recall.
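As an illustration of the three evaluation measures named in the abstract, the following is a minimal sketch of how per-category precision, recall, and F1 can be computed from true and predicted labels. It is not the authors' implementation: the function name `per_class_scores` and the toy label lists are hypothetical, and only the four category names come from the abstract.

```python
from collections import Counter

# Category names taken from the abstract.
CATEGORIES = ["sport", "politics", "religion", "economy"]


def per_class_scores(y_true, y_pred, categories=CATEGORIES):
    """Return {category: (precision, recall, f1)} for two parallel label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # wrongly assigned to the predicted class
            fn[truth] += 1  # missed document of the true class
    scores = {}
    for c in categories:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        p = tp[c] / predicted if predicted else 0.0
        r = tp[c] / actual if actual else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[c] = (p, r, f1)
    return scores


# Hypothetical toy labels for illustration only; not data from the paper.
y_true = ["sport", "sport", "politics", "religion", "economy", "economy"]
y_pred = ["sport", "politics", "politics", "religion", "economy", "sport"]

scores = per_class_scores(y_true, y_pred)
for category, (p, r, f1) in scores.items():
    print(f"{category:9s} P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

Categories with no predicted or no actual documents are scored as 0.0 here to avoid division by zero; other conventions exist, and the paper does not state which one it uses.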

Keywords

Text Classification, Machine Learning, Support Vector Machine, k-Nearest Neighbor

International Journal of Sciences is an Open Access journal.
This article is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.
The author(s) retain the copyright of this article; publication rights are held by Alkhaer Publications.