ISSN (Print) : 0974-6846 Indian Journal of Science and Technology, Vol 10(5), DOI: 10.17485/ijst/2017/v10i5/103233, February 2017 ISSN (Online) : 0974-5645 Classification of Gujarati Documents using Naïve Bayes Classifier Rajnish M. Rakholia1* and Jatinderkumar R. Saini2 1School of Computer Science, R. K. University, Rajkot - 360020, Gujarat, India;
[email protected] 2Narmada College of Computer Application, Bharuch - 392011, Gujarat, India;
[email protected] Abstract Objectives: Information overload on the web is a major problem faced by institutions and businesses today. Sorting out some useful documents from the web which is written in Indian language is a challenging task due to its morphological variance Methods: Keyword search is a one of the way to retrieve the meaningful document from the web, but it doesn’t discriminate by context. In this paper and language barrier. As on date, there is no document classifier available for Gujarati language. we have presented the Naïve Bayes (NB) statistical machine learning algorithm for classification of Gujarati documents. Six pre-defined categories sports, health, entertainment, business, astrologyFindings: and spiritual The experimental are used for this results work. show A corpus that the of 280 Gujarat documents for each category is used for training and testing purpose of the categorizer. WeThese have results used k-foldprove cross validation to evaluate the performance of Naïve Bayes classifier. Applications: Proposed research work isaccuracy very useful of NB to classifier implement without the functionality and using features of directory selection search was in many75.74% web and portals 88.96% to sortrespectively. useful documents and many Informationthat the NB classifierRetrieval contribute(IR) applications.