
Part of Speech Tagger for Low Resource Indian Language Using Machine Learning Approach



Abstract

In natural language processing, a part-of-speech (POS) tagger is a fundamental component that serves as a preprocessor for many other tools. For most languages, a POS tagger is built at an early stage, before more advanced tools are developed. Several approaches are used to develop POS taggers. This article presents a comparative analysis of Punjabi POS taggers developed by various researchers and proposes an architecture that uses an efficient machine learning technique to improve tagging accuracy. Because the researchers used their own test data, and not all of the taggers are available online, it is not feasible to evaluate every tagger on a common test set. The reported results indicate that a POS tagger built with a hybrid approach performs better than rule-based taggers and statistical techniques such as N-gram, bigram, and HMM.

Keywords

Ambiguity, Part of Speech, POS, Punjabi, Rule Based Approach, Statistical Approach, Machine Learning, NLP.

Authors

Vikas Verma
Research Scholar, Department of Computer Science and Applications, DAV University, Jalandhar, India
S.K. Sharma
Associate Professor, Department of Computer Science and Applications, DAV University, Jalandhar, India

