Publication | Closed Access
Bangla Word Prediction and Sentence Completion Using GRU: An Extended Version of RNN on N-gram Language Model
Citations: 43
References: 13
Year: 2019
Venue: Unknown
Keywords: Engineering, Multilingual Pretraining, Large Language Model, Recurrent Neural Network, Corpus Linguistics, Bangla Word Prediction, Text Mining, Speech Recognition, Natural Language Processing, Information Retrieval, Data Science, Extended Version, Computational Linguistics, Language Engineering, Grammar, Textual Information Exchange, Language Models, Machine Translation, Sequence Modelling, N-gram Language Model, NLP Task, Naïve Bayes, Retrieval Augmented Generation, Arts, Linguistics
Textual information exchange, in which a user types information and sends it to the other end, is one of the most prominent mediums of communication worldwide. People spend a great deal of time composing emails and messages on social networking sites, where typing every word is redundant and time-consuming. To make textual information exchange faster and easier, word prediction systems suggest the next most likely word so that users can select it from the suggestions rather than type it. In this study, we propose a method that predicts the next most appropriate word in the Bangla language and can also suggest the corresponding sentence, contributing to this technology of word prediction systems. The proposed approach applies a GRU (Gated Recurrent Unit) based RNN (Recurrent Neural Network) to an n-gram dataset to build language models that predict the next word(s) from a given input sequence. We ran our experiments on a Bangla corpus collected from different sources. Compared with previously used methods, such as an LSTM (Long Short-Term Memory) based RNN on an n-gram dataset and Naïve Bayes with Latent Semantic Analysis, the proposed approach performs better: it achieves an average accuracy of 99.70% for the 5-gram model, 99.24% for the 4-gram model, 95.84% for the tri-gram model, and 78.15% and 32.17% for the bi-gram and uni-gram models, respectively.
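To make the described approach concrete, below is a minimal sketch of a GRU-based next-word predictor trained on fixed-length n-gram windows. This is not the authors' implementation: the use of Keras, the vocabulary size, embedding and hidden dimensions, window length, and the randomly generated toy data are all illustrative assumptions standing in for the paper's Bangla corpus and hyperparameters.

```python
# Illustrative sketch: GRU-based RNN for next-word prediction on n-gram windows.
# Hyperparameters and data below are assumptions, not values from the paper.
import numpy as np
from tensorflow.keras import layers, models

VOCAB_SIZE = 5000   # assumed vocabulary size
EMBED_DIM = 128     # assumed embedding dimension
N = 5               # 5-gram setting: 4 context words predict the 5th

model = models.Sequential([
    layers.Input(shape=(N - 1,)),                    # (n-1) context word ids
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),         # word-id -> dense vector
    layers.GRU(256),                                 # GRU cell in place of LSTM
    layers.Dense(VOCAB_SIZE, activation="softmax"),  # distribution over next word
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Toy stand-in for the n-gram dataset: each row holds (n-1) context word ids,
# and the target is the integer id of the word that follows.
X = np.random.randint(0, VOCAB_SIZE, size=(1000, N - 1))
y = np.random.randint(0, VOCAB_SIZE, size=(1000,))
model.fit(X, y, epochs=1, batch_size=32)

# Prediction: pick the most probable word id for a new context window.
context = np.random.randint(0, VOCAB_SIZE, size=(1, N - 1))
next_word_id = int(model.predict(context, verbose=0).argmax(axis=-1)[0])
```

In such a setup, sentence suggestion of the kind the abstract mentions could be approximated by appending the predicted word to the context window and predicting again, repeating until an end-of-sentence token is produced.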