Hey, thanks for the response. This happens mainly because the model is trained on uni-grams: each next word is predicted from the last word alone, so the rest of the sentence has no effect on the prediction. That said, you can achieve higher accuracy by modifying the dataset to use bi-grams or tri-grams, so the model predicts from the last two or last three words instead. Feel free to let me know if you have any other queries. Thank you.
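As a rough sketch of what that dataset change looks like (the corpus, function name, and pair format here are illustrative, not taken from the original model), the same helper can build uni-gram, bi-gram, or tri-gram training pairs just by varying how many previous words form the input:

```python
def make_ngram_pairs(words, n):
    # For n=1 each sample is (last word -> next word); for n=2 or n=3
    # the input grows to the last two or three words respectively.
    return [(tuple(words[i:i + n]), words[i + n])
            for i in range(len(words) - n)]

# Hypothetical toy corpus for illustration only.
corpus = "the quick brown fox jumps over the lazy dog".split()

unigram_pairs = make_ngram_pairs(corpus, 1)  # predict from 1 previous word
bigram_pairs = make_ngram_pairs(corpus, 2)   # predict from 2 previous words

print(unigram_pairs[0])  # (('the',), 'quick')
print(bigram_pairs[0])   # (('the', 'quick'), 'brown')
```

With a larger context (n=2 or n=3), the model sees more of the sentence per sample, which is why accuracy usually improves, at the cost of a sparser dataset since each distinct context appears less often.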

