A text classifier that predicts the categories of Malawi news articles using SMOTE and SGDClassifier.
The project code is simple yet effective in a competitive setting. I experimented with vectorizers and the Porter stemmer for text preprocessing, and I used several text-cleaning methods to improve overall model performance. In the end, I used scikit-learn's Stochastic Gradient Descent (SGD) classifier to predict news categories. I also experimented with various neural networks and gradient boosting models, but they all fell short, since simple logistic regression with minimal hyperparameter tuning works quite well on this data.
To understand the code, read my article on Medium.