Analysis of machine learning models by solving the text data classification problem
Journal of computational and engineering mathematics, Volume 8 (2021) no. 2, pp. 33-45.

See the article record from the source Math-Net.Ru

The article studies the use of machine learning models for text classification, taking as an example the problem of classifying technical support requests submitted through the chatbot of a mobile application. The following methods were considered: the Naive Bayes classifier, the K-Nearest Neighbors (KNN) algorithm, Decision Tree, Random Forest, Support Vector Machines (SVM), and Logistic Regression, as well as 21 models based on these methods. The best models for classifying text requests to the technical support chatbot turned out to be the model based on Logistic Regression and the model based on the Random Forest Classifier. Among the trained models, the Complement Naive Bayes model from the Naive Bayes group had the shortest tuning time while keeping acceptable accuracy. The proposed methodology can be used to analyze and classify text data.
Keywords: text classification, machine learning methods, regression, natural language, text data analysis.
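
To make the classification task in the abstract concrete, here is a minimal sketch of a bag-of-words Naive Bayes text classifier written with the Python standard library only. It is a plain multinomial Naive Bayes with Laplace smoothing, not the Complement Naive Bayes model or any other model evaluated in the article. The request texts, the labels (`login`, `billing`, `bug`), and all class and function names are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over word counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed value)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def _log_score(self, tokens, label):
        # log P(class) + sum of log P(word | class) over known words
        total_docs = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocabulary)
        for token in tokens:
            if token in self.vocabulary:  # words unseen in training are skipped
                count = self.word_counts[label][token]
                score += math.log((count + self.alpha) / denom)
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return max(self.class_counts,
                   key=lambda label: self._log_score(tokens, label))


# Hypothetical support requests; this is not data from the article.
train_texts = [
    "I cannot log in to my account",
    "password reset link does not work",
    "forgot my password and cannot log in",
    "payment was charged twice",
    "refund for my last payment",
    "the card payment failed at checkout",
    "the app crashes when I open it",
    "app freezes and crashes on startup",
    "screen goes blank and the app crashes",
]
train_labels = ["login"] * 3 + ["billing"] * 3 + ["bug"] * 3

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)

test_texts = [
    "cannot log in, password rejected",
    "I was charged twice, please refund",
    "app crashes on startup",
]
predictions = [model.predict(text) for text in test_texts]
print(predictions)
```

A comparison like the one the abstract describes would train several such classifiers on the same labeled requests and record both accuracy and tuning time for each; this sketch covers only a single model on toy data.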
@article{JCEM_2021_8_2_a2,
     author = {A. V. Pchelin and N. A. Kononov and V. S. Serova and E. V. Bunova and A. D. Marchenko and A. E. Shevchenko},
     title = {Analysis of machine learning models by solving the text data classification problem},
     journal = {Journal of computational and engineering mathematics},
     pages = {33--45},
     publisher = {mathdoc},
     volume = {8},
     number = {2},
     year = {2021},
     language = {en},
     url = {https://geodesic-test.mathdoc.fr/item/JCEM_2021_8_2_a2/}
}
TY  - JOUR
AU  - A. V. Pchelin
AU  - N. A. Kononov
AU  - V. S. Serova
AU  - E. V. Bunova
AU  - A. D. Marchenko
AU  - A. E. Shevchenko
TI  - Analysis of machine learning models by solving the text data classification problem
JO  - Journal of computational and engineering mathematics
PY  - 2021
SP  - 33
EP  - 45
VL  - 8
IS  - 2
PB  - mathdoc
UR  - https://geodesic-test.mathdoc.fr/item/JCEM_2021_8_2_a2/
LA  - en
ID  - JCEM_2021_8_2_a2
ER  - 
%0 Journal Article
%A A. V. Pchelin
%A N. A. Kononov
%A V. S. Serova
%A E. V. Bunova
%A A. D. Marchenko
%A A. E. Shevchenko
%T Analysis of machine learning models by solving the text data classification problem
%J Journal of computational and engineering mathematics
%D 2021
%P 33-45
%V 8
%N 2
%I mathdoc
%U https://geodesic-test.mathdoc.fr/item/JCEM_2021_8_2_a2/
%G en
%F JCEM_2021_8_2_a2
A. V. Pchelin; N. A. Kononov; V. S. Serova; E. V. Bunova; A. D. Marchenko; A. E. Shevchenko. Analysis of machine learning models by solving the text data classification problem. Journal of computational and engineering mathematics, Volume 8 (2021) no. 2, pp. 33-45. https://geodesic-test.mathdoc.fr/item/JCEM_2021_8_2_a2/