Information system for structured documents OCR quality control
Informacionnye tehnologii i vyčislitelnye sistemy, no. 2 (2018), pp. 94-102.

See the article record from the source Math-Net.Ru

Computational experiments remain a routine daily procedure in the development of machine learning (ML) based software, such as optical character recognition (OCR). The well-known approach of "continuous integration" (CI) is a natural choice for developing ML software. CI involves frequent centralized program builds and the execution of bench tests. This generates a large volume of test results, which should be readily available to developers for error analysis and for comparing software versions. This article proposes an architecture for an automatic quality control system for OCR of structured documents, covering the collection, storage and display of bench test results. The results of all software tests are loaded into a database. Builds and bench tests can run on virtual servers under various operating systems (OS). For stability, the web server and the database run on hardware separate from the build server. Web technologies are used both for automatically uploading test results to the database and for serving user queries.
Keywords: computer experiment, machine learning, data processing, web applications, regression testing, continuous integration, quality control.
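The storage and version-comparison idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table layout, the function names (`open_results_db`, `upload_results`, `accuracy`, `regressions`) and the use of a local SQLite database in place of a web upload to a separate database server are all assumptions made for illustration.

```python
import sqlite3


def open_results_db(path=":memory:"):
    """Open (or create) a results database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "build TEXT, os TEXT, document TEXT, field TEXT, correct INTEGER, "
        "PRIMARY KEY (build, os, document, field))"
    )
    return conn


def upload_results(conn, build, os_name, rows):
    """Store bench test results for one build on one OS.

    rows is an iterable of (document, field, recognized_correctly) tuples.
    Re-uploading the same build overwrites its earlier rows.
    """
    conn.executemany(
        "INSERT OR REPLACE INTO results VALUES (?, ?, ?, ?, ?)",
        [(build, os_name, doc, field, int(ok)) for doc, field, ok in rows],
    )
    conn.commit()


def accuracy(conn, build):
    """Fraction of correctly recognized fields for a build (None if no rows)."""
    (value,) = conn.execute(
        "SELECT AVG(correct) FROM results WHERE build = ?", (build,)
    ).fetchone()
    return value


def regressions(conn, old_build, new_build):
    """Fields recognized correctly in old_build but incorrectly in new_build."""
    query = (
        "SELECT o.os, o.document, o.field FROM results o "
        "JOIN results n ON o.os = n.os AND o.document = n.document "
        "AND o.field = n.field "
        "WHERE o.build = ? AND n.build = ? AND o.correct = 1 AND n.correct = 0 "
        "ORDER BY o.os, o.document, o.field"
    )
    return conn.execute(query, (old_build, new_build)).fetchall()
```

In a setup like the one described, a CI job would call something like `upload_results` after each bench run, and a developer-facing page would query `accuracy` and `regressions` to compare two builds.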
@article{ITVS_2018_2_a7,
     author = {P. V. Bezmaternyh and E. L. Pliskin and V. V. Farsobina},
     title = {Information system for structured documents {OCR} quality control},
     journal = {Informacionnye tehnologii i vy\v{c}islitelnye sistemy},
     pages = {94--102},
     publisher = {mathdoc},
     number = {2},
     year = {2018},
     language = {ru},
     url = {https://geodesic-test.mathdoc.fr/item/ITVS_2018_2_a7/}
}
TY  - JOUR
AU  - P. V. Bezmaternyh
AU  - E. L. Pliskin
AU  - V. V. Farsobina
TI  - Information system for structured documents OCR quality control
JO  - Informacionnye tehnologii i vyčislitelnye sistemy
PY  - 2018
SP  - 94
EP  - 102
IS  - 2
PB  - mathdoc
UR  - https://geodesic-test.mathdoc.fr/item/ITVS_2018_2_a7/
LA  - ru
ID  - ITVS_2018_2_a7
ER  - 
%0 Journal Article
%A P. V. Bezmaternyh
%A E. L. Pliskin
%A V. V. Farsobina
%T Information system for structured documents OCR quality control
%J Informacionnye tehnologii i vyčislitelnye sistemy
%D 2018
%P 94-102
%N 2
%I mathdoc
%U https://geodesic-test.mathdoc.fr/item/ITVS_2018_2_a7/
%G ru
%F ITVS_2018_2_a7
P. V. Bezmaternyh; E. L. Pliskin; V. V. Farsobina. Information system for structured documents OCR quality control. Informacionnye tehnologii i vyčislitelnye sistemy, no. 2 (2018), pp. 94-102. https://geodesic-test.mathdoc.fr/item/ITVS_2018_2_a7/