Abstract:
Many countries apply data science techniques to enhance their health sectors and the
surveillance of diseases. The success of such innovations depends on the availability and quality of
datasets to be analyzed. In Tanzania, although various Hospital Management Information Systems
(HoMIS), such as the Government of Tanzania Hospital Management Information System (GoT-HoMIS),
are installed in many hospitals, the data stored in these systems are not integrated. This
results in a lack of high-quality, timely, anonymized, harmonized, and integrated datasets
that can be shared and thoroughly analyzed for epidemic disease surveillance. This study
aimed to develop a data warehouse that hosts the patient demographic and clinical particulars
essential for epidemic disease surveillance, extracted from a multi-node GoT-HoMIS, and to
yield an integrated dataset for that surveillance.
Interviews were conducted at three strategic health facilities and at the Ministry responsible
for Health in Tanzania. Documents were reviewed, and the patient registration process in the
GoT-HoMIS was observed. Thereafter, a data warehouse was developed to run on the MariaDB
database server, and an Extract, Transform, and Load (ETL) module was developed in PHP
(Hypertext Preprocessor). The ETL module was deployed at six health facilities, and the
resulting integrated dataset of 152 104 facts was visualized using the FusionCharts libraries.
The study demonstrates a novel means of extracting data directly from the GoT-HoMIS nodes,
which has the potential to provide timely data and integrated reports for decision-making on
epidemics. Scaling the innovation to other health facilities could significantly enhance
epidemic surveillance.