Botswana De-Identification
The National Data Warehouse (NDW) is a repository of patient-level data in Botswana, drawn both from electronic medical record (EMR) systems and from data abstracted from non-electronic records. Two main EMR systems are in use within public facilities. The Patient Information Management System (PIMS) is a standalone client-server application set up within each health facility and is currently installed in 493 facilities. The Integrated Patient Management System (IPMS) is a proprietary, centralized, web-based system hosted within the Government Domain Network infrastructure; it is used in 28 district hospitals and 18 high-volume clinics with maternity services. This setup is not in itself a problem, but the absence of unique patient identifiers made it difficult to link patient records during the development of the analytic dataset.
The University of Maryland, Baltimore (UMB), in collaboration with the Ministry of Health (MOH), developed an analytic dataset by triangulating the clinical, demographic, and laboratory datasets transmitted to the NDW. Record linkage, matching, and deduplication were conducted using SQL Server Integration Services (SSIS) and Stata, with probabilistic matching algorithms used for the matching and deduplication steps. The variables used for deduplication were first name, last name, date of birth, sex, and, where available, a valid government-issued identification number (Omang or passport number). The final analytic dataset uses deduplicated, unique, and anonymized individual records for every case. UMB developed code, deployed on the MOH server, that automates the generation of the analytic dataset on a monthly basis.
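To illustrate what probabilistic matching on these variables can look like, the following is a minimal Python sketch of a Fellegi-Sunter style match weight. It is not the SSIS or Stata code used for the NDW. The field names, the m and u probabilities, the 0.85 name-similarity cut-off, and the example records are all hypothetical values chosen for illustration.

```python
import math
from difflib import SequenceMatcher

# Hypothetical parameters (not taken from the NDW):
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_PARAMS = {
    "first_name": (0.90, 0.01),
    "last_name": (0.90, 0.01),
    "dob": (0.95, 0.001),
    "sex": (0.98, 0.50),
    "gov_id": (0.99, 0.0001),  # Omang or passport number, often missing
}
NAME_FIELDS = {"first_name", "last_name"}
NAME_SIMILARITY_CUTOFF = 0.85  # assumed tolerance for spelling variation


def agrees(field, a, b):
    """Return True/False for agreement, or None if either value is missing."""
    if not a or not b:
        return None
    a, b = a.strip().lower(), b.strip().lower()
    if field in NAME_FIELDS:
        return SequenceMatcher(None, a, b).ratio() >= NAME_SIMILARITY_CUTOFF
    return a == b


def match_weight(rec_a, rec_b):
    """Sum log2 likelihood ratios over fields; missing fields contribute 0."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        outcome = agrees(field, rec_a.get(field), rec_b.get(field))
        if outcome is None:
            continue
        if outcome:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


# Made-up records: rec_b is a misspelled copy of rec_a without an ID number.
rec_a = {"first_name": "Kagiso", "last_name": "Molefe",
         "dob": "1985-03-12", "sex": "F", "gov_id": "ID-A"}
rec_b = {"first_name": "Kagisho", "last_name": "Molefe",
         "dob": "1985-03-12", "sex": "F", "gov_id": ""}
rec_c = {"first_name": "Thabo", "last_name": "Dube",
         "dob": "1990-07-01", "sex": "M", "gov_id": "ID-C"}

same_person_weight = match_weight(rec_a, rec_b)
different_person_weight = match_weight(rec_a, rec_c)
```

Pairs with a high weight would be treated as the same individual and pairs with a low weight as different individuals, with a band in between typically sent for manual review. Treating a missing identification number as neutral, rather than as a disagreement, reflects the fact that the Omang or passport number is only used if available.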
The analytic dataset is used to produce national estimates of people living with HIV and of associated indicators such as HIV testing, linkage to treatment, and viral load suppression. It also supports more granular analyses, such as identifying who and where the people are who are not linked to care, have not achieved viral load suppression, or are lost to follow-up. This information is fed back to healthcare facilities to improve patient care. Generation of the analytic dataset is now automated, which has increased efficiency by reducing coding errors and the time that would otherwise be spent on manual processing.
UMB provided technical assistance in the development of the dashboard used to track data quality improvement, including the completeness of variables. Below is a screenshot of completeness for selected variables, by healthcare facility:
Green: Excellent performance
Yellow: Good performance
Red: Fair performance needing improvement
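The completeness measure behind such a dashboard can be sketched as the percentage of non-missing values per facility and variable, mapped to the three colours above. The Python below is only an illustration: the source does not state the cut-offs the dashboard uses, so the 90% and 70% thresholds, the field names, and the sample records are all assumptions.

```python
from collections import defaultdict

# Hypothetical cut-offs; the dashboard's actual thresholds are not given.
GREEN_MIN = 90.0
YELLOW_MIN = 70.0


def is_present(value):
    """Treat None and blank strings as missing."""
    return value is not None and str(value).strip() != ""


def completeness_by_facility(records, variables):
    """Return {facility: {variable: percent of records with a value}}."""
    totals = defaultdict(int)
    filled = defaultdict(lambda: defaultdict(int))
    for rec in records:
        facility = rec["facility"]
        totals[facility] += 1
        for var in variables:
            if is_present(rec.get(var)):
                filled[facility][var] += 1
    return {
        facility: {var: 100.0 * filled[facility][var] / n for var in variables}
        for facility, n in totals.items()
    }


def rating(percent):
    """Map a completeness percentage to the legend colour."""
    if percent >= GREEN_MIN:
        return "Green"
    if percent >= YELLOW_MIN:
        return "Yellow"
    return "Red"


# Made-up records for two hypothetical facilities.
sample = [
    {"facility": "Clinic A", "dob": "1985-03-12", "sex": "F"},
    {"facility": "Clinic A", "dob": "", "sex": "M"},
    {"facility": "Clinic B", "dob": None, "sex": None},
]
report = completeness_by_facility(sample, ["dob", "sex"])
```

Each cell of the resulting table can then be coloured with rating(), giving one row per facility and one column per variable.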