Summary:
A large amount of real-world multiple sclerosis (MS) data exists around the globe, spread across distributed, heterogeneous data sources. Registries and cohorts share no common language with regard to content or structure, and most registries are not ready for (instant) collaboration or for scaling up their data for joint, global analyses.
Approach: To collaborate in joint (federated) analyses, not only with other MS data sources but also with non-disease-specific data sources, data has to be transformed into a “common language”.
With the MSDA SwitchBox, we want to support data custodians with tools that are easy to run locally and that make it possible, and easier, to transform their data into a globally used output format.
Key features of the MSDA SwitchBox v2023:
- Data is automatically transformed into an OMOP CDM representation.
- It assumes that, before the tool is run, the patient-level data is already in the format of a “core dataset” (CDS) or, additionally, a “minimal dataset” for an area of interest.
- We prioritize having it operational for the MSDA Core Dataset (CDS).
- It is developed so that it can easily be extended when needed (e.g. an extension of the CDS or another output schema).
- The SwitchBox is a local service of the MSDA, meaning it runs fully in the local environment of the host.
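To illustrate the kind of source-to-target transformation described in the features above, here is a minimal Python sketch that maps one patient record to a row of the OMOP CDM PERSON table. It is not the SwitchBox implementation: the `CdsPatient` class, its field names and the `to_omop_person` helper are hypothetical and chosen only for illustration. The target column names and the gender concept IDs (8507 for male, 8532 for female) follow the standard OMOP CDM.

```python
from dataclasses import dataclass
from datetime import date

# Standard OMOP gender concepts; 0 means "no matching concept".
GENDER_CONCEPTS = {"male": 8507, "female": 8532}


@dataclass
class CdsPatient:
    """Hypothetical source record; field names are assumptions, not the real CDS."""
    patient_id: str
    sex: str
    birth_date: date


def to_omop_person(patient: CdsPatient, person_id: int) -> dict:
    """Build one row for the OMOP CDM PERSON table from a source record."""
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(patient.sex.strip().lower(), 0),
        "year_of_birth": patient.birth_date.year,
        "race_concept_id": 0,        # not captured in this hypothetical source
        "ethnicity_concept_id": 0,   # not captured in this hypothetical source
        "person_source_value": patient.patient_id,  # keep the original identifier
        "gender_source_value": patient.sex,         # keep the original value
    }


row = to_omop_person(CdsPatient("REG-001", "Female", date(1980, 5, 17)), person_id=1)
```

Keeping the original identifier and value in the `*_source_value` columns means the mapping can be traced back to the source, and unmapped values fall back to concept ID 0 rather than being dropped.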
Role of UHasselt:
- Collaborate closely with edenceHealth in designing the framework and defining content-related data quality checks
- Ensure that the framework is in line with the overall MSDA infrastructure roadmap
- Develop the ETL specifications (mapping specifications) for the transformation from source to target schema
- Identify limitations of the OMOP CDM (target schema) and develop possible “workarounds” (including the definition of new concept IDs)
- Discuss the results of the adapted OMOP CDM for the CDS with the OHDSI community
This project will be carried out together with edenceHealth and KU Leuven. The SwitchBox software and ETL are expected to be published by the summer of 2024.