In recent years, advances in artificial intelligence (AI) technology and in the infrastructure for collecting massive datasets have made AI increasingly important in fields including basic science, marketing, logistics, finance and medicine. In medicine, AI is increasingly being used to analyze medical images, such as functional magnetic resonance imaging (fMRI) scans, to help diagnose diseases.
Reproducibility, however, is essential for practical application of this technology. For example, results obtained by applying AI technology to MRI data from dozens of participants collected at a single site often cannot be reproduced in data collected at other sites. In addition, data collected at different sites reflect differences in MRI hardware, protocols, and measurement biases.
Thus, it is not possible to eliminate site differences simply by assembling large amounts of data collected at multiple sites. To resolve these problems, large amounts of data collected from patients with various diseases, using a common imaging protocol at multiple sites, and data measured on the same participants at multiple sites (“traveling subjects”) are required. Until now, however, there has been no public fMRI dataset that satisfies these requirements.
In the current study, published in the journal Scientific Data, fMRI data for multiple psychiatric and neurological diseases (autism spectrum disorders, major depression, bipolar disorder, schizophrenia, obsessive-compulsive disorder, chronic pain, and stroke) measured with a unified imaging protocol at 14 sites were compiled as a multi-site, multi-disease database (the “SRPBS database”). This database comprises 2,414 samples (993 patients and 1,421 healthy individuals) of resting-state fMRI (rs-fMRI) data, structural MRI data, and demographic data (gender, age, handedness, diagnosis, clinical rating scale). To make it possible to minimize inter-site differences, “traveling-subject” data measured on nine subjects in 143 sessions at 12 facilities were also compiled into a single database.
We have published four datasets generated from the SRPBS database for different purposes:
- The SRPBS Multi-disorder Connectivity Dataset consists of functional connectivity data from patients and healthy participants.
- The SRPBS Multi-disorder MRI Dataset (restricted) consists of rs-fMRI and structural MRI images of patients and healthy participants.
- The SRPBS Multi-disorder MRI Dataset (unrestricted) consists of rs-fMRI and structural MRI images of patients and healthy participants.
- The SRPBS Traveling Subject MRI Dataset consists of rs-fMRI and structural MRI images of traveling subjects.
Publishing data from multiple sites and multiple diseases, captured with a unified protocol, together with data on traveling subjects, makes it possible to harmonize the data so that differences between sites are minimized, and then to apply AI technology. Researchers worldwide will have access to these data, and research may be accelerated dramatically through the development of more accurate diagnostic markers and more advanced harmonization methods.
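To illustrate why traveling-subject data help with harmonization, the following is a minimal Python sketch, not the method used for the SRPBS database. It assumes a deliberately simple model in which each site adds a constant offset to a measurement, and in which every traveling subject is scanned at every site. Because the same person is measured everywhere, any systematic deviation from that person's own average can be attributed to the site. All names and numbers here (`estimate_site_offsets`, `harmonize`, the subject and site labels, the values) are hypothetical and synthetic.

```python
def estimate_site_offsets(measurements):
    """Estimate an additive offset per site from traveling-subject data.

    measurements maps (subject, site) -> value and is assumed to be
    balanced: every subject has exactly one value at every site.
    """
    subjects = sorted({subject for subject, _ in measurements})
    sites = sorted({site for _, site in measurements})

    # Each subject's mean across sites stands in for the site-free value.
    subject_mean = {
        s: sum(measurements[(s, k)] for k in sites) / len(sites)
        for s in subjects
    }

    # A site's offset is its average deviation from the subject means.
    return {
        k: sum(measurements[(s, k)] - subject_mean[s] for s in subjects)
        / len(subjects)
        for k in sites
    }


def harmonize(value, site, offsets):
    """Remove the estimated site offset from a single measurement."""
    return value - offsets[site]


# Synthetic example: three traveling subjects, three sites.
true_value = {"s1": 0.2, "s2": 0.5, "s3": 0.8}   # made-up subject values
site_bias = {"A": 0.1, "B": -0.1, "C": 0.0}      # made-up site offsets

measurements = {
    (s, k): true_value[s] + site_bias[k]
    for s in true_value
    for k in site_bias
}

offsets = estimate_site_offsets(measurements)
for site in sorted(offsets):
    print(site, round(offsets[site], 3))
```

Once estimated from the traveling subjects, the same offsets could be subtracted from measurements of other participants scanned at those sites. Real harmonization has to handle noise, unbalanced designs, and effects that are not purely additive, which this sketch ignores.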