AI-READI Dataset Overview
The AI-READI dataset is a multimodal collection of data for research on Type 2 Diabetes Mellitus (T2DM) and its progression. It integrates clinical, behavioral, environmental, and wearable data sources to support research in machine learning, AI, and healthcare. The project seeks not only to predict the onset and progression of T2DM but also to explore the transition from disease back to health through data science methods.
Purpose and Goals
The primary goal of the AI-READI project is to provide a rich and diverse dataset that can be used to build predictive models for early detection, management, and potential reversal of Type 2 Diabetes. Researchers leverage this data to explore factors that contribute to T2DM development, monitor its progression, and test interventions for its prevention or reversal. By combining different types of data, AI-READI aims to foster scientific discoveries and AI-driven tools that can transform T2DM care and management.
Dataset Structure
The AI-READI dataset is divided into various data categories, each housed in its own folder for easy access and organization. The data is structured as follows:
- Clinical Data: This folder includes participant health data, capturing a wide array of clinical measurements for assessing physical and metabolic health. It includes vital signs, lab results, and information on T2DM-related conditions.
  - Size: 64.2 MiB
- Environmental Data: Environmental factors are key to understanding the broader context of health and disease. This folder includes data on air quality, weather patterns, and environmental variables that could impact the health outcomes of individuals with T2DM.
  - Size: 24.6 GiB
- Retinal Imaging Data: Retinal imaging plays a crucial role in understanding the vascular complications of diabetes. There are several subfolders dedicated to different types of retinal scans:
  - Retinal_flio: Fluorescence lifetime imaging ophthalmoscopy (FLIO) scans.
  - Retinal_oct: Optical coherence tomography (OCT) scans.
  - Retinal_octa: Optical coherence tomography angiography (OCTA) scans.
  - Retinal_photography: Fundus photography.
  - Size: Ranges from 470 GiB to 657 GiB
- Wearable Device Data: Wearable devices provide continuous insights into participant behavior and physiology. The dataset includes two categories:
  - Wearable_activity_monitor: Data from activity trackers that measure movement, sleep, and physical activity.
  - Wearable_blood_glucose: Continuous blood glucose monitoring data.
  - Size: 16 GiB (Wearable_activity_monitor), 1.81 GiB (Wearable_blood_glucose)
- Metadata Files: Metadata is crucial for understanding the dataset structure and ensuring the correct use of the data. Key metadata files include:
  - CHANGES.md: Updates to the dataset over time.
  - dataset_description.json: A description of the dataset's content and purpose.
  - dataset_structure_description.json: Details on how the dataset is structured.
  - participants.json: Participant demographic and health information.
  - study_description.json: Detailed explanation of the study methodology.
  - README.md: General instructions for using the dataset.
  - LICENSE.txt: Licensing and terms of use.
  - Size: Varies based on the file (from a few kilobytes to several megabytes)
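After downloading, it can help to confirm that a local copy matches the layout listed above. The following is a minimal sketch, not an official AI-READI tool. It assumes the named folders and metadata files sit directly under a single root directory (the real nesting may differ), and it matches names case-insensitively because on-disk casing may not match the list. The helper names `check_layout` and `load_json_metadata` are hypothetical.

```python
import json
from pathlib import Path

# Names copied from the list above. Only entries whose folder or file
# name is stated explicitly are included; casing on disk may differ.
EXPECTED_FOLDERS = [
    "Retinal_flio", "Retinal_oct", "Retinal_octa", "Retinal_photography",
    "Wearable_activity_monitor", "Wearable_blood_glucose",
]
EXPECTED_FILES = [
    "CHANGES.md", "dataset_description.json",
    "dataset_structure_description.json", "participants.json",
    "study_description.json", "README.md", "LICENSE.txt",
]


def check_layout(root):
    """Return (present, missing) lists of expected entries under root.

    Assumption: all expected entries are direct children of root.
    """
    on_disk = {p.name.lower() for p in Path(root).iterdir()}
    present, missing = [], []
    for name in EXPECTED_FOLDERS + EXPECTED_FILES:
        (present if name.lower() in on_disk else missing).append(name)
    return present, missing


def load_json_metadata(root):
    """Parse every top-level *.json file into a dict keyed by file name."""
    return {
        p.name: json.loads(p.read_text(encoding="utf-8"))
        for p in sorted(Path(root).glob("*.json"))
    }
```

Calling `check_layout("/path/to/dataset")` returns the entries that were found and those that were not, so a partial download (for example, metadata only) can be spotted before any analysis starts.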
Dataset Access and Use
The AI-READI dataset is designed to support a wide range of research and development efforts. Some parts of the dataset are publicly available to support broad collaboration and open science, while others require controlled access due to the sensitivity of the data. Researchers interested in accessing the full dataset must adhere to specific data-sharing protocols and licensing agreements to ensure the privacy and integrity of participant information.
Current Research Focus
At present, the AI-READI dataset is being used for:
- Predictive Modeling: Developing machine learning models to predict the likelihood of developing T2DM based on various health indicators.
- Salutogenesis Analysis: Investigating factors that contribute to reversing or preventing the progression of T2DM.
- Multimodal Data Integration: Merging clinical, environmental, and wearable data to create more comprehensive models of diabetes progression.
- Longitudinal Studies: Examining the long-term effects of various health interventions on people with T2DM.
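To make the multimodal integration item above concrete, here is a minimal sketch of one common pattern: summarizing wearable glucose readings per participant and attaching the summary to each clinical record. The field names `participant_id`, `glucose_mg_dl`, and `hba1c` are hypothetical placeholders, not the actual AI-READI schema, and the function `merge_by_participant` is illustrative only.

```python
from collections import defaultdict
from statistics import mean


def merge_by_participant(clinical_rows, glucose_rows):
    """Attach per-participant glucose summaries to clinical records.

    Both inputs are lists of dicts. Field names are hypothetical.
    Participants without glucose readings get None as their mean.
    """
    # Group all glucose readings by participant.
    readings = defaultdict(list)
    for row in glucose_rows:
        readings[row["participant_id"]].append(float(row["glucose_mg_dl"]))

    merged = []
    for row in clinical_rows:
        values = readings.get(row["participant_id"], [])
        merged.append({
            **row,
            "mean_glucose_mg_dl": mean(values) if values else None,
            "n_glucose_readings": len(values),
        })
    return merged


# Toy records for illustration only; these are not AI-READI data.
clinical = [
    {"participant_id": "p1", "hba1c": 6.1},
    {"participant_id": "p2", "hba1c": 5.4},
]
glucose = [
    {"participant_id": "p1", "glucose_mg_dl": "100"},
    {"participant_id": "p1", "glucose_mg_dl": "120"},
]
combined = merge_by_participant(clinical, glucose)
```

Keeping every clinical record, even when no wearable data exists for that participant, makes missing modalities explicit rather than silently dropping participants.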
Future Directions
The AI-READI project continues to evolve, with the goal of expanding the dataset and broadening its scope. Future updates will include more granular environmental data, genetic data, and additional biometric measurements. This expansion will help refine predictive models and provide deeper insights into the management of T2DM.
Collaboration Opportunities
AI-READI encourages collaboration with healthcare providers, researchers, and technologists who are interested in contributing to the understanding of T2DM through AI, machine learning, and data analysis. The dataset is a valuable resource for anyone working to advance healthcare and improve outcomes for those living with or at risk for T2DM.
Conclusion
The AI-READI dataset is an invaluable resource for anyone looking to advance research in Type 2 Diabetes Mellitus. With its comprehensive and diverse data types, it offers a unique opportunity for AI-driven research that could lead to breakthroughs in predicting, managing, and even reversing diabetes. The dataset is continuously being updated to reflect the latest scientific findings, ensuring it remains a relevant and impactful tool for researchers working in the fields of diabetes, healthcare, and AI.