COVID-19_RISK_PREDICTOR
!!! For research purposes only !!!
Aim: To develop a model that uses demographics, lifestyle, and symptoms/conditions to predict patients' risk of COVID-19 infection.
Data availability
Data was made available through the UAB Biomedical Research Information Technology Enhancement (U-BRITE) framework. Access to the level-2 i2b2 data was granted through self-service, pursuant to an IRB exemption.
Directory structure used to parse data from positive and negative cohorts
The dataset was transformed to adhere to the OMOP Common Data Model Version 5.3.1 to enable systematic analyses of EHR data from disparate sources.
Cohorts/
├── positive <--- positive cohort directory
│ ├── measurement.csv - test and results
│ ├── condition_occurance.csv - conditions of patients
│ ├── observation.csv - things like smoking history
│ └── person.csv - demographic information
├── negative <--- negative cohort directory
│ ├── measurement.csv - test and results
│ ├── condition_occurance.csv - conditions of patients
│ ├── observation.csv - things like smoking history
│ └── person.csv - demographic information
└── README.md
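Before running the parser, it can help to confirm that both cohort directories contain the four CSV files shown in the tree above. The following is a minimal sketch using only the Python standard library. The helper name `missing_cohort_files` is hypothetical and not part of this repository; the file names are copied verbatim from the tree.

```python
from pathlib import Path

# File names copied verbatim from the directory tree in this README.
EXPECTED_FILES = (
    "measurement.csv",
    "condition_occurance.csv",
    "observation.csv",
    "person.csv",
)


def missing_cohort_files(cohorts_root):
    """Return a sorted list of expected CSV paths that are absent.

    Hypothetical helper for illustration only; it is not part of the
    repository and does not inspect file contents.
    """
    root = Path(cohorts_root)
    missing = []
    for cohort in ("positive", "negative"):
        for name in EXPECTED_FILES:
            path = root / cohort / name
            if not path.is_file():
                missing.append(str(path))
    return sorted(missing)
```

Calling `missing_cohort_files("Cohorts")` from the repository root returns an empty list when the layout is complete, and otherwise lists each missing file.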
Usage
Installation
Installation simply requires fetching the source code. The following is required:
- Git
To fetch the source code, change into a directory of your choice and run:
git clone -b master \
git@gitlab.rc.uab.edu:center-for-computational-genomics-and-data-science/sciops/covid-19_risk_predictor.git
Requirements
OS:
Currently works only on Linux. A Docker version may need to be explored later to make it usable on macOS (and potentially Windows).
Tools:
- Anaconda3
- Tested with version: 2020.02
Activate conda environment
Change into the repository root directory and run the commands below:
# create conda environment. Needed only the first time.
conda env create --file configs/environment.yaml
# if you need to update existing environment
conda env update --file configs/environment.yaml
# activate conda environment
conda activate rico
Run parser
python src/filter_dataset.py --pos Cohorts/positive/ --neg Cohorts/negative/
For help, use the -h argument:
python src/filter_dataset.py -h
Parsed files are saved in the ./results directory.
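To confirm that the parser produced usable output before training, one option is to check the row count and column names of a results file. This is a minimal sketch using only the standard library. `summarize_csv` is a hypothetical helper, not part of the repository, and it assumes the file is a comma-separated table with a single header row.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (number_of_data_rows, column_names) for a CSV with a header row.

    Hypothetical helper for illustration only; it is not part of the
    repository.
    """
    with Path(path).open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        n_rows = sum(1 for _ in reader)  # count remaining (data) rows
    return n_rows, header
```

For example, `summarize_csv("results/encoded-100-week-filter.csv")` inspects the file that this README passes to the training step. Whether the parser writes exactly that file name with default arguments is an assumption.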
Run model training
python src/Model.py --input results/encoded-100-week-filter.csv
Output files are saved in the ./results directory.
Build Streamlit app
To demonstrate how these models can be applied, one of the four was chosen and a sample Streamlit app was created and included in the project. Please refer to src/streamlit/RICO.py.
Note - This Streamlit app demonstrates one of the models. It is not required by the pipeline and serves only to display the calculation and its interpretation. The questionnaire from the models can be used manually without it. The Streamlit app is therefore not tested and should be used at your own risk, for demo purposes or as a guide for building on this work.
Unit Testing
To test the functions in filter_dataset.py, use the command below:
python -m unittest -v testing/unit_test.py
To measure test coverage, use the commands below:
# test the coverage
coverage run -m unittest -v testing/unit_test.py
# To get a coverage report
coverage report
# To get annotated HTML listings
coverage html
Note - Functions in Model.py are adapted from this GitHub repo, where unit testing was already implemented.
Contact information
For issues, please send an email with a clear description to:
Tarun Mamidi - tmamidi@uab.edu
Ryan Melvin - rmelvin@uabmc.edu