COVID-19_RISK_PREDICTOR

!!! For research purposes only !!!

Aim: To develop a model that takes in demographics, lifestyle, and symptoms/conditions to predict a patient's risk of COVID-19 infection.

Data availability

Data were made available through the UAB Biomedical Research Information Technology Enhancement (U-BRITE) framework. Access to the level-2 i2b2 data was granted through self-service under an IRB exemption.

Directory structure used to parse data from positive and negative cohorts

The dataset was transformed to adhere to the OMOP Common Data Model version 5.3.1 to enable systematic analyses of EHR data from disparate sources.

Cohorts/
├── positive               <--- positive cohort directory
│   ├── measurement.csv - test and results
│   ├── condition_occurance.csv - conditions of patients
│   ├── observation.csv - things like smoking history
│   └── person.csv - demographic information
├── negative                <--- negative cohort directory
│   ├── measurement.csv - test and results
│   ├── condition_occurance.csv - conditions of patients
│   ├── observation.csv - things like smoking history
│   └── person.csv - demographic information
└── README.md
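
Before running the parser, it can help to confirm that both cohort directories contain the four expected files. The following is a minimal sketch and is not part of the repository: the helper name `missing_cohort_files` is hypothetical, and the file names are copied from the tree above.

```python
from pathlib import Path

# File names exactly as listed in the directory tree above.
EXPECTED_FILES = (
    "measurement.csv",
    "condition_occurance.csv",
    "observation.csv",
    "person.csv",
)


def missing_cohort_files(cohorts_root):
    """Return a sorted list of expected files that are absent.

    Hypothetical helper for illustration; entries look like
    "negative/person.csv".
    """
    root = Path(cohorts_root)
    missing = []
    for cohort in ("positive", "negative"):
        for name in EXPECTED_FILES:
            if not (root / cohort / name).is_file():
                missing.append(f"{cohort}/{name}")
    return sorted(missing)
```

Calling `missing_cohort_files("Cohorts")` from the repository root returns an empty list when the layout is complete.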

Usage

Installation

Installation simply requires fetching the source code. The following is required:

  • Git

To fetch the source code, change into a directory of your choice and run:

git clone -b master \
    git@gitlab.rc.uab.edu:center-for-computational-genomics-and-data-science/sciops/covid-19_risk_predictor.git

Requirements

OS:

Currently works only on Linux. Docker support may be explored later to make it usable on macOS (and potentially Windows).

Tools:

  • Anaconda3
    • Tested with version: 2020.02

Activate conda environment

Change into the repository root directory and run the commands below:

# create conda environment. Needed only the first time.
conda env create --file configs/environment.yaml

# if you need to update existing environment
conda env update --file configs/environment.yaml

# activate conda environment
conda activate rico

Run parser

python src/filter_dataset.py --pos Cohorts/positive/ --neg Cohorts/negative/

For help, use the -h argument:

python src/filter_dataset.py -h

Parsed files are saved in the ./results directory.
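
The parser command above takes one directory per cohort. As an illustration only (this is not the code in src/filter_dataset.py), a command-line parser accepting the same two flags could be sketched with `argparse`. The description, the help strings, and the choice to make both flags mandatory are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented --pos/--neg flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Filter positive and negative cohort data."
    )
    # Assumption: both flags are mandatory; the real script may differ.
    parser.add_argument("--pos", required=True, help="positive cohort directory")
    parser.add_argument("--neg", required=True, help="negative cohort directory")
    return parser


# Parse the same arguments used in the command shown above.
args = build_parser().parse_args(
    ["--pos", "Cohorts/positive/", "--neg", "Cohorts/negative/"]
)
print(args.pos, args.neg)
```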

Run model training

python src/Model.py --input results/encoded-100-week-filter.csv

Output files are saved in the ./results directory.

Build Streamlit app

To demonstrate the application of these models, one of the four was chosen and a sample Streamlit app was created and included in the project. Please refer to src/streamlit/RICO.py.

Note - This Streamlit app demonstrates one of the models and is not required by the pipeline; it only displays the calculation and its interpretation. The questionnaire derived from the models can be used manually without it. The Streamlit app is therefore untested and should be used at your own risk, for demo purposes or as a guide for building on this work.

Unit Testing

To test the functions in filter_dataset.py, use the command below:

python -m unittest -v testing/unit_test.py

To measure test coverage, use the commands below:

# Run the tests under coverage
coverage run -m unittest -v testing/unit_test.py

# To get a coverage report
coverage report

# To get annotated HTML listings
coverage html
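
For readers adding tests, the sketch below shows the general shape of a `unittest` test case that the commands above can run. The function `drop_empty_rows` is a hypothetical stand-in written for this example; it is not a function from filter_dataset.py.

```python
import unittest


def drop_empty_rows(rows):
    """Hypothetical stand-in for a filtering function: drop rows with no values."""
    return [
        row for row in rows
        if any(value not in ("", None) for value in row)
    ]


class TestDropEmptyRows(unittest.TestCase):
    def test_removes_rows_without_values(self):
        # The middle row holds only empty values and should be dropped.
        rows = [["1", "M"], ["", None], ["2", ""]]
        self.assertEqual(drop_empty_rows(rows), [["1", "M"], ["2", ""]])

    def test_empty_input(self):
        self.assertEqual(drop_empty_rows([]), [])
```

Saved in a test module, these cases would be picked up by `python -m unittest -v` in the same way as the existing tests.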

Note - Functions in Model.py are adapted from this GitHub repo, which already implements unit tests for them.

Contact information

For issues, please send an email with a clear description to:

Tarun Mamidi - tmamidi@uab.edu

Ryan Melvin - rmelvin@uabmc.edu