UAB Compute Cluster a.k.a. Cheaha
Overview
Cheaha is a large, multi-unit computational system for running massively parallel compute tasks. It is managed by the UAB Research Computing Group.
Cheaha is currently the fastest supercomputer in the state of Alabama, with a theoretical throughput of approximately 450 TFlop/s (HUGE COMPUTE!). It consists of over 3000 CPU cores and 72 NVIDIA P100 GPUs, backed by a high-speed parallel filesystem (GPFS) that can store 6 PB non-redundantly and 4 PB redundantly (with more to come!), all interconnected by a high-speed InfiniBand network. UAB researchers use Cheaha for a wide variety of research, such as genomics, neuroimaging, machine learning, statistical genetics, and cancer detection.
Use of this resource is governed by the UAB Acceptable Use Policy for Computer and Network Resources.
For more information on Cheaha and the tools available to support research please review the documentation: http://docs.uabgrid.uab.edu/wiki/Cheaha
Access
To get set up with cluster access, you'll need your BlazerID and will need to send an email to the cluster support group (support@listserv.uab.edu).
You can use the template email below, filled in with your information, to make this request.
Hello!
My name is __YOUR_NAME__ and I’m a __TITLE__ in Dr. Liz Worthey’s lab.
I’d like to request access to the cluster for our Genomics, Genetics and Data
Science research. In particular I will be doing data analysis, pipeline
development, and genomics research using the compute resources of the cluster.
Sincerely,
YOUR NAME
TITLE
Dr. Liz Worthey’s Lab
Center For Computational Genomics and Data Science
Storage Spaces
- Scratch Space
- 1 TB of fast storage (i.e. close to the compute for super fast input/output)
- Home Space
- 50 GB of fast-ish storage for small data, scripts, small analyses, etc.
- User Data Directory
- 20 TB of fast-ish storage for larger data needs
- Lab/Project Space
- 50 - 100 TB per lab of fast-ish storage for project level data and analysis
- Commodity Storage (coming soon!)
- ??? TB of slower storage but HUGE for bigger datasets
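Once your account is set up, you can check how much of each space you are using. The sketch below is only illustrative: the mount points ($HOME, /scratch/$USER, /data/user/$USER) are assumptions based on common cluster layouts, not confirmed Cheaha paths, so verify them against the Research Computing documentation first.
du -sh $HOME                # home space usage
du -sh /scratch/$USER       # scratch space usage (assumed path)
du -sh /data/user/$USER     # user data directory usage (assumed path)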
Submitting Jobs
You can SSH into the cluster via
ssh BLAZERID@cheaha.rc.uab.edu
The cluster uses the Slurm queue management system (short for Simple Linux Utility for Resource Management) for scheduling, distributing, and managing compute "jobs". A "job" is just a general term for a specific task, or set of tasks (specified in a script), run on the cluster's compute resources.
For a complete description of, and tutorial on, writing and executing jobs on the cluster, see Research Computing's helpful guide on Slurm and executing compute tasks on the cluster. You can also check out the tutorial below for a quick, high-level view of the cluster.
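As a quick, hedged illustration (not an official Research Computing template), a minimal Slurm batch script might look like the following. The job name, partition name, and resource numbers here are placeholder assumptions; check the Research Computing guide linked above for the partitions and limits that actually apply to your account.
#!/bin/bash
#SBATCH --job-name=example_job      # placeholder job name
#SBATCH --partition=express         # assumed partition name; run sinfo to see what exists
#SBATCH --ntasks=1                  # one task
#SBATCH --cpus-per-task=4           # CPU cores for that task
#SBATCH --mem=8G                    # memory request
#SBATCH --time=01:00:00             # wall-clock limit (HH:MM:SS)
#SBATCH --output=%x_%j.out          # log file named from job name (%x) and job ID (%j)

echo "Running on $(hostname)"
Save this as example_job.sh, submit it with sbatch example_job.sh, and monitor it with squeue -u $USER.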
Python on the Cluster
Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment. Package versions are managed by the conda package manager. CDGS plans to use conda on the cluster for multiple projects involving Python.
Conda Shortcuts for the Cluster
- Listing the available Anaconda modules on the cluster (load one with module load before using conda; see the workflow sketch after this list)
module avail Anaconda
- Creating a new conda environment
conda create --name test_env
Packages can be included in the new environment with a similar command:
conda create --name test_env PACKAGE_NAME
- Listing the available virtual environments
conda env list
The virtual environment with the asterisk (*) next to it is the one that's currently active.
- Activating a conda virtual environment (newer conda releases also support conda activate test_env)
source activate test_env
- Deactivating a virtual environment
source deactivate
- Exporting a conda virtual environment to share
conda env export -n test_env > environment.yml
- Creating a conda virtual environment from environment.yml
conda env create -f environment.yml -n test_env
- Deleting a conda virtual environment
conda remove --name test_env --all
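Putting the shortcuts above together, a minimal end-to-end sketch might look like the following. The module name (Anaconda3), the Python version, and the package choice are assumptions for illustration; run module avail Anaconda on the cluster to see what is actually installed.
# Load an Anaconda module (exact module name is an assumption; check module avail Anaconda)
module load Anaconda3

# Create and activate a throwaway environment with an assumed Python version
conda create --name test_env python=3.8 --yes
source activate test_env

# Install a package, then capture the environment so a labmate can recreate it
conda install numpy --yes
conda env export -n test_env > environment.yml

# Clean up when finished
source deactivate
conda remove --name test_env --all --yes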
For a complete tutorial and the most up-to-date version, please use the tutorial from UAB Research Computing's Anaconda wiki.