As a Cheaha user who needs software installed as a module, what is the procedure?
Problem
There is no well-defined procedure for researchers to request that software be installed directly on Cheaha as a module.
What happens today?
- Researcher sends in a support request for software as a module
- Facilitation follows a rough "triage" plan to determine whether the software can be self-managed. If so, they provide a solution (conda, self-install in /home/bin/, self-build, etc.). This filters out most requests.
- Everything else makes it to me (William), and I work with Louis to find a solution or build a module.
- We build the module ad hoc, and record notes in https://gitlab.rc.uab.edu/rc/cluster-software/-/blob/master/future-ci-cd-notes.md
What would we like to see?
- Researcher fills out a form
- Form becomes a support request
- Facilitation triage
- Identify whether a module is warranted. If not, report the solution to the researcher and record it for future triage.
- If so, create a repository to automate module build using CI/CD.
Vision
Facilitation has constructed a solution for "Community Containers". If a piece of software should be self-installable, but isn't due to out-of-date dependencies on Cheaha (glibc mostly), we build a containerized solution.
We have a GitLab project group called "Community Containers". Each piece of software has one repo within the group, with CI/CD that builds a container image and stores it in the GitLab container registry. The group has a template/example repository as a form of "how-to" guide. Each repo is expected to have documentation supporting the build process for the repo's image. The researcher-facing (RF) "how-to" is (or will be) in the RF docs at https://docs.rc.uab.edu.
- What is documented and where:
  - How do I use a community container? RF Docs
  - Where can I find community containers? RF Docs
  - What community containers are available? RF Docs (and the group's repo listing in GitLab)
  - How do I create a community container? Pointer in RF Docs, content in GitLab
  - How do I maintain a particular community container? Content in each repo in GitLab
I want to mirror this solution for module construction, with a few changes.
- "Module Builds" project group in GitLab (or another name)
- Each module definition is in one repo in the group.
- Each repo has a modulefile.
- Each repo has CI/CD that does the following:
  - Builds or obtains the relevant software artifacts
    - If a container works best, make a community container first, and pull the image.
    - Write a transparent script that maps commands to the container (how feasible is this?)
  - Runs post-build verification (tests, etc.)
  - Pushes the modulefile and artifacts to a module directory on Cheaha as the build user (or another user if necessary)
  - Sets RC-only access control restrictions
  - Runs prod environment verification
    - Failure creates a report. The module remains inaccessible and unpublished. (We may need to change the modulefile name so it doesn't get picked up by the automatic cache rebuild.)
    - Success removes the "RC-only" permissions, sets software-specific access controls and permissions where required, and rebuilds the module cache, effectively publishing the module.
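The stage, verify, then publish gate in the CI/CD steps above could look roughly like the following Python sketch. Everything named here is an assumption for illustration: the `.unpublished` suffix, the use of plain mode bits to stand in for "RC-only" access, and the `verify` and `rebuild_cache` hooks are hypothetical placeholders for whatever the real pipeline and module system provide. Software-specific access controls are left out.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable

# Assumption: "RC-only" is approximated with owner+group mode bits; the real
# restriction could equally be an ACL or a dedicated group.
RC_ONLY_MODE = 0o640
PUBLIC_MODE = 0o644
# Hypothetical suffix that keeps a staged modulefile under a non-final name.
STAGED_SUFFIX = ".unpublished"


def stage_modulefile(module_dir: Path, modulefile_name: str, text: str) -> Path:
    """Write the modulefile under a staged name with restricted permissions."""
    module_dir.mkdir(parents=True, exist_ok=True)
    staged = module_dir / (modulefile_name + STAGED_SUFFIX)
    staged.write_text(text)
    os.chmod(staged, RC_ONLY_MODE)
    return staged


def publish_if_verified(
    staged: Path,
    verify: Callable[[Path], bool],
    rebuild_cache: Callable[[], None],
) -> bool:
    """Publish the staged modulefile only if prod verification passes."""
    if not verify(staged):
        # Failure: leave the file staged and restricted, and write a report.
        report = staged.with_name(staged.name + ".report")
        report.write_text(f"prod verification failed for {staged.name}\n")
        return False
    # Success: rename to the final name, open permissions, rebuild the cache.
    final = staged.with_name(staged.name[: -len(STAGED_SUFFIX)])
    staged.rename(final)
    os.chmod(final, PUBLIC_MODE)
    rebuild_cache()
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        staged = stage_modulefile(
            Path(root) / "exampletool", "1.0", "placeholder modulefile\n"
        )
        ok = publish_if_verified(
            staged,
            verify=lambda path: path.stat().st_size > 0,  # stand-in check
            rebuild_cache=lambda: None,  # stand-in for the real cache rebuild
        )
        print("published" if ok else "held back")
```

Restricting the staged file rather than its parent directory is deliberate in this sketch: tightening the directory would also hide versions of the same module that are already published.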
There should be a template repository for future reference. Each repo should have a README supporting maintenance of the relevant software, including peculiarities specific to the software. RF docs require minimal change, because understanding of the module system is relatively mature.
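On the open question of how feasible a transparent script mapping commands to the container is, one possible approach is to generate a small shell wrapper per command and have the modulefile put the wrapper directory on `PATH`. The sketch below assumes a Singularity-style runtime invoked as `singularity exec <image> <command> [args]`. The image path, command names, and function names are hypothetical. It does not address bind mounts, GPU flags, or environment passthrough, which is where most of the real difficulty is likely to be.

```python
import shlex
import tempfile
from pathlib import Path
from typing import List, Sequence


def render_wrapper(image: Path, command: str, runtime: str = "singularity") -> str:
    """Render a POSIX shell wrapper that forwards all arguments into the container."""
    return (
        "#!/bin/sh\n"
        f"exec {shlex.quote(runtime)} exec {shlex.quote(str(image))} "
        f'{shlex.quote(command)} "$@"\n'
    )


def write_wrappers(bin_dir: Path, image: Path, commands: Sequence[str]) -> List[Path]:
    """Write one executable wrapper per command into bin_dir."""
    bin_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for command in commands:
        path = bin_dir / command
        path.write_text(render_wrapper(image, command))
        path.chmod(0o755)  # wrappers must be executable to act like the real tool
        written.append(path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical image location and command names, for illustration only.
        image = Path("/path/to/exampletool.sif")
        for wrapper in write_wrappers(Path(root) / "bin", image, ["exampletool"]):
            print(wrapper.read_text(), end="")
```

With wrappers like these, loading the module would only need to prepend the wrapper directory to `PATH`, so researchers run the command by its usual name without knowing a container is involved.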