
ENCODE HiC Installation and Setup

    Authored by Matthew K Defenderfer

    Referencing ticket RITM0714040 for Manuel Rosa-Garrido. Manuel needs to run an altered version of the HiC pipeline on Cheaha. Prior debugging on the standard HiC pipeline was not fruitful, so a workstation was used instead. The workstation isn't sufficient for the current analysis, so we needed to get this version of the pipeline up and running on Cheaha. This snippet describes the setup steps for running ENCODE's HiC pipeline on Cheaha.

    env.yml
    name: encode
    channels:
      - conda-forge
      - defaults
    dependencies:
      - _libgcc_mutex=0.1=main
      - _openmp_mutex=5.1=1_gnu
      - alsa-lib=1.2.3.2=h166bdaf_0
      - bzip2=1.0.8=h5eee18b_6
      - ca-certificates=2024.7.2=h06a4308_0
      - cairo=1.16.0=hb05425b_5
      - expat=2.6.2=h6a678d5_0
      - fontconfig=2.14.1=h4c34cd2_2
      - freetype=2.12.1=h4a9f257_0
      - giflib=5.2.1=h5eee18b_3
      - glib=2.78.4=h6a678d5_0
      - glib-tools=2.78.4=h6a678d5_0
      - graphite2=1.3.14=h295c915_1
      - harfbuzz=2.8.1=h6f93f22_0
      - icu=58.2=he6710b0_3
      - jpeg=9e=h5eee18b_1
      - lcms2=2.12=h3be6417_0
      - ld_impl_linux-64=2.38=h1181459_1
      - lerc=3.0=h295c915_0
      - libdeflate=1.17=h5eee18b_1
      - libffi=3.4.4=h6a678d5_1
      - libgcc-ng=11.2.0=h1234567_1
      - libglib=2.78.4=hdc74915_0
      - libgomp=11.2.0=h1234567_1
      - libiconv=1.16=h5eee18b_3
      - libpng=1.6.39=h5eee18b_0
      - libstdcxx-ng=11.2.0=h1234567_1
      - libtiff=4.5.1=h6a678d5_0
      - libuuid=1.41.5=h5eee18b_0
      - libwebp-base=1.3.2=h5eee18b_0
      - libxcb=1.15=h7f8727e_0
      - libxml2=2.10.4=hcbfbd50_0
      - lz4-c=1.9.4=h6a678d5_1
      - ncurses=6.4=h6a678d5_0
      - openjdk=11.0.9.1=h5cc2fde_1
      - openssl=3.0.14=h5eee18b_0
      - pcre2=10.42=hebb0a14_1
      - pip=24.0=py311h06a4308_0
      - pixman=0.40.0=h7f8727e_1
      - python=3.11.9=h955ad1f_0
      - readline=8.2=h5eee18b_0
      - setuptools=69.5.1=py311h06a4308_0
      - sqlite=3.45.3=h5eee18b_0
      - tk=8.6.14=h39e8969_0
      - wheel=0.43.0=py311h06a4308_0
      - xorg-fixesproto=5.0=h7f98852_1002
      - xorg-inputproto=2.3.2=h7f98852_1002
      - xorg-kbproto=1.0.7=h7f98852_1002
      - xorg-libx11=1.7.2=h7f98852_0
      - xorg-libxext=1.3.4=h7f98852_1
      - xorg-libxfixes=5.0.3=h7f98852_1004
      - xorg-libxi=1.7.10=h7f98852_0
      - xorg-libxrender=0.9.10=h7f98852_1003
      - xorg-libxtst=1.2.3=h7f98852_1002
      - xorg-recordproto=1.14.2=h7f98852_1002
      - xorg-renderproto=0.11.1=h7f98852_1002
      - xorg-xextproto=7.3.0=h7f98852_1002
      - xorg-xproto=7.0.31=h27cfd23_1007
      - xz=5.4.6=h5eee18b_1
      - zlib=1.2.13=h5eee18b_1
      - zstd=1.5.5=hc292b87_2
      - pip:
          - argcomplete==3.4.0
          - autouri==0.4.4
          - awscli==1.33.26
          - boto3==1.34.144
          - botocore==1.34.144
          - bullet==2.2.0
          - cachetools==5.4.0
          - caper==2.3.2
          - certifi==2024.7.4
          - cffi==1.16.0
          - charset-normalizer==3.3.2
          - colorama==0.4.6
          - coloredlogs==15.0.1
          - contourpy==1.2.1
          - cryptography==42.0.8
          - cycler==0.12.1
          - dateparser==1.2.0
          - docker==7.1.0
          - docutils==0.16
          - filelock==3.15.4
          - fonttools==4.53.1
          - google-api-core==2.19.1
          - google-auth==2.32.0
          - google-cloud-core==2.4.1
          - google-cloud-storage==2.17.0
          - google-crc32c==1.5.0
          - google-resumable-media==2.7.1
          - googleapis-common-protos==1.63.2
          - humanfriendly==10.0
          - idna==3.7
          - importlib-metadata==8.0.0
          - jmespath==1.0.1
          - joblib==1.4.2
          - kiwisolver==1.4.5
          - lark==1.1.9
          - matplotlib==3.9.1
          - miniwdl==1.12.0
          - ntplib==0.4.0
          - numpy==2.0.0
          - packaging==24.1
          - pandas==2.2.2
          - pillow==10.4.0
          - proto-plus==1.24.0
          - protobuf==5.27.2
          - psutil==5.9.8
          - pyasn1==0.6.0
          - pyasn1-modules==0.4.0
          - pycparser==2.22
          - pygtail==0.14.0
          - pyhocon==0.3.61
          - pyopenssl==24.1.0
          - pyparsing==3.1.2
          - python-dateutil==2.9.0.post0
          - python-json-logger==2.0.7
          - pytz==2024.1
          - pyyaml==6.0.1
          - regex==2024.5.15
          - requests==2.32.3
          - rsa==4.7.2
          - s3transfer==0.10.2
          - scikit-learn==1.5.1
          - scipy==1.14.0
          - six==1.16.0
          - threadpoolctl==3.5.0
          - tzdata==2024.1
          - tzlocal==5.2
          - urllib3==2.2.2
          - xdg==6.0.0
          - zipp==3.19.2
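    Before creating the environment, it can be worth sanity-checking the pip pins in the file (for example, that caper is pinned to 2.3.2). A minimal sketch, using no third-party YAML parser and shown here on a small excerpt of the env.yml above; point text at the real file contents instead:

```python
# Sketch: extract the pip pins from an env.yml without needing PyYAML.
# `text` is an excerpt of the env.yml above; read the real file in practice.
text = """\
dependencies:
  - python=3.11.9=h955ad1f_0
  - pip=24.0=py311h06a4308_0
  - pip:
      - caper==2.3.2
      - miniwdl==1.12.0
"""

pip_pins = {}
in_pip = False
for line in text.splitlines():
    stripped = line.strip()
    if stripped == "- pip:":
        in_pip = True          # everything after this is a pip dependency
        continue
    if in_pip and stripped.startswith("- ") and "==" in stripped:
        name, _, version = stripped[2:].partition("==")
        pip_pins[name] = version

print(pip_pins)  # {'caper': '2.3.2', 'miniwdl': '1.12.0'}
```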

    Clone HiC Repo

    Cloning the repository may not be strictly necessary, but running the included tests is a nice way to get a feel for the pipeline's inputs and outputs.

    # if using an SSH key with GitHub
    git clone git@github.com:ENCODE-DCC/hic-pipeline.git

    # or over HTTPS
    git clone https://github.com/ENCODE-DCC/hic-pipeline.git

    Environment Setup

    The HiC pipeline is controlled by the Caper job manager. Caper is available on PyPI and only requires an additional installation of the Java Development Kit (JDK) to function. Java can be installed via conda for convenience, or the Java module can be loaded on Cheaha. The following commands create an encode conda environment from the env.yml included in this snippet.

    module load Anaconda3
    conda env create -n encode -f env.yml

    After the environment is created, you only need to activate it to run caper, both now and in the future.

    conda activate encode

    Caper Configuration

    Some initial configuration is needed for Caper to run correctly. First, activate the conda environment using the command above. Then run the following:

    caper init slurm

    This will set up some configuration files for Caper. You will need to edit the $HOME/.caper/default.conf file to contain the following:

    backend=slurm
    slurm-partition=amd-hdr100,long
    slurm-leader-job-resource-param=-t 150:00:00 --mem 8G
    local-loc-dir=/scratch/<username>/caper_cache
    cromwell=/home/<username>/.caper/cromwell_jar/cromwell-82.jar
    womtool=/home/<username>/.caper/womtool_jar/womtool-82.jar

    Replace <username> with your username before running Caper. Caper does not appear to expand shell environment variables in these paths, so using $USER_SCRATCH or $USER does not work.
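    Since the variables have to be expanded by hand, one option is to render the config with a short script. A minimal sketch that fills in the username and prints the result; writing it to $HOME/.caper/default.conf is left as the commented-out last step:

```python
# Sketch: render the Caper default.conf from the template above, substituting
# the current username, since Caper does not expand $USER itself.
import getpass
import pathlib

username = getpass.getuser()
conf = f"""backend=slurm
slurm-partition=amd-hdr100,long
slurm-leader-job-resource-param=-t 150:00:00 --mem 8G
local-loc-dir=/scratch/{username}/caper_cache
cromwell=/home/{username}/.caper/cromwell_jar/cromwell-82.jar
womtool=/home/{username}/.caper/womtool_jar/womtool-82.jar
"""

print(conf)
# To apply it for real, uncomment:
# pathlib.Path.home().joinpath(".caper", "default.conf").write_text(conf)
```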

    This default configuration submits all jobs to the amd-hdr100 or long partitions. All of the nodes in those partitions have enough resources to run each job in the test pipeline, but the partition choice may need to change based on the analysis.

    Running Caper Tests

    Change directory to wherever you would like the outputs to be saved. This section assumes you have cloned the pipeline repository, are in the top-level directory of the repo, and are running the general HiC test.

    caper hpc submit hic.wdl -i tests/functional/json/test_hic.json --singularity --leader-job-name test_encode_hic

    This will submit a leader job that manages the other jobs in the pipeline. You can monitor the status of the child jobs using squeue -u $USER.
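    For repeated checks, the monitoring commands can be wrapped in a small helper. A sketch assuming a SLURM cluster like Cheaha; the function name watch_hic is hypothetical, and the job name matches the --leader-job-name flag used at submit time:

```shell
# Sketch: convenience helper for checking on the pipeline (assumes squeue is
# on PATH, as on Cheaha). watch_hic is a hypothetical name for this note.
watch_hic() {
    # leader job only, matched by the name given to --leader-job-name
    squeue -u "$USER" --name test_encode_hic
    # leader plus all child jobs it has spawned
    squeue -u "$USER"
}
```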
