cgpPindel contains the Cancer Genome Projects workflow for Pindel.

This is a lightly modified version of pindel v2.0 with CGP-specific processing for:

  • Input file generation
  • Conversion from pindel text output to:
    • tumour and normal BAM alignment files
    • VCF
  • Application of VCF filters.

Details of execution and referencing can be found in the wiki


Docker, Singularity and Dockstore

There are pre-built images containing this codebase on quay.io. When pulling an image you must specify the version; there is no latest tag.

  • cgpPindel quay.io: Contained within this repository
    • Smallest build required to use cgpPindel
    • Not linked to Dockstore (yet)
    • Updated most frequently
  • dockstore-cgpwxs: Contains tools specific to WXS analysis.
  • dockstore-cgpwgs: Contains additional tools for WGS analysis.

These were primarily designed for use with dockstore.org but can be used as normal containers.

The docker images are known to work correctly after import into a singularity image.
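
Because there is no latest tag, every pull has to name an explicit version. The sketch below shows one way to guard against a missing tag. The repository path in CGPPINDEL_REPO and the helper name image_ref are assumptions made for illustration, so confirm the real path and available versions on quay.io before use.

```shell
# Hypothetical repository path; confirm the real one on quay.io.
CGPPINDEL_REPO="quay.io/wtsicgp/cgppindel"

# Print a fully qualified image reference, refusing a missing or 'latest' tag.
image_ref() {
  version="$1"
  if [ -z "$version" ] || [ "$version" = "latest" ]; then
    echo "an explicit version is required (there is no 'latest' tag)" >&2
    return 1
  fi
  printf '%s:%s\n' "$CGPPINDEL_REPO" "$version"
}

# Usage (X.Y.Z is a placeholder for a released version):
#   docker pull "$(image_ref X.Y.Z)"
#   singularity pull "docker://$(image_ref X.Y.Z)"
```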


When doing a native install please install the following first:

Please see these for any child dependencies.

Once complete please run:

./setup.sh /some/install/location

Please use setup.sh to install any other dependencies. Setting the environment variable CGP_PERLLIBS allows you to append to PERL5LIB during install. Without this all dependencies are installed into the target area.

Please be aware that this expects basic C compilation libraries and tools to be available.
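
A small pre-flight check can catch a missing compiler before setup.sh starts building. This is only a sketch: the helper missing_tools is hypothetical, and the tool names in the usage note (gcc, make) are an assumption about what "basic C compilation tools" means on a typical Linux host, not a list taken from setup.sh.

```shell
# Print each named tool that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage: only start the install when the assumed toolchain is present.
#   [ -z "$(missing_tools gcc make)" ] && ./setup.sh /some/install/location
#
# To reuse Perl libraries that are already installed, set CGP_PERLLIBS first
# (the path is a placeholder):
#   export CGP_PERLLIBS=/path/to/existing/perl5/lib
```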


Initial Nextflow bindings for cgpPindel.

Setup personal nextflow

If you don't have a central nextflow install this will get you running with a limited environment:

# seems silly but a python venv is a nice way to handle this during dev
python3 -m venv .venv
source .venv/bin/activate
# compute head nodes may need you to limit Java accessing all memory
export NXF_OPTS="-Xms500M -Xmx2G"
curl get.nextflow.io | bash
mv nextflow .venv/bin/.

If you have any issues installing, refer to the Nextflow documentation, not the issue tracker for this repo.


Refer to the Nextflow documentation for an explanation of profiles. The following are available:

  • Job management
    • local
      • spawned jobs use execution host
    • lsf
      • spawned jobs are submitted via bsub
  • Execution method
    • <none>
      • Expects to find programs in PATH
    • singularity
      • Provide image file via -with-singularity [singularity image]
    • docker
      • Provide image via -with-docker [docker image]

For example, to run on an LSF farm with singularity the profile would be:

... -profile lsf,singularity ...

While native install with lsf would be:

... -profile lsf ...
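
Profiles from the two groups are combined into a single comma-separated -profile value, with the execution method left out for a native install. The helper below is a hypothetical convenience wrapper, not part of cgpPindel, that prints the option for a chosen job-management profile and an optional execution-method profile.

```shell
# Build the -profile option from a job-management profile (local or lsf)
# and an optional execution-method profile (singularity or docker).
profile_opt() {
  job="$1"
  method="${2:-}"
  if [ -n "$method" ]; then
    printf '%s\n' "-profile $job,$method"
  else
    printf '%s\n' "-profile $job"
  fi
}

profile_opt lsf singularity   # prints: -profile lsf,singularity
profile_opt lsf               # prints: -profile lsf
```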

Workflow entry points

There are two top-level entry points:

... -entry pindel_pl ...
# or
... -entry np_generation ...


pindel_pl

Executes a tumour/normal paired analysis.

CPU and memory are controlled via nextflow.config. Configs are additive; see the Nextflow documentation.

For the workflow options run:

nextflow -entry pindel_pl --help


np_generation

Executes pindel with a "dummy" tumour for a file listing of input BAMs.

CPU and memory are controlled via nextflow.config. Configs are additive; see the Nextflow documentation.

For the workflow options run:

nextflow -entry np_generation --help
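
Since configs are additive, resource overrides for either entry point can live in a separate file layered on top of nextflow.config. The following is a sketch under stated assumptions: the file name resources.config and the process selector 'pindel.*' are hypothetical (check the workflow source for the real process names), and the values are illustrative. The withName selector and the -c option are standard Nextflow features.

```shell
# Write a small override file; the values here are illustrative only.
cat > resources.config <<'EOF'
process {
  // 'pindel.*' is a hypothetical selector, not taken from this repo
  withName: 'pindel.*' {
    cpus = 4
    memory = '8 GB'
  }
}
EOF

# Layer it over the defaults with Nextflow's -c option, for example:
#   nextflow -c resources.config run ... -entry pindel_pl ...
```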


The nextflow code has been implemented with DSL2 so that the workflows can be composed into larger components.

The items above can be addressed for this purpose via:

  • subwf_pindel_pl
  • subwf_np_gen


Please use pre-commit on this project. You can install to $HOME/bin via:

curl https://pre-commit.com/install-local.py | python -

In your checkout please run:

pre-commit install

Updating licence headers

Please use skywalking-eyes.

Expected workflow:

# recent build, change to apache/skywalking-eyes:0.2.0 once released
export DOCKER_IMG=ghcr.io/apache/skywalking-eyes/license-eye
  1. Check state before modifying .licenserc.yaml:
    • docker run -it --rm -v $(pwd):/github/workspace $DOCKER_IMG header check
    • You should see some files reported as 'valid' here; those without a header are reported as 'invalid'
  2. Modify .licenserc.yaml
  3. Apply the changes:
    • docker run -it --rm -v $(pwd):/github/workspace $DOCKER_IMG header fix
  4. Add/commit changes

This is executed in the CI pipeline.

DO NOT edit the header in the files; instead modify the date component of content in .licenserc.yaml. The only exception is:

  • README.md

If you need to make more extensive changes to the license, carefully test that the pattern is functional.

Code changes

This project is maintained using the HubFlow methodology.

  1. Make appropriate changes
  2. Update perl/lib/Sanger/CGP/Pindel.pm to the correct version (adding rc/beta to end if applicable).
  3. Update CHANGES.md to show major items.
  4. Commit the updated docs and updated module/version.
  5. Push commits.
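
Before pushing, it can help to confirm that the module version matches the release you intend to cut. The helper below is a hypothetical sketch, not part of this repository. It assumes the version is declared on a line of the form our $VERSION = '1.2.3'; which is a common Perl convention but has not been checked against Pindel.pm.

```shell
# Print the version string declared in a Perl module.
# Assumes a line of the form: our $VERSION = '1.2.3';
module_version() {
  sed -n "s/^our \$VERSION = '\([^']*\)';.*/\1/p" "$1" | head -n 1
}

# Usage (X.Y.Z is a placeholder for the intended release version):
#   [ "$(module_version perl/lib/Sanger/CGP/Pindel.pm)" = "X.Y.Z" ] \
#     || echo "version mismatch" >&2
```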


Regression CI

An internal CI system is used to validate each release using real, large scale datasets.

Public CI

CircleCI is used to:

  • Build Docker image (unit tests are part of build)
  • Validate expected tools exist
  • For tags only: push image to quay.io

CI only runs for:

  • Branches with pull-requests
  • Default branch (dev)
  • Tags

Cutting the release

Internal regression CI processes must be completed prior to this.

  1. Check state on CircleCI
  2. Generate the release (add notes to GitHub)
  3. Confirm that image has been pushed to quay.io


