Installation

This page shows how to install workflow-manager as a Python package in your Python environment. Once installed, you can use it by adding import workflow_manager to your code.

Prerequisites

Please install the following system packages

  • mongodb-server - MongoDB is used for data storage.

apt update
apt install -y dcmtk python3.6 build-essential python3.6-dev python3-pip mongodb-server python-pymongo python-psutil python-tables
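
After the packages are installed, you can confirm that the tools are available by looking them up on the PATH. The sketch below is a minimal check, assuming the packages above provide the executables mongod, python3.6, pip3 and storescu (from dcmtk). These command names are assumptions and may differ on your distribution.

```python
import shutil

# Assumed executable names provided by the system packages above;
# adjust the list if your distribution names them differently.
REQUIRED_COMMANDS = ["mongod", "python3.6", "pip3", "storescu"]


def find_missing_commands(commands):
    """Return the commands that cannot be found on the PATH."""
    return [name for name in commands if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_commands(REQUIRED_COMMANDS)
    if missing:
        print("Missing commands:", ", ".join(missing))
    else:
        print("All required commands were found.")
```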

Local Installation

  1. Clone the GitHub repository:

    git clone https://github.com/physiome-workflows/workflow-manager.git
    

    Note

    This is a private repository, so you will need a GitHub account with permission to access it.

  2. Add workflow-manager (WM) to your Python environment

    Once cloned, add the path to the workflow-manager (WM) module to your PYTHONPATH environment variable. You can do this in your ~/.bashrc file.

    export PYTHONPATH=$PYTHONPATH:~/path_to_workflow-manager/workflow_manager
    

    Note

    Once saved, run echo $PYTHONPATH to check that the path has been added to PYTHONPATH. If it has not, reload the .bashrc file with source ~/.bashrc or open a new terminal session.

  3. Install Python dependencies

    Using a Python virtual environment is recommended. The steps below install the dependencies listed in requirements.txt into a Python virtual environment.

    1. Create a virtual environment

      python3 -m venv venv/
      
    2. Activate the virtual environment

      source venv/bin/activate
      

      Note

      To deactivate a virtual environment, run

      deactivate
      
    3. Update pip

      pip install --upgrade pip
      
    4. Install the dependencies via requirements.txt

      pip install -r requirements.txt
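
Once the local installation steps are done, a short script can confirm that the setup took effect. The sketch below is a minimal check, not part of workflow-manager itself: the helper names in_virtualenv and is_importable are hypothetical. It reports whether a virtual environment is active and whether workflow_manager can be found on the Python path.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def is_importable(module_name):
    """Return True when a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("workflow_manager importable:", is_importable("workflow_manager"))
```

If the second line reports False, check the PYTHONPATH entry added in step 2.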
      

Container-based installation

Here is a complete example of how to set up and run a simple workflow using the container-based approach.

Docker

1. Getting the workflow-manager docker image

You can get the docker image of workflow-manager by:

Pulling image from Docker Hub
sudo docker pull clin864/workflow-manager
Building image from Dockerfile

Build the docker image from Dockerfile in /path/to/workflow-manager/docker/ubuntu

  1. Navigate to /path/to/workflow-manager

  2. Run

    sudo docker build -f ./docker/ubuntu/Dockerfile --tag workflow-manager .
    

Note

For reference, below is the Dockerfile we just used.

ARG OS_VERSION=18.04
FROM ubuntu:${OS_VERSION}

USER root

# Environment variables
ARG PROJECT_NAME=wm_project
ENV PROJECT_NAME=$PROJECT_NAME
ENV PROJECT_ROOT=/$PROJECT_NAME

ARG SCRIPTS=/scripts
ENV SCRIPTS=$SCRIPTS

ARG MODULES_DIR=/opt
ENV MODULES_DIR=$MODULES_DIR

ARG RESOURCES=/resources
ENV RESOURCES=$RESOURCES

ARG EXAMPLES=/examples
ENV EXAMPLES=$EXAMPLES

ARG RESULTS=/results
ENV RESULTS=$RESULTS

ARG DICOM_NODE=/dcmtk/received
ENV DICOM_NODE=$DICOM_NODE

# Make dirs
RUN mkdir -p $MODULES_DIR
RUN mkdir -p $SCRIPTS
RUN mkdir -p $PROJECT_ROOT
RUN mkdir -p $DICOM_NODE
RUN mkdir -p /mongodb/data/db
RUN mkdir -p $RESULTS
RUN mkdir -p $RESOURCES

# Install system packages
RUN apt update && apt install -y \
    build-essential \
    cron \
    dcmtk \
    mongodb-server \
    python3.6 \
    python3.6-dev \
    python3-pip \
    python-psutil \
    python-pymongo \
    python-tables \
    && apt clean && rm -rf /var/lib/apt/lists/*

# Copy files
ADD ./requirements.txt /requirements.txt
ADD ./docker/ubuntu/entrypoint.sh /entrypoint.sh
COPY ./docker/scripts $SCRIPTS
COPY ./examples $EXAMPLES

# Copy the custom packages to python site-package
COPY ./workflow_manager $MODULES_DIR/workflow_manager
RUN cp -R $MODULES_DIR/workflow_manager /usr/local/lib/python3.6/dist-packages

# Install Python dependencies
RUN python3 -m pip install --upgrade pip
RUN python3 -m pip install --no-cache-dir -r /requirements.txt

# Copy cron file to the cron.d directory
COPY ./docker/ubuntu/workflow-manager-cron /etc/cron.d/workflow-manager-cron
# Set the cron file permissions to 0644 (owner read/write, others read-only)
RUN chmod 0644 /etc/cron.d/workflow-manager-cron
# Apply cron job
RUN crontab /etc/cron.d/workflow-manager-cron

WORKDIR $DICOM_NODE
CMD /bin/bash /entrypoint.sh

# How to build the image
# 1. navigate to /path/to/workflow-manager
# 2. sudo docker build -f ./docker/ubuntu/Dockerfile --tag workflow-manager .

# How to run the image
# 1. Create the following folders at any location:
#    - dicom_node
#    - test_project
#    - mongodb
#    - results
# 2. sudo docker run -p 8105:8105 -v /path/to/dicom_node:/dcmtk -v /path/to/test_project:/wm_project -v /path/to/mongodb:/mongodb/data/db -v /path/to/results:/results workflow-manager

# Run example pipeline
# - sudo docker run -it -e MODE=examples workflow-manager
# OR
# - sudo docker run -it -e MODE=examples -v /path/to/test_project:/wm_project -v /path/to/mongodb:/mongodb/data/db -v /path/to/results:/results workflow-manager
# Where:
#   - The environment variable MODE=examples tells the container to run the example pipeline.
#   - Folders can also be mapped to save data locally (everything inside the container is deleted once the container is removed). Some important folders in the docker image:
#       - /wm_project: the project root folder which contains three subfolders: 1. processes 2. scripts 3. workspaces
#       - /mongodb/data/db: the mongoDB database
#       - /results: the results folder can be used for storing the workflow results
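
The ENV lines in the Dockerfile make these folders available to scripts running inside the container, in the same way project_setup.py later reads os.getenv('RESULTS'). The sketch below shows one way to read them from Python, falling back to the Dockerfile defaults when a variable is unset. The helper name container_paths is hypothetical and not part of workflow-manager.

```python
import os

# Defaults copied from the ARG lines of the Dockerfile above.
DOCKERFILE_DEFAULTS = {
    "PROJECT_ROOT": "/wm_project",
    "SCRIPTS": "/scripts",
    "MODULES_DIR": "/opt",
    "RESOURCES": "/resources",
    "EXAMPLES": "/examples",
    "RESULTS": "/results",
    "DICOM_NODE": "/dcmtk/received",
}


def container_paths(environ=None):
    """Return the folder for each variable, preferring the environment."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, default)
            for name, default in DOCKERFILE_DEFAULTS.items()}
```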

2. Run the docker image

Run the example workflow
  1. Run the docker image:

    sudo docker run -it -e MODE=examples workflow-manager
    

    Note

    The results will be saved in the results folder inside the container we just launched with the docker run command. You can also map a folder inside the container to a local folder by adding a -v (or --volume) argument to the run command. For example, sudo docker run -it -e MODE=examples -v /path/to/results:/results workflow-manager

  2. Stop the container. See 3. Stop a docker container
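
When several environment variables and folder mappings are involved, it can help to assemble the docker run command programmatically. The sketch below is a minimal, hypothetical helper (docker_run_command is not part of workflow-manager) that builds the argument list for the example workflow with the results folder mapped to a local folder.

```python
import shlex


def docker_run_command(image, env=None, volumes=None, interactive=False):
    """Build a `docker run` argument list.

    env maps variable names to values (-e NAME=value);
    volumes maps local folders to container folders (-v local:container).
    """
    command = ["sudo", "docker", "run"]
    if interactive:
        command.append("-it")
    for name, value in (env or {}).items():
        command += ["-e", f"{name}={value}"]
    for local_dir, container_dir in (volumes or {}).items():
        command += ["-v", f"{local_dir}:{container_dir}"]
    command.append(image)
    return command


command = docker_run_command(
    "workflow-manager",
    env={"MODE": "examples"},
    volumes={"/path/to/results": "/results"},
    interactive=True,
)
# Print the command in a form that can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in command))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems with paths that contain spaces.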

Run a custom workflow

This docker image also allows you to run your own workflow by passing the scripts, data and any resources the workflow will use into the docker container.

  1. Download and unzip the resources folder, and put all the input resources inside it. The resources folder contains:

    • ./scripts: folder which contains your custom scripts. Note that the scripts need to be written in (or converted to) the format that workflow-manager supports. Please see this Example Script for reference.

    • ./data: folder where you put the input data

    • requirements.txt: list all python dependencies of your scripts in this file

    • project_setup.py: modify this script to set up your own workflow

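      import os
      import sys
      
      import workflow_manager as wm
      
      if __name__ == '__main__':
          # Usage: python project_setup.py <project_name> <project_root>
          project_name = sys.argv[1]
          project_root = sys.argv[2]
      
          # exist_ok avoids an error when the folder already exists,
          # e.g. when it is a mapped volume
          os.makedirs(project_root, exist_ok=True)
          P = wm.create_project(project_name, root_dir=project_root)
      
          # Register the custom scripts with the project
          P.import_script('./scripts/script1.py')
          P.import_script('./scripts/script2.py')
          P.import_script('./scripts/script3.py')
      
          # Run the first script with its input arguments
          script = P.script('script1')
          script_input_arguments = {'path': 'relativePathToInputData/pretend_data.txt', 'send_dir': os.getenv('RESULTS')}
          script.run(script_input_arguments)
      
          # Start the process monitor for the project
          wm.project.start_process_monitor(project_name, minutes_alive=999, sleep_time=3, total_cores=8)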
      
  2. (optional) Create the following folders to save the project, database and results locally. In the next step, we will do folder mapping between local folders and the folders inside the container. Otherwise, you will lose all the data once the container is terminated.

    • project_folder/

    • database_folder/

    • result_folder/

  3. Run the docker image

    sudo docker run -v /path/to/resources:/resource -v /path/to/project_folder:/wm_project -v /path/to/database_folder:/mongodb/data/db -v /path/to/results:/results workflow-manager
    

    Note

    In a custom workflow, the final results are not automatically sent to the results folder. By default, they are saved in the project workspace(s), depending on how you set up your workflow, e.g. /wm_project/workspaces/0003. You can either (A) map your local results folder to the final workspace, or (B) send all the results from the project workspace(s) to the /results folder inside the container and map your local results folder to it.

  4. Stop the container. See 3. Stop a docker container
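
Option B from the note in step 3 can be sketched as a small copy step run inside the container after the workflow finishes. This is an illustration only: the helper collect_results is hypothetical, and it assumes the workspaces live under /wm_project/workspaces as in the example path above. Folders that already exist in the results folder are skipped rather than overwritten.

```python
import os
import shutil


def collect_results(workspaces_dir, results_dir):
    """Copy every workspace folder into results_dir and return the new copies."""
    os.makedirs(results_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(workspaces_dir)):
        source = os.path.join(workspaces_dir, name)
        if not os.path.isdir(source):
            continue  # skip stray files next to the workspace folders
        target = os.path.join(results_dir, name)
        if os.path.exists(target):
            continue  # keep copies made by an earlier run untouched
        shutil.copytree(source, target)
        copied.append(target)
    return copied


if __name__ == "__main__":
    # Paths as used inside the container; RESULTS is set by the Dockerfile.
    workspaces = "/wm_project/workspaces"
    if os.path.isdir(workspaces):
        collect_results(workspaces, os.getenv("RESULTS", "/results"))
```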

3. Stop a docker container

  1. Get container id

    sudo docker ps
    
  2. Stop and delete the container

    sudo docker rm -f <container_id>
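
The two steps above can also be scripted. The sketch below is a minimal example, assuming the containers were started from an image named workflow-manager; the helper names are hypothetical. It lists running containers with docker ps --format, picks the IDs whose image matches, and force-removes them.

```python
import subprocess


def container_ids_for_image(ps_output, image):
    """Pick the container IDs whose image column matches `image`.

    ps_output is the text printed by
    `docker ps --format "{{.ID}} {{.Image}}"` (one container per line).
    """
    ids = []
    for line in ps_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] == image:
            ids.append(parts[0])
    return ids


def stop_containers(image="workflow-manager"):
    """Force-remove every running container started from `image`."""
    listing = subprocess.run(
        ["sudo", "docker", "ps", "--format", "{{.ID}} {{.Image}}"],
        check=True, stdout=subprocess.PIPE, universal_newlines=True,
    ).stdout
    for container_id in container_ids_for_image(listing, image):
        subprocess.run(["sudo", "docker", "rm", "-f", container_id], check=True)
```

If the image was pulled from Docker Hub rather than built locally, pass the pulled name (clin864/workflow-manager) as the image argument.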
    

Singularity

1. Build the Singularity image based on the pre-built docker image

  1. See 1. Getting the workflow-manager docker image to build/pull the docker image.

  2. Save the docker image as a .tar file

    sudo docker save workflow-manager > workflow-manager.tar
    
  3. Build the Singularity image from the pre-built docker image

    singularity build workflow-manager.sif docker-archive://workflow-manager.tar
    

2. Run the Singularity image

The singularity run command is very similar to docker run. Please see the Docker section for more details on how to run the example or a custom workflow. Note that the docker -v argument needs to be replaced with the singularity -B argument when doing folder mapping.

  • Run the example workflow

    singularity run --env MODE=examples workflow-manager.sif
    
  • Run a custom workflow

    singularity run -B /path/to/resources:/resource -B /path/to/project_folder:/wm_project -B /path/to/database_folder:/mongodb/data/db -B /path/to/results:/results /path/to/workflow-manager.sif
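
The option replacement described above can be sketched as a small translation step. The helper below is hypothetical and covers only the options that appear on this page: -v (or --volume) becomes -B, -e becomes --env, and the docker-only -it flags are dropped. Any other option is passed through unchanged, which may not be valid for singularity.

```python
# Docker options and the Singularity options used in place of them on this page.
OPTION_MAP = {"-v": "-B", "--volume": "-B", "-e": "--env"}
# Docker-only terminal flags that are dropped.
DROPPED = {"-it", "-i", "-t"}


def to_singularity_run(docker_options, image_file):
    """Translate `docker run` options into a `singularity run` command list."""
    command = ["singularity", "run"]
    for option in docker_options:
        if option in DROPPED:
            continue
        command.append(OPTION_MAP.get(option, option))
    command.append(image_file)
    return command


command = to_singularity_run(
    ["-it", "-e", "MODE=examples", "-v", "/path/to/results:/results"],
    "workflow-manager.sif",
)
print(" ".join(command))
```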