About this guide

This BioExcel Best Practice Guide is lightly opinionated and aims to help you decide which implementation of the Common Workflow Language (CWL) best suits your needs. The opinions given here are based on the individual engines' own claims and on our experience with them.

Overview of Common Workflow Language

For an introduction to Common Workflow Language, see:

BioExcel Best Practice Guides include:

How to use this guide

This guide is organised by feature. You can read the whole guide or, if only certain features interest you, just the sections on those features. Each section covers one feature and describes how each workflow engine supports it.

What engines are covered

We cover only workflow engines that are stable and in production use, not those that are still under development and lack support or functionality. The engines we consider are:

  • Toil - command line tool driven, can connect to multiple cloud/cluster compute backends
  • cwltool - reference implementation, only local execution
  • Arvados - client/server with Web interface and CLI
  • CWL-Airflow - extends Apache Airflow with CWL support
  • REANA - Kubernetes execution, CWL is one of the supported languages

In addition to these, we will cover popular workflow engines that lack full CWL support.

Summary of engines

Feature Airflow Arvados REANA Toil Cromwell Galaxy
Documentation 🚧
How-Tos 🚧 🚧
Install guides ⚠️
GUI ✅
CLI ⚠️
Demo ⚠️
Local install ⚠️ ⚠️ ⚠️
Cluster ⚠️
Cloud ⚠️ ⚠️ ⚠️
Complex setup ⚠️ ⚠️ ⚠️ ✅
Complex use ⚠️ ⚠️ ⚠️ ✅
CWL version v1.1 v1.2 v1.0? v1.2.0 partial v1.0 Unknown

🚧 - Work-in-progress
✅ - Support
❌ - No support
⚠️ - Complicated
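
As a rough illustration of how the "CWL version" row might be used to shortlist engines, the sketch below encodes that row as a lookup table and filters it by a required CWL version. It rests on two assumptions that are not stated in the table itself: that the six values in the row line up with the six engine columns in header order, and that an engine listing a newer CWL version can also run documents written for older versions (this should be checked for each engine). The names `CWL_SUPPORT`, `parse_version` and `candidates` are hypothetical helpers made up for this example.

```python
from typing import NamedTuple, Optional, Tuple


class CwlSupport(NamedTuple):
    version: Optional[Tuple[int, int]]  # None where the table says "Unknown"
    certain: bool                       # False for "v1.0?" and "partial v1.0"


# Transcribed from the "CWL version" row, ASSUMING the values follow
# the header order: Airflow, Arvados, REANA, Toil, Cromwell, Galaxy.
CWL_SUPPORT = {
    "Airflow": CwlSupport((1, 1), True),
    "Arvados": CwlSupport((1, 2), True),
    "REANA": CwlSupport((1, 0), False),     # listed as "v1.0?"
    "Toil": CwlSupport((1, 2), True),       # listed as "v1.2.0"
    "Cromwell": CwlSupport((1, 0), False),  # listed as "partial v1.0"
    "Galaxy": CwlSupport(None, False),      # listed as "Unknown"
}


def parse_version(text: str) -> Tuple[int, int]:
    """Turn 'v1.2' or 'v1.2.0' into the (major, minor) pair (1, 2)."""
    parts = text.lstrip("v").split(".")
    return (int(parts[0]), int(parts[1]))


def candidates(required: str, include_uncertain: bool = False) -> list:
    """Return engines whose listed CWL version is at least `required`.

    ASSUMPTION: a newer listed version implies older documents also run.
    """
    needed = parse_version(required)
    return [
        name
        for name, support in CWL_SUPPORT.items()
        if support.version is not None
        and support.version >= needed
        and (support.certain or include_uncertain)
    ]


if __name__ == "__main__":
    print(candidates("v1.1"))
    print(candidates("v1.0", include_uncertain=True))
```

A shortlist produced this way is only a starting point; the remaining rows of the table (install effort, cluster and cloud support, and so on) still need to be weighed by hand.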

No single workflow engine is right for every user. We recommend that you explore the engines covered in this guide in the order listed above.

The CWL list of implementations also includes partial CWL implementations that may suit more specialist use cases, e.g. CWLEXEC for IBM Spectrum LSF customers.