Troubleshooting
===============

.. contents:: Table of Contents

Error executing process
-----------------------

While writing a new pipeline or running an existing one, you may encounter an error
in a particular step, which interrupts nextflow execution so that you can fix the issue,
for example::

  [a1/60b160] process > mirdeep2 (null)                                [100%] 1 of 1, failed: 1 ✘
  Error executing process > 'mirdeep2 (null)'

  Caused by:
    Missing output file(s) `novel_mature.fa` expected by process `mirdeep2 (null)`

  Command executed:

    miRDeep2.pl all_samples.fasta ARS-UCD1.2_chrOnly_chrY.fa all_samples.arf bta_mature.fa.fix chi-oar-hsa_mature.fa.fix bta_hairpin.fa.fix -P

  Command exit status:
    0

  Command output:

  <omitting lines>

  Work dir:
    /home/cozzip/nf-mirna/work/a1/60b160bf3021f4891bbf42746173b0

  Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

This error message reports the work directory in which the failure occurred.
You can get the same information from the nextflow run log. For example,
supposing that our last run is named ``sharp_feynman`` (you can list run names
with ``nextflow log`` or ``nextflow log -quiet``), you can retrieve the work
directory of each step by printing specific *fields* with ``nextflow log``, for
example:

.. code-block:: bash

  $ nextflow log sharp_feynman -f 'process,status,exit,hash,duration,workdir'
  remove_whitespaces      COMPLETED       0       bd/2ebe9a       551ms   /home/cozzip/nf-mirna/work/bd/2ebe9a9f2e1703a18059fbdf1191e7
  fastqc  CACHED  0       93/a6692c       1m 39s  /home/cozzip/nf-mirna/work/93/a6692cb2a6c04c08546f71b1814772
  trim_galore     CACHED  0       88/984f53       1m 59s  /home/cozzip/nf-mirna/work/88/984f537d2e9641238d42906a959b17
  mirdeep_input   CACHED  0       97/d4cd0b       77ms    /home/cozzip/nf-mirna/work/97/d4cd0bf8379abaeee37e4de1297127
  mirdeep CACHED  0       b8/36c9c9       3m 38s  /home/cozzip/nf-mirna/work/b8/36c9c9a2265e03124eeaf39ca539b0
  mirdeep2        FAILED  0       a1/60b160       1h 3m 22s       /home/cozzip/nf-mirna/work/a1/60b160bf3021f4891bbf42746173b0

In this example, the only failed step is ``mirdeep2``, and its work directory is the
same one given in the nextflow error report.
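
As a minimal sketch, the failed steps can be filtered out of that listing with
``awk``. The helper name ``failed_workdirs`` is hypothetical, and the sketch assumes
that the fields are requested as ``process,status,workdir`` and that they are
tab-separated, as they appear in the listing above:

.. code-block:: bash

  # Hypothetical helper: read the output of
  #   nextflow log <run name> -f 'process,status,workdir'
  # on stdin and print the work directory of every FAILED step.
  # Assumes tab-separated fields, with the status in the second column.
  failed_workdirs() {
    awk -F '\t' '$2 == "FAILED" { print $3 }'
  }

  # Usage (requires nextflow and an existing run name):
  #   nextflow log sharp_feynman -f 'process,status,workdir' | failed_workdirs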

.. tip::

  Get all field names with ``nextflow log -l`` or see the *execution report table*
  at `Trace report <https://www.nextflow.io/docs/latest/tracing.html?highlight=scratch#trace-report>`_

.. note::

  This step is marked as failed even though its exit status is ``0``. The nextflow
  error shows that the command itself ran without problems; however, nextflow
  expected some output files that the step did not produce. This could be
  due to an error in the nextflow pipeline configuration.

Now it is time to understand what happened. Enter the work directory of the failed
job and list all files (including hidden ones) with ``ls -a``:

.. code-block:: bash

  $ ls -a .command*
  .command.begin  .command.err  .command.log  .command.out  .command.run  .command.sh

In this example we display only the hidden ``.command*`` files: they are
generated by nextflow and contain the output of the step as well as the commands
used to run it. In particular, ``.command.run`` holds all the instructions needed to
prepare the working directory and to launch ``.command.sh``, which contains the
``script`` section of the process as defined in the pipeline.
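
A quick way to review these files is to print the step script together with the
tail of its error and log files. The following is only a sketch: the helper name
``show_task_files`` and the default of 20 lines are arbitrary choices.

.. code-block:: bash

  # Hypothetical helper: print the step script and the last lines of its
  # stderr and log files, skipping files that are missing or empty.
  # $1: work directory of the step, $2: number of lines to show (default 20)
  show_task_files() {
    local workdir="$1" lines="${2:-20}" name
    for name in .command.sh .command.err .command.log; do
      if [ -s "$workdir/$name" ]; then
        printf '==> %s <==\n' "$name"
        tail -n "$lines" "$workdir/$name"
      fi
    done
  }

  # Usage, with the work directory reported in the error message:
  #   show_task_files /home/cozzip/nf-mirna/work/a1/60b160bf3021f4891bbf42746173b0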

To get more information on the error, we can execute the nextflow step manually.
First of all, we need to export an environment variable in order to increase
nextflow verbosity:

.. code-block:: bash

  export NXF_DEBUG=2

Next we can execute the ``.command.run`` script, which is the script that nextflow
itself executes and which in turn calls ``.command.sh``:

.. code-block:: bash

  bash .command.run

The command is expected to fail (since nextflow previously returned an error). However,
with ``NXF_DEBUG=2`` set, we can see every command launched by nextflow, in
particular the ``singularity`` command. We can then take that
command, simplify it and start a singularity session in order to test our command
from a terminal inside the same singularity container used by our pipeline
step, for example with:

.. code-block:: bash

  singularity exec -B $HOME -B /home/ -B $PWD/ /home/core/nxf_singularity_cache/bunop-mirdeep2.img  /bin/bash

Here each ``-B`` parameter indicates a folder that will be mounted inside the
container (our ``$HOME`` directory; the ``/home`` directory, which is the
usual location of the input files; and ``$PWD``, which is the nextflow work
folder in which the error occurred). Next comes the physical location of the
singularity image (``/home/core/nxf_singularity_cache/bunop-mirdeep2.img`` in this example)
and finally the command we want to run: in this case a new shell,
since we want to run ``.command.sh`` manually and see why it raises an error.

.. tip::

  If you cannot recover from the error, you can apply a custom configuration
  file to the pipeline in order to ignore the failed step. You can find more
  information in the :ref:`handling-failing-jobs` section
  of this guide.

Failed to pull singularity image
--------------------------------

Sometimes singularity cannot download an image from https://quay.io/. In such a case,
nextflow raises an error and stops the execution like this::

  Error executing process > 'RNASEQ:QUANTIFY_SALMON:SALMON_SE_TRANSCRIPT (salmon_tx2gene.tsv)'

  Caused by:
    Failed to pull singularity image
    command: singularity pull  --name quay.io-biocontainers-bioconductor-summarizedexperiment-1.18.1--r40_0.img.pulling.1610634041691 docker://quay.io/biocontainers/bioconductor-summarizedexperiment:1.18.1--r40_0 > /dev/null
    status : 255
    message:
      INFO:    Converting OCI blobs to SIF format
      INFO:    Starting build...
      Getting image source signatures
      Copying blob sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
      Copying blob sha256:77c6c00e8b61bb628567c060b85690b0b0561bb37d8ad3f3792877bddcfe2500
      Copying blob sha256:3aaade50789a6510c60e536f5e75fe8b8fc84801620e575cb0435e2654ffd7f6
      Copying blob sha256:00cf8b9f3d2a08745635830064530c931d16f549d031013a9b7c6535e7107b88
      Copying blob sha256:7ff999a2256f84141f17d07d26539acea8a4d9c149fefbbcc9a8b4d15ea32de7
      Copying blob sha256:d2ba336f2e4458a9223203bf17cc88d77e3006d9cbf4f0b24a1618d0a5b82053
      Copying blob sha256:dfda3e01f2b637b7b89adb401f2f763d592fcedd2937240e2eb3286fabce55f0
      Copying blob sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
      Copying blob sha256:10c3bb32200bdb5006b484c59b5f0c71b4dbab611d33fca816cd44f9f5ce9e3c
      Copying blob sha256:f981c3bfe61f7355e034d40b620e60aefc6b272a8d0ac10fa9e1892bb6b17b56
      Copying config sha256:ff870dedc9d11d9622344d7a4ff0c0c25a890f2233a84926b6cb0e67f422500e
      Writing manifest to image destination
      Storing signatures
      FATAL:   While making image from oci registry: error fetching image to cache: while building SIF from layers: conveyor failed to get: no descriptor found for reference "70c154f9aee9152d9e03c474cd4b5e5eee5856cda5b62c46b10c4ae7932e763d"

In such cases, you can solve the problem by manually downloading the singularity image
into the ``$NXF_SINGULARITY_CACHEDIR`` cache directory. Find the failed ``command`` line
in the nextflow output, then move into the ``$NXF_SINGULARITY_CACHEDIR`` directory and run
that command manually. After downloading the image, rename the file, removing the
``.pulling.[0-9]*`` suffix from the image name (nextflow image names should end with the
``.img`` extension). For example, in the previous case:

.. code-block:: bash

  cd $NXF_SINGULARITY_CACHEDIR
  singularity pull  --name quay.io-biocontainers-bioconductor-summarizedexperiment-1.18.1--r40_0.img.pulling.1610634041691 docker://quay.io/biocontainers/bioconductor-summarizedexperiment:1.18.1--r40_0 > /dev/null
  mv quay.io-biocontainers-bioconductor-summarizedexperiment-1.18.1--r40_0.img.pulling.1610634041691 quay.io-biocontainers-bioconductor-summarizedexperiment-1.18.1--r40_0.img

After that, you can resume your nextflow pipeline by adding the ``-resume`` option
to your command line, in order to reuse the cached results of the previous run.
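
The rename step can also be written without retyping the image name. The sketch
below uses shell parameter expansion to drop the temporary suffix; the helper name
``final_image_name`` is hypothetical.

.. code-block:: bash

  # Hypothetical helper: strip the temporary `.pulling.<digits>` suffix
  # from the name of a downloaded image. Names without the suffix are
  # printed unchanged.
  final_image_name() {
    printf '%s\n' "${1%.pulling.*}"
  }

  # Usage, inside $NXF_SINGULARITY_CACHEDIR:
  #   tmp_image=quay.io-biocontainers-bioconductor-summarizedexperiment-1.18.1--r40_0.img.pulling.1610634041691
  #   mv "$tmp_image" "$(final_image_name "$tmp_image")"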

.. note::

  nextflow singularity containers are moving from `quay <https://quay.io/>`_ to
  `depot.galaxyproject.org <https://depot.galaxyproject.org/singularity/>`_:
  the latter seems to have better downloading performance

.. _nextflow-version-required:

Nextflow version doesn't match the required version
---------------------------------------------------

When running a pipeline with nextflow, you may get an error
like this::

  Nextflow version 20.10.0 does not match workflow required version: >=20.11.0-edge

In such a case, you have two options. The first is to run a previous version of
the pipeline that is compatible with your nextflow version. You can find information
on the available versions on the `nf-core pipeline <https://nf-co.re/pipelines>`_ page or directly
in the GitHub projects of the `nf-core <https://github.com/nf-core>`_ organization.
Once you have found the desired version, declare it with the ``-r`` parameter
when calling nextflow, for example:

.. code-block:: bash

  nextflow run nf-core/rnaseq -r 2.0 -profile test,singularity -resume

The second option is to upgrade your nextflow version. You can install a specific
version of nextflow from the `nextflow release page <https://github.com/nextflow-io/nextflow/releases>`_.
Copy the link to the nextflow asset present in every release, and then install nextflow like
this:

.. code-block:: bash

  wget -qO- https://github.com/nextflow-io/nextflow/releases/download/v20.12.0-edge/nextflow-20.12.0-edge-all | bash

This will download all the requirements and will put nextflow in your current directory.
Change the permissions of the nextflow executable to ``755`` and move it into a
directory that comes early in your ``$PATH``, for example ``$HOME/bin``.
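
These last two steps can be sketched as follows. The helper name
``install_launcher`` is hypothetical, and the sketch assumes that the target
directory (``$HOME/bin`` by default) is already listed in your ``$PATH``:

.. code-block:: bash

  # Hypothetical helper: make a downloaded launcher executable and move it
  # into a directory that is expected to be on $PATH.
  # $1: path of the downloaded file, $2: target directory (default $HOME/bin)
  install_launcher() {
    local src="$1" dest="${2:-$HOME/bin}"
    chmod 755 "$src" && mkdir -p "$dest" && mv "$src" "$dest/"
  }

  # Usage, from the directory that contains the downloaded launcher:
  #   install_launcher ./nextflow
  #   command -v nextflow   # shows which nextflow executable is found first

If ``command -v nextflow`` still prints the path of the old installation, the
target directory comes later in ``$PATH`` than the old one.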

Cannot find pipeline version
----------------------------

Sometimes nextflow cannot find a specific version of a pipeline that
you know is present in the remote repository, and fails with an error like this::

  Cannot find revision `x.x.x` -- Make sure that it exists in the remote repository

This can happen if your local copy of the pipeline (in ``$HOME/.nextflow/assets/``)
is out of date with respect to the remote repository. In this case, you need to synchronize your local
copy with the remote repository, for example:

.. code-block:: bash

  nextflow pull nf-core/methylseq

You can also specify a specific version of the pipeline to pull, for example:

.. code-block:: bash

  nextflow pull nf-core/methylseq -r 2.7.1

This will update your local version of the pipeline, and you will be able to call
the desired version of the pipeline.

Cannot execute nextflow interactively
-------------------------------------

In an HPC environment where resources on the login nodes are limited, nextflow cannot
be executed interactively. In such a case, nextflow needs to be submitted to a job
scheduler. For example, in a SLURM environment, you can define a nextflow job
like this:

.. code-block:: bash

  #!/bin/bash
  #SBATCH --nodes=1                       # 1 node
  #SBATCH --ntasks-per-node=1             # 1 task per node
  #SBATCH --cpus-per-task=2               # 2 CPUs per task
  #SBATCH --time=4-00:00:00               # time limit (4 days), if you are required to set one
  #SBATCH --mem=16G                       # 16GB to manage process
  #SBATCH --error=nextflow.err            # standard error file
  #SBATCH --output=nextflow.out           # standard output file
  #SBATCH --job-name=nf-core-rnaseq       # job name
  #SBATCH --account=<your account>        # account name
  #SBATCH --partition=<your partition>    # partition name where this job will run
  #SBATCH --qos=<your QoS>                # quality of service (if any)
  nextflow run nf-core/rnaseq -r 3.12.0 -profile "singularity,..." \
    -resume -config custom.config -params-file rnaseq-nf-params.json

Next you need to configure nextflow for non-interactive execution and possibly
limit some resources. For example, you may need to disable the ANSI log,
since you are not working interactively and all standard output is
redirected to a file. You can do this by setting the ``NXF_ANSI_LOG`` environment
variable to ``false``:

.. code-block:: bash

  export NXF_ANSI_LOG='false'

Take a look at :ref:`environment-variables <nextflow_environment_variables>`
and :ref:`Configuring nextflow <configuring_nextflow>` sections of this guide
to see all the environment variables you can set in order to configure
your nextflow execution.

Terminating nextflow execution
------------------------------

If you need to terminate a nextflow execution, you can send it an interrupt signal
(``SIGINT``), for example with ``Ctrl+C``. This will terminate all running processes and
shut down the pipeline execution, removing the temporary *lock* files. If you need
to terminate a running process which nextflow can't terminate, you will have to
terminate it manually, for example using ``scancel`` in a SLURM environment
or by killing the process if you are running nextflow with a local executor.

.. _running-nextflow-offline:

Running nextflow offline
------------------------

Nextflow can operate in environments without internet access, provided that all
necessary resources are prepared in advance. This includes the pipeline code, software dependencies,
reference genomes, and any required data.

You need to download all the necessary resources on a system with internet access,
and then transfer them to the offline system using whatever method is available.
Moreover, some extra steps are needed to manage the workflow properly.
For more information on how to run nextflow offline, see the `Running offline
<https://nf-co.re/docs/usage/getting_started/offline>`_ nf-core documentation.

Set environment variables
^^^^^^^^^^^^^^^^^^^^^^^^^

You need to set the ``NXF_OFFLINE`` environment variable in order to
run nextflow offline:

.. code-block:: bash

  export NXF_OFFLINE='true'

This tells nextflow to run in offline mode, disabling
all attempts to download resources from the internet: this includes test files,
institutional configurations, software dependencies, reference genomes and plugins.
All those resources must therefore already be available locally when running nextflow. You can
find more information in the :ref:`environment-variables` section of this guide and
in the official nextflow `Environment variables
<https://www.nextflow.io/docs/latest/reference/env-vars.html>`_ documentation.

Download the pipeline and its dependencies
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can download a pipeline and its dependencies with the ``nf-core pipelines download``
command of the `nf-core tools <https://nf-co.re/docs/nf-core-tools>`_ utility.
For example, to download the ``rnaseq`` pipeline and its dependencies
you can use:

.. code-block:: bash

  nf-core pipelines download nf-core/rnaseq

The utility will ask whether you want to download the *singularity container images*
along with the pipeline (usually yes) and whether you want to *copy* the singularity images
into the pipeline download folder or to *amend* the singularity
images in the :ref:`$NXF_SINGULARITY_CACHEDIR <set-nxf-singularity-cache>` folder.
The latter should be chosen if you are downloading the container images into a shared
folder that can be used during nextflow execution (i.e. you are on a *login* node
of an HPC infrastructure with internet access, while the *computing* nodes have
no internet access). Otherwise, you will need to copy all the downloaded files
to your final HPC infrastructure and put the container images where they can be found
during execution (usually in the ``$NXF_SINGULARITY_CACHEDIR`` location).
This guide has a section about setting up :ref:`nf-core tools <install-nf-core>`.
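
If the images were copied into the download folder, they can be placed in the
cache with a few lines of shell. This is only a sketch: the helper name
``copy_images_to_cache`` is hypothetical, and the folder that actually holds the
``.img`` files depends on the layout produced by your nf-core tools version, so
check it before running.

.. code-block:: bash

  # Hypothetical helper: copy every *.img file found in a folder into the
  # singularity cache directory.
  # $1: folder holding the downloaded images
  # $2: cache directory (default: $NXF_SINGULARITY_CACHEDIR)
  copy_images_to_cache() {
    local src="$1" cache="${2:-${NXF_SINGULARITY_CACHEDIR:-}}"
    [ -n "$cache" ] || { echo "cache directory not set" >&2; return 1; }
    mkdir -p "$cache" && cp "$src"/*.img "$cache"/
  }

  # Usage (the source path is a placeholder):
  #   copy_images_to_cache <path/to/downloaded/images>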

.. _clone-institutional-configuration-files:

Clone institutional configuration files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The institutional configuration files should be cloned locally in order to be
used by the pipelines when running nextflow in offline mode.
Simply clone the repository into a local directory:

.. code-block:: bash

  git clone https://github.com/nf-core/configs.git

Usually, pipelines have statements which disable the use of institutional configurations when
running offline. For example, in the `nf-core/rnaseq <https://github.com/nf-core/rnaseq>`_
pipeline, you can find those statements in the ``nextflow.config`` file:

.. code-block:: groovy

  // Load nf-core custom profiles from different Institutions
  includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null"

  // Load nf-core/rnaseq custom profiles from different institutions.
  includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/pipeline/rnaseq.config" : "/dev/null"

Those include statements are completely ignored when ``NXF_OFFLINE`` is set to
``true``. In order to use institutional configuration files when running offline,
you should pass these files explicitly with the ``-c`` or ``-config`` option,
and provide the path of the institutional configuration folder with the
``--custom_config_base`` option, for example:

.. code-block:: bash

  export CUSTOM_CONFIG_BASE=<path/to/institutional/configs>

  nextflow run nf-core/rnaseq -r 3.12.0 \
    --custom_config_base ${CUSTOM_CONFIG_BASE} \
    -config ${CUSTOM_CONFIG_BASE}/nfcore_custom.config \
    -config ${CUSTOM_CONFIG_BASE}/pipeline/rnaseq.config \
    -profile <institution> -resume -params-file <params-file>

This solution is rather verbose, but it lets you specify the desired profile
using the same syntax used when running nextflow with internet access.

.. warning::

  Not all the pipelines have the *pipeline specific* configuration file, like
  ``rnaseq.config`` in the previous example. Please check if this file exists
  in the pipeline repository before using it.

.. tip::

  You can also download a copy of the institutional configuration files by using
  ``--download-configuration yes`` with the ``nf-core pipelines download`` command.
  See the :ref:`download a pipeline <nf-core-pipelines-download>` with nf-core
  section of this guide.

.. hint::

  At cnr-ibba we have a forked version of the nf-core/configs repository with
  custom options and profiles, which is available at https://github.com/cnr-ibba/nf-configs/.

Install nextflow plugins
^^^^^^^^^^^^^^^^^^^^^^^^

Some pipelines require nextflow plugins, which are normally downloaded and installed
when the pipeline runs for the first time. Before running nextflow offline, you can
install them using the ``nextflow plugin install`` command, for example:

.. code-block:: bash

  nextflow plugin install nf-schema@2.3.0

This will install version ``2.3.0`` of the ``nf-schema`` plugin in the nextflow
environment. You need to inspect the pipeline ``nextflow.config`` file to see
which plugins are required by the pipeline and install them individually. If the
version of the plugin is not specified in the pipeline configuration file, you can
*pin* it in a *custom configuration* file, for example:

.. code-block:: groovy

  plugins {
    id 'nf-schema@2.3.0'
  }

This applies to an environment where internet access is available at installation
time (for example a *login* node in an HPC environment). If you don't have
any internet connection in your environment, you should copy the ``${HOME}/.nextflow/plugins``
folder from a working environment to your offline environment.
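
One way to move that folder is to pack it into an archive on the connected
machine and unpack it on the offline one. The helper names ``pack_plugins`` and
``unpack_plugins`` are hypothetical; the sketch assumes the default nextflow home
(``$HOME/.nextflow``) unless another directory is given:

.. code-block:: bash

  # Hypothetical helper: archive the plugins folder of a nextflow home.
  # $1: archive to create, $2: nextflow home (default $HOME/.nextflow)
  pack_plugins() {
    tar -czf "$1" -C "${2:-$HOME/.nextflow}" plugins
  }

  # Hypothetical helper: unpack the archive into a nextflow home.
  # $1: archive to read, $2: nextflow home (default $HOME/.nextflow)
  unpack_plugins() {
    local nxf_home="${2:-$HOME/.nextflow}"
    mkdir -p "$nxf_home" && tar -xzf "$1" -C "$nxf_home"
  }

  # Usage:
  #   pack_plugins plugins.tar.gz      # on the machine with internet access
  #   unpack_plugins plugins.tar.gz    # on the offline machine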

Download reference genomes and other files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You should manually download all the reference genomes and other files required
by the pipeline. If you plan to call the pipeline with the ``test`` profile, you
need to ensure that all the required files are present locally. Pay attention to the ``samplesheet.csv``
of the test profile, which is a *mandatory* input in most of the community pipelines:
it usually refers to files available on the internet, so you should download them
locally and modify the ``samplesheet.csv`` file accordingly. Then pass
the modified ``samplesheet.csv`` file to the pipeline using the proper CLI parameter
or using a JSON file with the ``-params-file`` option.
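
Rewriting the samplesheet can be sketched with ``sed``. The helper name
``localize_samplesheet`` is hypothetical, and the sketch assumes that all the
remote files have already been downloaded into a single local directory under
their original file names, that the directory path contains no ``#``, ``&`` or
backslash characters, and that the URLs contain no commas:

.. code-block:: bash

  # Hypothetical helper: read a CSV on stdin and replace the remote part of
  # every http(s) URL (everything up to the last slash) with a local
  # directory, keeping the file name.
  # $1: local directory that holds the downloaded files
  localize_samplesheet() {
    sed -E "s#https?://[^,]*/#$1/#g"
  }

  # Usage:
  #   localize_samplesheet /data/test < samplesheet.csv > samplesheet.local.csv

Lines without URLs, such as the header, pass through unchanged.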