Installation
If you are new to our pipeline ecosystem, we recommend you first check out our general setup guide here.
Installing Nextflow
Nextflow is a highly portable pipeline engine. Please see the official installation guide to learn how to set it up.
This pipeline expects Nextflow version 24.04.4, available here.
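If a different Nextflow release is already installed, one way to get the expected version is to pin it through the `NXF_VER` environment variable, which the Nextflow launcher reads at start-up. The following is a minimal sketch, assuming `nextflow` is already on your `PATH`:

```shell
#!/bin/sh
# Version this pipeline expects, taken from the text above.
REQUIRED_NXF_VERSION="24.04.4"

# NXF_VER asks the Nextflow launcher to use this specific release.
export NXF_VER="$REQUIRED_NXF_VERSION"

# Report the active version if Nextflow is installed; otherwise print a hint.
if command -v nextflow >/dev/null 2>&1; then
    nextflow -version
else
    echo "nextflow not found on PATH; install it first" >&2
fi
```

Placing the `export` line in your shell profile keeps the version pinned across sessions.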
Software provisioning
This pipeline is set up to work with a range of software provisioning technologies - no need to manually install packages!
You can choose one of the following options:
The pipeline comes with simple pre-set profiles for all of these as described below; if you plan to use this pipeline regularly, consider adding your own custom profile to our central repository to better leverage your available resources. Advantages of using a custom profile include:
- the ability to cache environments and containers
- control over resource usage
- support for additional container/package managers not pre-configured in FooDMe2, as described here.
Select the appropriate profile with the -profile argument:
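As a minimal sketch, a launch command with a profile might be assembled as follows. The pipeline identifier `bio-raum/FooDMe2` and the profile name `docker` are assumptions made for illustration and are not taken from this guide; substitute the profile that matches your provisioning technology. The snippet only prints the command so you can inspect it before running it.

```shell
#!/bin/sh
# Assumption: pipeline identifier as passed to `nextflow run`; adjust if yours differs.
PIPELINE="bio-raum/FooDMe2"
# Assumption: example profile name; use the one matching your system.
PROFILE="docker"

# Assemble the command; further pipeline arguments would be appended here.
LAUNCH_CMD="nextflow run $PIPELINE -profile $PROFILE"

# Print the command for inspection instead of executing it.
echo "$LAUNCH_CMD"
```

Nextflow also accepts several comma-separated profile names after `-profile`, which is useful when a custom profile is combined with a pre-set one.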
Installing the references
This pipeline requires locally stored references from Midori, UNITE and NCBI. To build these, do:
where /path/to/references could be something like /data/pipelines/references or whatever is most appropriate on your system.
The path specified with --reference_base can then be given to the pipeline during normal execution as --reference_base.
Please note that the build process will create a pipeline-specific subfolder (foodme2) that must not be given as part of the --reference_base argument. FooDMe2 is part of a collection of pipelines that use a shared reference directory and it will choose the appropriate subfolder by itself.
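To illustrate the subfolder rule, the sketch below strips a trailing foodme2 component from a path before it is used as --reference_base. The helper name `normalize_reference_base` is hypothetical and not part of the pipeline; it only shows which directory level is expected.

```shell
#!/bin/sh
# Hypothetical helper: return the shared reference directory, dropping a
# trailing "foodme2" component if the caller included it by mistake.
normalize_reference_base() {
    ref_path="${1%/}"                       # remove one trailing slash, if any
    if [ "$(basename "$ref_path")" = "foodme2" ]; then
        ref_path="$(dirname "$ref_path")"   # step up to the shared directory
    fi
    printf '%s\n' "$ref_path"
}

normalize_reference_base "/data/pipelines/references/foodme2"
# prints /data/pipelines/references
```

A path that already points at the shared directory is returned unchanged, apart from a trailing slash.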
Warning
The build will download and format the various databases available through this pipeline. Please note that one of these databases is the full GenBank core_nt database, which has a final size of over 250 GB (and growing) and needs around 0.5 TB during installation. If your application works with single-gene databases, you can skip installing this database with --skip_genbank.
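Given the space requirement, it can be worth checking free disk space before starting the build. The sketch below is one way to do so with standard tools; the helper name `has_free_space_gb` and the `REFERENCE_BASE` variable are hypothetical, and the 500 GB threshold is an approximation of the 0.5 TB figure quoted above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the filesystem holding directory $1
# has at least $2 gigabytes available.
has_free_space_gb() {
    # -P gives one line per filesystem, -k reports 1024-byte blocks;
    # column 4 of the second line is the available space.
    available_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Assumption: set REFERENCE_BASE to your existing reference directory;
# it falls back to the current directory here.
REFERENCE_BASE="${REFERENCE_BASE:-$PWD}"
REQUIRED_GB=500   # roughly 0.5 TB

if has_free_space_gb "$REFERENCE_BASE" "$REQUIRED_GB"; then
    echo "Enough space for the full installation, including GenBank core_nt."
else
    echo "Less than ${REQUIRED_GB} GB free: consider --skip_genbank if single-gene databases suffice."
fi
```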
Site-specific config file
If you run on anything other than a local system, this pipeline requires a site-specific configuration file to be able to talk to your cluster or compute infrastructure. Nextflow supports a wide range of such infrastructures, including Slurm, LSF and SGE - but also Kubernetes and AWS. For more information, see here. In addition, a site-specific config file allows you to pre-set certain options specifically for your system and removes some of the complexity of the command line calls.
Site-specific config files for our pipeline ecosystem are stored centrally on GitHub. Please talk to us if you want to add your system.
Local configuration file
If you do not wish to use a site-specific configuration, it is also possible to use a local configuration file.
This option, however, is limited to defining Nextflow execution parameters and does not allow you to define workflow-specific parameters such as --reference_base.
See the Nextflow documentation for more information on local configuration.
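As a rough sketch, a local configuration file could be created and passed to Nextflow with the `-c` option as shown below. The file name `local.config`, the resource values, and the pipeline identifier `bio-raum/FooDMe2` are assumptions chosen for illustration; the file contains Nextflow execution settings only, in line with the limitation described above.

```shell
#!/bin/sh
# Write a small local configuration file (values are illustrative).
cat > local.config <<'EOF'
// Nextflow execution settings only
executor {
    name   = 'local'   // run tasks on this machine
    cpus   = 8         // CPUs Nextflow may use in total
    memory = '32 GB'   // memory Nextflow may use in total
}
EOF

# Workflow parameters such as --reference_base still go on the command line.
# The command is printed for inspection rather than executed.
echo "nextflow run bio-raum/FooDMe2 -c local.config --reference_base /path/to/references"
```

Settings passed with `-c` are merged with the configuration shipped with the pipeline, so only the values you want to change need to appear in the file.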