Skip to content

Outputs

Reports

  • name_of_pipeline_run.xlsx: A table with accumulated results - one row per sample per taxon:
Sample Taxon Percentage Reads Cluster IDs
SampleA Sus scrofa 75.0 34545 ASV_1[1.0]
SampleA Bos taurus 25.0 11515 ASV_2[0.9]

The Cluster IDs list one or several ASVs that yielded this taxon assignment, including the fraction of high-scoring database hits that support this call.

This is further detailed on the second work sheet:

Sample Cluster ID Size Taxon Ranks Taxid Support[%] Consensus
SampleA ASV_1 34545 Sus scrofa species 9823 100 TRUE
SampleA ASV_2 11515 Bos taurus species 9913 90 TRUE

Of note is the column Consensus, which shows wether this call was in fact chosen as the consensus call for this ASV. Some ASVs may have multiple possible taxonomic assignments, with differing percentages. Discarded calls are listed with a Consensus of FALSE.

  • name_of_pipeline_run_krona.html: A multi-sample Krona report to visualize taxonomic composition of samples.

krona

The Krona report displays the taxonomic composition of a given as a circular plot, divided into levels of taxonomy.

  • name_of_pipeline_run.html: A graphical and interactive report of various QC steps and results

summary

The summary report contains a variety of information about QC measures as well as the final results. Some of the key metrics are shown at the top, in the section labelled Summary. The status column aims to highlight the overall quality of a given sample, although users are advised to develop their own relevant cutoff metrics for the various processing stages, such as a minimum number of reads required after clustering for a sample to be considered for analysis, or the maximum amount of chimeric reads that are tolerated for a particular application.

Per-sample outputs

Sample-level reports can be found in the folder results/samples/sample_id.

When using Vsearch for OTU clustering

  • vsearch/sample_id.usearch_global.tsv: the number of reads mapping against each respective OTU, per sample
  • vsearch/sample_id.precluster.fasta: the final set of OTUs in FASTA format

When using DADA2 for OTU/ASV clustering

  • DADA2/sample_id_ASVs.fasta: the clustered sequences (OTU/ASV)

This folder also contains a number of additional metrics and outputs, including graphical summaries of the error profiles and intermediary sequences tables that can be used to debug sample-specific issues.

  • blast/sample_id.filtered.json: JSON listing of all BLAST hits
  • blast/sample_id.consensus.json: JSON listing of consensus taxa assigned to each sequence cluster

This folder contains some of the raw sample-level outputs.

  • report/sample_id.composition.tsv: the taxonomic composition of this sample in TSV format.
  • report/sample_id.composition.json: the taxonomic composition of this sample in JSON format.
  • report/sample_id.blast_stats.tsv: Details of the blast matches against each respective OTU.
  • report/sample_id.summary.json: A JSON summary of the results and QC for this sample

The file sample_id.summary.json contains the final results and forms the basis for the HTML report. If you wish to roll your own report or feed results automatically into a e.g. database, this is where you should start.

Pipeline run metrics

This folder contains the pipeline run metrics

  • params_TIMESTAMP.json: A summary of all pipeline parameters in JSON Format (with timestamp of pipeline execution)
  • pipeline_dag.svg: the workflow graph (only available if GraphViz is installed)
  • pipeline_report.html: the (graphical) summary of all completed tasks and their resource usage
  • pipeline_report.txt: a short summary of this analysis run in text format
  • pipeline_timeline.htm: chronological report of compute tasks and their duration
  • pipeline_trace.txt: Detailed trace log of all processes and their various metrics