Illumina Mammals and birds
Description
Sequencing parameters:
- Platform: Illumina
- Read-length: paired-end 150bp or longer
- Targets: mammals, birds
Relevant publications:
- Development of a DNA metabarcoding method for the identification of fifteen mammalian and six poultry species in food
- Identification of Mammalian and Poultry Species in Food and Pet Food Samples Using 16S rDNA Metabarcoding
- Benchmarking and Validation of a Bioinformatics Workflow for Meat Species Identification Using 16S rDNA Metabarcoding
- Interlaboratory Validation of a DNA Metabarcoding Assay for Mammalian and Poultry Species to Detect Food Adulteration
- Detection of adulterated meat products by a next-generation sequencing-based metabarcoding analysis within the framework of the operation OPSON X: a cooperative project of the German National Reference Centre for Authentic Food (NRZ-Authent) and the competent German food control authorities
Official Methods:
- Amtliche Sammlung von Untersuchungsverfahren: BVL L 00.00-184 (German)
Run with:
--primer_set amniotes_dobrovolny
For example:
nextflow run bio-raum/FooDMe2 \
-r main \
-profile myprofile \ # (1)!
--input samples.tsv \
--primer_set amniotes_dobrovolny
- See the installation guide for more details on this parameter
Configuration
Following parameters are set to a different value than default when running this method:
database
: MIDORI`s LRNA (16S ribosomal subunit)max_expected_errors
: 2amplicon_min_length
: 65amplicon_max_length
: 110primers_fasta
:/assets/primers/amniotes_dobrovolny.fasta
taxid_filter
: 32524 (Amniotes)cutadapt_trim_3p
: true
Validation
Mammals and birds 16S Illumina metabarcoding, method from Dobrovolny paper with the dataset from the FooDMe1 paper.
For this method, a benchmarking (or validation) profile is provided in the FooDMe2 distribution:
Running this will fetch the dataset from ENA, run the workflow with the amniotes_dobrovolny
preconfiguration and then compare the resutls to the expected composition defined under assets/validation/dobrovolny_benchmark_groundtruth.csv
. A noise filter fo 0.1% of the total read number is applied to each sample and the composition is matched to up to the genus level.
In the resulting Excel file we can quickly count the number of TP, FP and FN and calculate precision and recall for the analysis:
- | Expect Positive | Expect negative |
---|---|---|
Predicted Positive | 524 | 31 |
Predicted Negative | 19 | - |
Which means a precision of 94.4% and recall of 96.5% out of the box.
However there are 14 FN occurences of fallow deer (Dama dama) in the table, which was not amplified in the initial method and therefore cannot be detected. These can be ignored for the validation.
Another problem in this dataset is that Kangaroo (Macropodidae), a family node, was expected with no information on the species, this results in a negative results in the benchmarking tools. We can correct this be converting all FP results for Kangaroo into FP (if a kangaroo speice was detected of course), these are all 17 occurences where either of Macropus giganteus, Osphranter robusuts, or Osphranter rufus were detected.
The corrected confusion table now looks like this:
- | Expect Positive | Expect negative |
---|---|---|
Predicted Positive | 541 | 14 |
Predicted Negative | 5 | - |
Now resulting in a precision of 97.4% and a recall of 99.1%.