DNA barcoding and metabarcoding are revolutionizing the study and survey of biodiversity. To assign taxonomic labels to the DNA sequence data retrieved, these methods depend strongly on comprehensive and accurate reference databases. Reliable databases linking biological sequences and taxonomic data can be, and often have been, built using mainstream tools such as spreadsheet software. However, spreadsheets quickly become insufficient when the data grow to thousands of taxa and sequences to be matched, and validation operations become more complex and error-prone when done manually. Thus, there is a clear need to provide scientists with user-friendly, reliable and powerful tools to manipulate and manage DNA reference databases in tractable, sound and efficient ways. Here, we introduce the R package refdb as an environment for semi-automatic and assisted construction of DNA reference libraries. The refdb package is a reference database manager offering a set of powerful functions to import, organize, clean, filter, audit and export the data. It is broadly applicable to the metabarcoding data typically obtained in biodiversity and biomonitoring studies. We present the main features of the package and outline how refdb can speed up reference database generation, management and handling, and thus contribute to standardization and repeatability in barcoding and metabarcoding studies.
               