Coming up short: Identifying substrate and geographic biases in fungal sequence databases

Maryia Khomich, Filipa Cox, Carrie J. Andrew, Tom Andersen, Håvard Kauserud, Marie L. Davey

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Insufficient reference database coverage is a widely recognized limitation of molecular ecology approaches that rely on database matches for assignment of function or identity. Here, we use data from 65 amplicon high-throughput sequencing (HTS) datasets targeting the internal transcribed spacer (ITS) region of fungal rDNA to identify substrates and geographic areas whose underrepresentation in the available reference databases could have a meaningful impact on our ability to draw ecological conclusions. A total of 14 different substrates were investigated. Database representation was particularly poor for the fungal communities found in aquatic (freshwater and marine) and soil ecosystems. Aquatic ecosystems are identified as priority targets for the recovery of novel fungal lineages. A subset of the data, representing globally distributed soil samples, was used to identify geographic locations and terrestrial biomes with poor database representation. Database coverage was especially poor in tropical, subtropical, and Antarctic latitudes, and the Amazon, Southeast Asia, Australasia, and the Indian subcontinent are identified as priority areas for improving database coverage in fungi.
    Original language: English
    Pages (from-to): 75-80
    Number of pages: 5
    Journal: Fungal Ecology
    Volume: 36
    Early online date: 22 Sept 2018
    Publication status: Published - Dec 2018
