MiDAS 4: A global catalogue of full-length 16S rRNA gene sequences and taxonomy for studies of bacterial communities in wastewater treatment plants
Morten Kam Dahl Dueholm, Marta Nierychlo, Kasper Skytte Andersen, Vibeke Rudkjøbing, Simon Knutsson, Sonia Arriaga, Rune Bakke, Nico Boon, Faizal Bux, Magnus Christensson, Adeline Seak May Chua, Thomas P. Curtis, Eddie Cytryn, Leonardo Erijman, Claudia Etchebehere, Despo Fatta-Kassinos, Dominic Frigon, Maria Carolina Garcia-Chaves, April Z. Gu, Harald Horn, David Jenkins, Norbert Kreuzinger, Sheena Kumari, Ana Lanham, Yingyu Law, TorOve Leiknes, Eberhard Morgenroth, Adam Muszyński, Steve Petrovski, Maite Pijuan, Suraj Babu Pillai, Maria A. M. Reis, Qi Rong, Simona Rossetti, Robert Seviour, Nick Tooker, Pirjo Vainio, Mark van Loosdrecht, R. Vikraman, Jiří Wanner, David Weissbrodt, Xianghua Wen, Tong Zhang, Mads Albertsen, P. Nielsen
Microbial communities are responsible for biological wastewater treatment, but our knowledge of their diversity and function is still poor. Here, we generate more than 5 million high-quality, full-length 16S rRNA gene sequences from 740 wastewater treatment plants (WWTPs) across the world and use them to construct the ‘MiDAS 4’ database. MiDAS 4 is an amplicon sequence variant-resolved, full-length 16S rRNA gene reference database with a comprehensive taxonomy from domain to species level for all sequences. Using an independent dataset (269 WWTPs), we show that MiDAS 4 provides better coverage of WWTP bacteria and a higher rate of genus- and species-level classification than commonly used universal reference databases. Taking advantage of MiDAS 4, we carry out an amplicon-based, global-scale microbial community profiling of activated sludge plants using two common sets of primers targeting regions of the 16S rRNA gene, revealing how environmental conditions and biogeography shape the activated sludge microbiota. We also identify core and conditionally rare or abundant taxa, encompassing 966 genera and 1530 species that represent approximately 80% and 50% of the accumulated read abundance, respectively. Finally, we show that for well-studied functional guilds, such as nitrifiers or polyphosphate-accumulating organisms, the same genera are prevalent worldwide, with only a few abundant species in each genus.