Understanding the human gut microbiota at a finer resolution is crucial for uncovering how specific bacteria impact health and disease. Now, researchers have developed an extensive catalog of human gut microbes, which helped them to uncover functional differences linked to diseases such as colorectal cancer.
The work, published in Cell Host & Microbe, suggests that profiling the human gut microbiota at a finer level provides more reproducible insights into how specific bacteria influence health and disease.
So far, most studies of gut bacteria have looked only at the species level, which is too broad to reveal important differences, while looking at individual strains is too detailed and inconsistent. A better solution is to focus on subspecies, groups within a species that have unique functions. This level captures meaningful genetic and functional differences within species while remaining stable across studies.
Matija Trickovic at the University of Geneva in Switzerland and his colleagues set out to create a comprehensive reference of human gut bacteria at the subspecies level, which they called the HuMSub catalog.
Subspecies level
To build the HuMSub catalog, the researchers filtered and clustered nearly 226,000 gut bacterial genomes into species and subspecies groups. Unlike older methods that merge different strains together, the team focused on differences in coding sequences to reliably separate bacteria into groups that share distinct biological traits. In total, they identified more than 5,300 subspecies across 977 species.
To test whether subspecies could provide a balance between detail and generalizability, the researchers analyzed more than 5,000 human gut microbiota samples from around the world. Most subspecies—about 62%—were shared across multiple continents, while only a small fraction was absent. For example, one subspecies of Anaerostipes hadrus was common in parts of Europe and Korea but almost absent in Canada, India, and Madagascar.
Some regions, especially in Africa, showed unique subspecies linked to lifestyle differences. These findings show that subspecies-level analysis captures important patterns that are missed at both species and strain levels, offering a powerful way to study the microbiota across diverse populations, the authors say.
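As a rough illustration of how a figure such as the share of subspecies found on multiple continents could be computed, here is a minimal Python sketch. The sample records, subspecies labels, and function names are hypothetical and are not taken from the study; the authors' actual pipeline is not described here.

```python
from collections import defaultdict

# Hypothetical toy data: (sample ID, continent, subspecies detected in that sample).
# The subspecies labels are invented for illustration only.
samples = [
    ("s1", "Europe", {"A_hadrus_sub1", "B_longum_sub2"}),
    ("s2", "Asia", {"A_hadrus_sub1", "B_longum_sub2"}),
    ("s3", "Africa", {"P_copri_sub3", "B_longum_sub2"}),
    ("s4", "North America", {"E_coli_sub1"}),
]


def continents_per_subspecies(records):
    """Map each subspecies to the set of continents where it was detected."""
    seen = defaultdict(set)
    for _sample_id, continent, subspecies in records:
        for name in subspecies:
            seen[name].add(continent)
    return dict(seen)


def shared_fraction(records, min_continents=2):
    """Fraction of subspecies detected on at least `min_continents` continents."""
    seen = continents_per_subspecies(records)
    if not seen:
        return 0.0
    shared = sum(1 for places in seen.values() if len(places) >= min_continents)
    return shared / len(seen)


print(shared_fraction(samples))
```

In this toy data set, two of the four invented subspecies appear on more than one continent. A real analysis would also need a detection threshold for deciding when a subspecies counts as present in a sample.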
Unprecedented depth
Next, the researchers analyzed gut microbiota data from seven studies on colorectal cancer, which together included more than 1,000 people. About 200 subspecies were associated with colorectal cancer, and in some cases, one subspecies of a microbe was strongly linked to cancer, while its close sibling subspecies was not. These included certain subspecies of Fusobacterium animalis and Porphyromonas asaccharolytica.
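To make the idea of sibling subspecies with different disease associations concrete, the sketch below computes a simple odds ratio from presence and absence counts in patients and healthy controls. The counts are invented, and the odds ratio is only one possible measure of association; the article does not say which statistical test the researchers used.

```python
def odds_ratio(case_present, case_absent, control_present, control_absent):
    """Odds ratio for carrying a subspecies in cases versus controls.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that a zero
    count does not cause a division by zero.
    """
    a = case_present + 0.5
    b = case_absent + 0.5
    c = control_present + 0.5
    d = control_absent + 0.5
    return (a * d) / (b * c)


# Hypothetical counts for two sibling subspecies of the same species.
# Subspecies 1: present in 40 of 100 patients and 10 of 100 controls.
sub1 = odds_ratio(40, 60, 10, 90)
# Subspecies 2: present in 20 of 100 patients and 20 of 100 controls.
sub2 = odds_ratio(20, 80, 20, 80)

print(f"subspecies 1 odds ratio: {sub1:.2f}")
print(f"subspecies 2 odds ratio: {sub2:.2f}")
```

An odds ratio well above 1 suggests the subspecies is more common in patients, while a value near 1 suggests no association. A species-level analysis would pool both subspecies and blur that contrast.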
Finally, the team trained machine learning models to distinguish colorectal cancer patients from healthy people. Although traditional species-level models performed well, subspecies-level models outperformed them, achieving higher accuracy and better reliability. The predictions improved further when combined with other clinical tests, such as fecal blood tests.
“Dwelling on the subspecies differences, the HuMSub catalog enables discovery of new mechanistic insights into the interplay between the microbiota and host phenotype and sets the ground for analyses of new and reanalyses of existing datasets at an unprecedented depth,” the authors say.