Daniel Garza

research documents

 

   

PhD thesis

 
Exploring microbial ecology and evolution with genome scale metabolic models
   
Abstract

Genomics and metagenomics have become the core scientific tools to investigate patterns in the evolution and ecology of microbes. The genetic content revealed by sequencing provides us with invaluable information about the identity, abundance, diversity, and distribution of microbes in different ecosystems. An important feature that a microbial genome reveals is the set of genome-encoded metabolic reactions. Based on these reactions, one can reconstruct the biochemical landscape of a microorganism and identify the pathways by which environmental metabolites are imported and used. Microorganisms obtain free energy to perform their functions and matter to build their biomass from their metabolic reactions. Biochemical reaction networks reconstructed from genomes are termed genome-scale metabolic models (GSMMs). In this thesis, I used GSMMs combined with three different computational frameworks to predict and describe three important patterns of microbial systems. The investigated patterns were: (i) the frequency distribution of genes in pan-genomes (Chapter 2); (ii) metagenomic signatures in human colorectal cancer (CRC) (Chapters 3 and 4); and (iii) the species abundance distribution in the metagenomes of the human microbiome (Chapter 5). GSMMs were used as the basic building blocks to explain these patterns and were integrated with the composition of the external environment into frameworks that mechanistically connect the patterns, the genome-encoded metabolic reactions, and the external environment. Overall, this thesis provides novel tools and frameworks to model and explain microbial systems starting from DNA sequences.

Chapter 1.1 contains a general introduction to some of the important patterns of metabolism and explains the three patterns of microbial systems that are listed above. Chapter 1.2 is a second introductory chapter where most of the microbial systems that were used in the following studies are reviewed in detail, including cultured and uncultured microorganisms, microbial genomics, metagenomics, patterns in microbial assembly, and the development of mechanistic models of microbial systems. Chapter 2 reports an investigation of the patterns in the dynamics and composition of pan-genomes. In this chapter, metabolic reactions were used as functional proxies for genes, and a framework that mechanistically assesses the major drivers of gene frequency in pan-genomes was developed. Chapters 3 and 4 report investigations of the patterns found in metagenomic signatures of human colorectal cancer (CRC). First, in Chapter 3, the potential effects of secreted bacterial molecules and surface proteins on CRC cells are assessed experimentally and associated with bacterial genomes. Next, Chapter 4 describes a framework that associates metabolites enriched in CRC with the bacteria that are also found to be enriched in CRC metagenomes. Chapter 5 reports an investigation of the patterns in species abundance distribution in microbiomes. It describes a framework that predicts environmental metabolomes by associating the growth rates predicted from GSMMs with the species abundance distribution measured by metagenomics. Chapter 6 concludes the thesis by discussing how the chapters are integrated and identifying their main limitations, and by summarizing important future steps for the development of general, unified, predictive, and informative models of microbial systems.

 

Cover

 

Main text

 

Supplementary material:

Supplementary File Description
Chapter 2

Table S2.1

Reactions in the toy example. List of toy reactions used in the in silico simulations of the toy model displayed in Figure 2.1.

Table S2.2

Bacterial and archaeal strains used in this study

Table S2.3

Environment ball

Table S2.4

Environment-driven reaction scores

Table S2.5

Elastic net predictions of the metabolite usage of 46 prokaryote families

Table S2.6

Correlation of variables related to FIRS, pan-reactomes, and metagenomes
Chapter 3

Table S3.1

Bacterial strains used in this study

Table S3.2

Cancer mutational profile of six human cell lines used in this study

Table S3.3

Growth rate scores, z-scores, and p-values measured from human cells incubated with bacterial cells. These values were computed from the average of four experimental replicates (see “cell growth analysis” in the methods section)

Table S3.4

Growth rate scores, z-scores, and p-values measured from human cells incubated with bacterial secretomes. These values were computed from the average of four experimental replicates (see “cell growth analysis” in the methods section)

Table S3.5

Literature summary of microbial virulence factors potentially associated with cancer

Table S3.6

Correlation between growth rate scores of cells and secretomes. The correlation values were obtained from the growth rate scores computed from the average of four experimental replicates of bacterial cells and secretomes for the group of strains that belong to the indicated bacterial family

Table S3.7

Statistical significance analysis of family-specific clustering of the growth rate scores

Table S3.8

Correlation between the pairwise phylogenetic distance between bacterial strains used in this study and the pairwise Euclidean distance of the growth rate scores

Table S3.9

Distribution of genes coding for virulence factors in the genomes of the bacteria used in this study. The TcdA toxin was present in bacteria of the Clostridiales order while other toxins were present within bacterial families

Table S3.10

Functional genomic terms significantly associated with growth rate scores within bacterial families
Chapter 4

Table S4.1

MAMBO, Western diet, and high-fiber diet basal environment

Table S4.2

MI, SGA, MR scores, CRC enrichment p-values, AUC, and mOTU prediction for all GSMMs

Table S4.3

Important metabolites for GSMMs
Chapter 5

Table S5.1

Top 20 metabolites for human skin predicted by averaging the normalized results of MAMBO on 50 human skin metagenomes. Myristic acid is used as a fragrance ingredient, cleansing agent, and emulsifier, and is readily adsorbed by the skin. Citrate is a commonly used ingredient to adjust the acidity of cosmetics. Nicotinamide ribonucleotide, aspartate, and N-acetyl glucosamine are used in skin conditioner products. N-acetyl glucosamine is a precursor to hyaluronic acid, a major component of skin structure, a pathway that responds to UV irradiation in skin [37]. Complete lists of all predicted metabolomes for 175 metagenomes are provided in Supplementary File 1

Table S5.2

Metabolomic profiles predicted by MAMBO and genes-only approach based on 37 oral, 50 skin, 39 stool and 49 vaginal metagenomes, and 6 experimentally measured metabolomic profiles. Values are normalised predicted abundances

Table S5.3

Pearson correlations between 6 measured metabolomic profiles and 175 metabolomic profiles predicted by MAMBO and the genes-only approach. Correlations are only shown if more than 5 of the predicted metabolites were measured, and vice versa

Master's thesis

 
Lysogeny and ecotypic diversification in the recent evolution of Vibrio cholerae
 

Main text