Abstract
|
The advent of high-throughput sequencing technologies has
greatly advanced the field of metagenomics, which studies the genetic
material of entire microbial communities without the need to
isolate and culture individual member organisms. In the past
few years, the number of metagenomic projects in natural environments
(such as soil and water) and host-associated environments (such as the
human gut) has grown exponentially. These projects offer researchers in
many disciplines, including ecology, environmental science, and
biomedicine, unprecedented opportunities.
The massive volume of sequencing data also poses great statistical and computational challenges. Characterizing the taxonomic composition of a metagenomic sample is essential for understanding the structure of its microbial community.
Here, we propose a mixture model to identify the multiple genomes present in a metagenomic sample and to estimate their relative abundances. The method is tested extensively on both simulated and real data and accurately identifies and quantifies multiple genomes in a metagenomic sample.
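To illustrate the general idea of estimating relative abundances with a mixture model, the following is a minimal sketch and not the method presented in the talk. It assumes a hypothetical matrix `likelihoods[i, j]` giving the probability of read i under candidate genome j (for example, derived from alignment scores) and runs a standard EM iteration for the mixing proportions. The function name, parameters, and input format are assumptions made for this example. The sketch also assumes every read has positive likelihood under at least one genome, and it estimates read-level proportions without correcting for genome length.

```python
import numpy as np


def estimate_abundances(likelihoods, n_iter=200, tol=1e-8):
    """EM estimate of mixing proportions for a finite mixture.

    likelihoods[i, j] is an assumed, precomputed P(read i | genome j).
    Returns a vector of proportions that sums to 1.
    """
    lik = np.asarray(likelihoods, dtype=float)
    n_reads, n_genomes = lik.shape

    # Start from uniform proportions over the candidate genomes.
    pi = np.full(n_genomes, 1.0 / n_genomes)

    for _ in range(n_iter):
        # E-step: posterior probability that each read came from each genome.
        weighted = lik * pi
        resp = weighted / weighted.sum(axis=1, keepdims=True)

        # M-step: new proportions are the average responsibilities.
        new_pi = resp.mean(axis=0)

        # Stop once the proportions no longer change appreciably.
        converged = np.abs(new_pi - pi).max() < tol
        pi = new_pi
        if converged:
            break

    return pi
```

For instance, with four reads of which three match only the first genome and one matches only the second, `estimate_abundances([[1, 0], [1, 0], [1, 0], [0, 1]])` yields proportions of 0.75 and 0.25. A genome whose estimated proportion shrinks toward zero can be read as absent from the sample, which is one simple way a mixture model can address identification and quantification together.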
The talk will also highlight current statistical and computational methods being developed to analyze metagenomic data, along with the challenges that remain.
|