**Research Interests:**

I am generally interested in the development and application of statistical
methods to scientific problems, and in the development of general statistical
methodologies driven by these applications. Currently I have three major
research areas, distinct but related. The first is data mining and
information theory; the second is molecular biology and
evolutionary genetics; the third is the statistical analysis of metagenomic data.

1. **Data mining related topics:**

There are several main themes along this line of my research.

1. Exploratory methods for multivariate data, data reduction, and model interpretation; in particular, prototype methods.

2. Large p, small n problems, for both supervised and unsupervised learning. Related to this, I am interested in sufficient dimension reduction and inverse regression.

3. Classification problems with a very large number of classes.

4. Rare target identification in drug discovery.

2. **Statistical methods in molecular evolution:**

My interests in this direction include:

1. Statistical methods for predicting the structure or function of genes.

2. Statistical methods for detecting adaptive molecular evolution.

3. Inference and diagnostics in phylogeny.

4. Improving stochastic models of protein evolution (developing a general and flexible codon model framework that incorporates the structural information of genes, and extending such models to genome analysis).

3. **Metagenomic Analysis:**

I am currently working on the following problems:

1. Modeling the association between the host genome and the metagenome, and their interactions.

2. Modeling the joint influence of the host genome, metagenome, and environmental variables on disease states.

3. Developing data reduction and supervised learning methods, based on non-negative matrix factorization (NMF), to interpret metagenomic data.

4. Developing a set of approaches to calculate microbial beta diversity based on the joint distributions of species.

5. Continuing to develop the supervised version of the hierarchical Bayesian model BioMiCo (A Bayesian model for inference of metabolic divergence among microbial communities).