
My research focuses on the development of statistical and machine learning methodologies for high-dimensional biological data, with key contributions in the following areas:


1. Statistical Modelling, Inference, and Multisource Omics Data

·       Develop methods for integrating heterogeneous omics data (genomics, microbiome, clinical)

·       Address challenges of high dimensionality, noise, and measurement error

·       Contributions include:

o   Deconvolution methods for measurement error correction

o   Non-negative matrix factorization (NMF) for latent structure discovery

o   Cross-study integration methods for combining multiple datasets


2. Computational Molecular Evolution

·       Develop statistical models for evolutionary processes and phylogenetics

·       Key contributions:

o   Codon substitution models and likelihood-based clustering (LiBaC)

o   Methods for testing model adequacy and selection pressure inference

o   Gene coevolution and phylogenetic modelling


3. Dimension Reduction, Variable Selection, and FDR Control

·       Advance methods for high-dimensional data analysis

·       Contributions include:

o   Dimension reduction (e.g., Poisson PCA, interpretable PCA)

o   Sparse variable selection methods (e.g., SuRF)

o   False discovery rate (FDR) control, including hierarchical and structured approaches

·       Emphasis on scalability, theoretical guarantees, and interpretability
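
For readers unfamiliar with FDR control, the following is a minimal sketch of the classical Benjamini-Hochberg step-up procedure, the standard baseline that hierarchical and structured approaches build on. It is not the specific methodology developed in this research; the function name `benjamini_hochberg` and the example p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical BH step-up procedure (illustrative baseline only).

    Returns a list of booleans: True where the hypothesis is rejected
    while controlling the false discovery rate at level alpha
    (under independence or positive dependence of the p-values).
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank

    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, chosen only to exercise the function.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejected = benjamini_hochberg(example_p, alpha=0.05)
```

The step-up rule rejects every hypothesis up to the largest rank that meets its threshold, even if some smaller ranks individually fail theirs.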


4. Machine Learning Methods and Applications in Medicine

·       Develop machine learning and AI methods for healthcare applications

·       Focus areas:

o   Neural network-based feature and structure selection

o   Clinical decision support systems and predictive modelling

o   Applications in medical imaging, diagnostics, and emergency medicine

·       Emphasis on interpretable and reliable AI models


5. Temporal Dynamics of the Microbiome

·       Model time-evolving microbial systems using statistical and stochastic approaches

·       Contributions include:

o   Stochastic differential equation models (e.g., Ornstein–Uhlenbeck processes)

o   Optimal sampling design for time-series data

o   Integration with dimension reduction and variable selection methods

·       Applications to understanding microbial community dynamics and health outcomes
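
As a rough illustration of the kind of stochastic model mentioned above, the sketch below simulates a one-dimensional Ornstein-Uhlenbeck process, dX = theta (mu - X) dt + sigma dW, using its exact Gaussian transition over a fixed time step. The function name `simulate_ou` and all parameter values are assumptions chosen for illustration, not quantities taken from this research, and real microbiome models would typically be multivariate and fitted to data.

```python
import math
import random


def simulate_ou(x0, theta, mu, sigma, dt, n_steps, seed=0):
    """Simulate an Ornstein-Uhlenbeck path on an evenly spaced time grid.

    theta: mean-reversion rate (> 0), mu: long-run mean,
    sigma: diffusion scale, dt: time step between observations.
    Uses the exact transition distribution, so no discretization bias.
    """
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    # Standard deviation of the Gaussian increment over one step.
    noise_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))

    path = [x0]
    for _ in range(n_steps):
        previous = path[-1]
        path.append(mu + (previous - mu) * decay + noise_sd * rng.gauss(0.0, 1.0))
    return path


# Hypothetical settings: a trajectory starting away from its long-run mean.
example_path = simulate_ou(x0=2.0, theta=0.5, mu=0.0, sigma=0.3, dt=0.1, n_steps=200)
```

With `sigma=0` the path decays deterministically toward `mu`, which makes the mean-reversion behaviour easy to check by hand.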


Overall Research Vision

·       Integrate statistical theory and machine learning into unified frameworks

·       Develop robust, interpretable, and scalable methods for complex biological data

·       Advance applications in biomedical research, systems biology, and precision medicine