Contact details
Dr Alexandra M Lewin
(Alex Lewin)
This is my official homepage. I also have a page on the BGX website, which is often more up to date.
I am an RCUK Research Fellow in the Biostatistics Group in the Department of Epidemiology and Public Health at Imperial. I am also a member of the BGX collaboration between our department, the Statistics group in Bristol and the Imperial College Microarray Centre, which is developing flexible Bayesian models for gene expression (microarray) data.
Prior to my work in genomics I worked on the Landfill project as part of SAHSU, analysing cancer risks in populations living near landfill sites, and developing a Bayesian hierarchical model for the analysis of risk of birth defects around landfill sites.
For my PhD (in the Theoretical Physics Group at Imperial) I worked with Andy Albrecht and Joao Magueijo on detecting non-Gaussianity in the cosmic microwave background, and on the observability of oscillations in the microwave background and large-scale structure power spectra. Whilst visiting Andy Albrecht's Cosmology Group at UC Davis as part of my PhD I worked with the Supernova Cosmology Project in Berkeley, comparing different methods of analysing Type Ia supernova light curves.
Current Research
- Bayesian models for association between genetic markers and gene expression
- Bayesian analysis of Affymetrix Exon Arrays and detection of splice variants
- Prediction for protein interaction networks
- Bayesian hierarchical models for differential gene expression
- Exact tests for categorical genetic data
- Multiple testing
- Predictive checks for Bayesian models
See below for Publications, Software and Presentations.
Publications
Papers in refereed journals:
Kulinskaya, E. and Lewin, A. (2009) Testing for linkage and Hardy-Weinberg disequilibrium. Annals of Human Genetics, 73(2):253-262. journal page LDtests software
Kulinskaya, E. and Lewin, A. (2009) On fuzzy familywise error rate and false discovery rate procedures for discrete distributions. Biometrika 96(1):201-211. journal page fuzzyFDR software
Lewin, A., Bochkina, N. and Richardson, S. (2007) Fully Bayesian mixture model for differential gene expression: simulations and model checks. Statistical Applications in Genetics and Molecular Biology Vol. 6: Iss. 1, Article 36. journal page BGmix software
Lewin, A., Richardson, S., Marshall C., Glazier A. and Aitman T. (2006) Bayesian Modelling of Differential Gene Expression. Biometrics 2006 62: 10-18 journal page BayesDE software
Lewin, A. and Grieve, I. (2006). Grouping Gene Ontology terms to improve the assessment of gene set enrichment in microarray data. BMC Bioinformatics 2006 7:426 journal page PoGO software
Broët, P., Lewin, A., Richardson, S., Dalmasso, C. and Magdelenat, H. (2004) A mixture model based strategy for selecting sets of genes in multiclass response microarray experiments. Bioinformatics 2004 20(16):2562-2571; doi:10.1093/bioinformatics/bth285. journal page gmix software
Jarup, L., Briggs, D., de Hoogh, C., Morris, S., Hurt, C., Lewin, A., Maitland, I., Richardson, S., Wakefield J. and Elliott P. (2002) Cancer risks in populations living near landfill sites in Great Britain. British Journal of Cancer 86, 1732-1736 journal page
Lewin, A. and Albrecht, A. (2001) Can inflationary models of cosmic perturbations evade the secondary oscillation test? Physical Review D 64 023514 ps/pdf at arXiv
Lewin, A., Albrecht, A. and Magueijo, J. (1999) A new statistic for picking out Non-Gaussianity in the CMB. Monthly Notices of the Royal Astronomical Society 302 131-138 ps/pdf at arXiv
Book chapters, proceedings, discussions:
Bochkina N. and Lewin A. (2009/10). Classification for differential expression using Bayesian hierarchical models. Chapter in "Bayesian Modeling in Bioinformatics", eds. Dipak K. Dey, Samiran Ghosh, Bani K. Mallick (Chapman & Hall/CRC).
Lewin A. and Richardson, S. (2007). Bayesian methods for microarray data. Chapter in "Handbook of Statistical Genetics, 3rd edition", eds. David Balding, Martin Bishop and Chris Cannings (Chichester: Wiley).
Lewin A. and Richardson, S. (2007). Discussion of `Model-based clustering for social networks' by Mark Handcock, Adrian Raftery and Jeremy Tantrum, Journal of the Royal Statistical Society A, 170, 301-354.
Lewin A. and Richardson, S. (2007). Discussion of `FDR and Bayesian Multiple Comparison Rules' by Peter Mueller, Giovanni Parmigiani and Kenneth Rice. Contribution to Bayesian Statistics 8, eds. Susie Bayarri, James O. Berger, Jose M. Bernardo, A. Philip Dawid, David Heckerman, Adrian F. M. Smith and Mike West (Oxford University Press).
Hein, A.-M.K., Lewin A. and Richardson, S. (2006). Bayesian Hierarchical Models for Inference in Microarray Data. Chapter in "Bayesian Inference for Gene Expression and Proteomics", eds. Peter Mueller and Marina Vannucci (Cambridge University Press).
Magueijo, J. and Lewin, A. (1997). Non-Gaussian spectra and the search for cosmic strings. Contribution to the proceedings of "Topological defects and CMB", Rome, October 96 ps/pdf at arXiv
Software
BGmix: mixture model for differential expression
Bayesian mixture model for differential gene expression. Three components model non-differentially expressed, over-expressed and under-expressed genes separately. A number of parametric choices are available, along with predictive model checks.
This is the code for the model in Lewin et al. (2007), Stat. Appl. Gen. Mol. Biol.
LDtests: An R package providing several exact tests of Linkage Disequilibrium and Hardy-Weinberg Equilibrium.
Several exact 2-sided and 1-sided tests for LD and HWE are provided, including tests using conditional p-values proposed in Kulinskaya (2008) to overcome the problems of asymmetric distributions.
This is the code for the tests used in Kulinskaya and Lewin (2008).
fuzzyFDR: An R package to find fuzzy decision rules for multiple testing of hypotheses with discrete data.
Exact calculation of fuzzy decision rules for multiple testing. Choose to control the FDR (false discovery rate) using the Benjamini and Hochberg method, or the FWER (familywise error rate) using the Bonferroni method.
This is the code for the model in Kulinskaya and Lewin (2007).
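For orientation only, here is a minimal Python sketch of the ordinary (non-fuzzy) Bonferroni and Benjamini and Hochberg rules applied to a plain list of p-values. It is not the fuzzyFDR code and does not implement the fuzzy decision rules for discrete data; the function names and the default level of 0.05 are assumptions chosen for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H_i when p_i <= alpha / m, which controls the FWER at level alpha."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up rule: find the largest rank k with p_(k) <= k * alpha / m
    and reject the k hypotheses with the smallest p-values."""
    m = len(p_values)
    # Indices of the hypotheses, ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.005]
    print(bonferroni(p))          # decisions in the original order of p
    print(benjamini_hochberg(p))
```

With these four p-values the Bonferroni rule rejects only the two smallest, while the step-up rule rejects all four, which shows how FDR control is typically less conservative than FWER control.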
BayesDE: WinBUGS code for differential gene expression
This code can be used to find differential gene expression between two experimental conditions, using a Bayesian hierarchical model. The prior on the log fold changes between the conditions is unstructured, meaning that the genes are not grouped, but ranked. Non-linear normalization between arrays is included in the model.
This is the code for the model in Lewin et al. 2006, Biometrics.
gmix: a semi-parametric mixture model
This model is a fully Bayesian Normal mixture model with a variable number of components (programmed in Fortran using reversible jump MCMC). One mixture component has fixed mean (and possibly variance), allowing 'hypothesis testing' between a Normal null and an unknown alternative (modelled semi-parametrically using the rest of the mixture components).
This is the code for the model in Broët et al. 2004, Bioinformatics.
PoGO: Gene Ontology and differential expression
Software for finding statistically over-represented groups of Gene Ontology categories in microarray experiments.
This software was used for Lewin and Grieve 2006, BMC Bioinformatics.
Presentations
Statistical Testing of Gene Enrichment, and fuzzy decision rules for multiple testing
(Lewin and Grieve 2006, BMC Bioinformatics; Kulinskaya and Lewin, submitted)
- Talk given at the 25th Leeds Annual Statistical Research Workshop, Leeds July 2006. ppt
Bayesian Mixture Model for Differential Gene Expression
and predictive model checks
(Lewin et al. 2007, Stat. Appl. Gen. Mol. Biol.)
- Talk given at the BIRS workshop "Bioinformatics, Genetics and Stochastic Computation: Bridging the Gap", Banff, July 2007. ppt
- Poster given at the Valencia / ISBA Eighth World Meeting on Bayesian Statistics, Benidorm, June 2006. ppt
- Talk given at the European Statistician's Meeting, Oslo, July 2005. ppt
Bayesian Hierarchical Model for Differential Gene Expression (unstructured priors)
(Lewin et al. 2006, Biometrics; Lewin et al. 2007, Stat. Appl. Gen. Mol. Biol.)
- Talk given in the Biometrics Journal Showcase Session, IBS Channel Network Conference, Rolduc, Netherlands, May 2007. ppt
- Lecture given at the INSERM Workshop "Biostatistical Modelling of Postgenomic Data for Biomedical Research", Toulon, October 2005. ppt
Semi-parametric Mixture Model for Gene Expression Profiles
and Bayesian estimation of the False Discovery Rate
(Broët et al. 2004, Bioinformatics; Lewin et al. 2006, Biometrics)
- Poster presented at 'Mathematical and Statistical Aspects of Molecular Biology XIV', Cambridge, March 2004 and 'Statistics in Functional Genomics' Workshop, Ascona, July 2004. ppt
- Seminar given in the School of Applied Statistics, Reading, February 2004 and in the Institut Pasteur in Lille, March 2004. pdf ppt
Multiple Testing:
Seminar given as part of the Statistical Advisory Service Seminar Series, Imperial College, March 2005. pdf
- Bonferroni correction
- False discovery rate / positive false discovery rate
- Controlling error rates, step-wise procedures
- Benjamini and Hochberg, Storey methods
- FDR in Bayesian framework
Tutorial in R:
This is a beginner's practical tutorial in the statistical software R, using epidemiological type data (part of the MSc in Public Health given in the Dept. of Epidemiology and Public Health, November 2007). Includes basic exploratory analysis, hypothesis testing and confidence intervals. pdf
The two data sets needed for the tutorial can be found here:
Data on blood pressure
ONS smoking survey (module 5707)


