A Study of Convolution Models for Background Correction of BeadArrays
The robust multi-array average (RMA), since its introduction in Irizarry, Bolstad,
Collin, Cope, Hobbs, and Speed (2003a); Irizarry, Hobbs, Collin, Beazer-Barclay,
Antonellis, Scherf, and Speed (2003b); Irizarry, Wu, and Jaffee (2006), has gained popularity
among bioinformaticians. It has evolved from the exponential-normal convolution to the
gamma-normal convolution, from single to two channels and from the Affymetrix to the
Illumina platform.
The Illumina design provides two probe types: the regular and the control probes.
This design is well suited to studying the probability distributions of both probe types,
and one can apply a convolution model to compute an estimator of the true intensity.
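As an illustration of how such a convolution model yields an intensity estimator, the following is a minimal sketch of the classical exponential-normal case, not the estimators proposed in this paper. It assumes that an observed regular-probe intensity is X = S + B, with the true signal S exponential with mean alpha and the noise B normal with mean mu and standard deviation sigma, independent of S. Under these assumptions S given X = x is a normal distribution truncated at zero, so the estimator is its mean. The function names and the moment-based choice of alpha are illustrative assumptions.

```python
import math
import numpy as np


def normal_pdf(z):
    """Standard normal density."""
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via erfc for tail accuracy."""
    return 0.5 * np.vectorize(math.erfc)(-z / math.sqrt(2.0))


def normexp_correct(regular, control):
    """Background-correct regular probes under an exponential-normal model.

    Assumption: noise parameters (mu, sigma) are taken from the control
    probes, and the exponential mean alpha is set by matching first moments.
    """
    regular = np.asarray(regular, dtype=float)
    control = np.asarray(control, dtype=float)

    mu = control.mean()
    sigma = control.std(ddof=1)
    # E[X] = alpha + mu, so alpha is estimated by mean(X) - mu (kept positive).
    alpha = max(regular.mean() - mu, 1e-8)

    # S | X = x is N(mu_sx, sigma^2) truncated to (0, inf).
    mu_sx = regular - mu - sigma ** 2 / alpha
    z = mu_sx / sigma
    # Mean of the truncated normal: always positive, increasing in x.
    return mu_sx + sigma * normal_pdf(z) / normal_cdf(z)
```

The estimate is strictly positive even when an observed intensity falls below the estimated background mean, which is the main practical advantage over plain background subtraction.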
In this paper, we study the existing convolution models for background correction of
Illumina BeadArrays in the literature and give a new estimator for the true intensity,
assuming that the intensity value is exponentially or gamma distributed and the noise has
a lognormal distribution.
Our study shows that one of our proposed models, the gamma-lognormal with the
method of moments for parameter estimation, is the optimal one for the benchmarking
data set with benchmarking criteria, while the gamma-normal model has the best
performance for the benchmarking data set with simulation criteria.
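To make the method of moments concrete, the sketch below shows one way the parameters of a gamma-lognormal model could be obtained; it is an illustration under stated assumptions rather than the procedure used in this paper. It assumes that the control probes measure the lognormal noise B alone, that the regular probes measure X = S + B with S gamma distributed and independent of B, and that the mean and variance of S are therefore the differences of the corresponding sample moments. The function names are hypothetical.

```python
import numpy as np


def lognormal_moments(control):
    """Method-of-moments estimates (mu, sigma) of a lognormal distribution.

    Uses E[B] = exp(mu + sigma^2 / 2) and
    Var[B] = (exp(sigma^2) - 1) * E[B]^2.
    """
    control = np.asarray(control, dtype=float)
    mean_b = control.mean()
    var_b = control.var(ddof=1)
    sigma2 = np.log1p(var_b / mean_b ** 2)
    mu = np.log(mean_b) - 0.5 * sigma2
    return mu, np.sqrt(sigma2)


def gamma_moments(regular, control):
    """Method-of-moments estimates (shape, scale) of the gamma signal.

    Assumption: signal and noise are independent, so
    E[S] = E[X] - E[B] and Var[S] = Var[X] - Var[B].
    """
    regular = np.asarray(regular, dtype=float)
    control = np.asarray(control, dtype=float)
    mean_s = regular.mean() - control.mean()
    var_s = regular.var(ddof=1) - control.var(ddof=1)
    if mean_s <= 0 or var_s <= 0:
        raise ValueError("sample moments are incompatible with the model")
    # For a gamma distribution: mean = shape * scale, var = shape * scale^2.
    return mean_s ** 2 / var_s, var_s / mean_s
```

Because these estimates need only sample means and variances, they remain available when a likelihood-based fit fails to converge, at the cost of some statistical efficiency.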
For the publicly available data sets, the gamma-normal and the exponential-gamma
models with the maximum likelihood estimation method cannot be used, and our proposed
exponential-lognormal and gamma-lognormal models have the best performance, showing
a moderate error in background correction and in the parametrization.
Allen JD, Chen M, Xie Y (2009). "Model-Based Background Correction (MBCB): R Methods
and GUI for Illumina Bead-array Data." Journal of Cancer Science and Therapy, 1(1), 25-27.
Baek J, Son YS, McLachlan GJ (2007). "Segmentation and intensity estimation of microarray
images using a gamma-t mixture model." Bioinformatics, 23(4), 458-465.
Bolstad BM (2004). "Low Level Analysis of High-density Oligonucleotide Array Data: Background, Normalization and Summarization." Ph.D. thesis, University of California, Berkeley.
Bolstad BM, Irizarry RA, Astrand M, Speed TP (2003). "A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance." Bioinformatics, 19(2), 185-193.
Chen M, Xie Y, Story MD (2011). "An Exponential-Gamma Convolution Model for Background Correction of Illumina BeadArray Data." Communications in Statistics - Theory and Methods, 40(17), 3055-3069.
Cope LM, Irizarry RA, Jaffee HA, Wu Z, Speed TP (2004). "A benchmark for Affymetrix GeneChip expression measures." Bioinformatics, 20, 323-331.
Ding LH, Xie Y, Park S, Xiao G, Story MD (2008). "Enhanced identification and biological validation of differential gene expression via Illumina whole-genome expression arrays through the use of the model-based background correction methodology." Nucleic Acids
Dunning MJ, Barbosa-Morais NL, Lynch AG, Tavare S, Ritchie ME (2008). "Statistical issues in the analysis of Illumina data." BMC Bioinformatics, 9(85).
Forcheh AC, Verbeke G, Kasim A, Lin D, Shkedy Z, Talloen W, Gohlmann HW, Clement L
(2012). "Gene Filtering in the Analysis of Illumina Microarrays Experiments." Statistical Applications in Genetics and Molecular Biology, 11(2).
Hochreiter S, Clevert DA, Obermayer K (2006). "A new summarization method for Affymetrix probe level data." Bioinformatics, 22(8), 943-949.
Huber W, Irizarry RA, Gentleman R (2005a). Bioinformatics and Computational Biology
Solutions Using R and Bioconductor, chapter Preprocessing Overview. Springer.
Huber W, von Heydebreck A, Vingron M (2004). "Error models for microarray intensities."
Technical Report Paper 6, Bioconductor Project Working Papers.
Huber W, von Heydebreck A, Vingron M (2005b). "An introduction to low-level analysis
methods of DNA microarray data." Technical Report Paper 9, Bioconductor Project Working Papers.
Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP (2003a). "Summaries of
Affymetrix GeneChip probe level data." Nucleic Acids Research, 31(4).
Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP
(2003b). "Exploration, Normalization and Summaries of High Density Oligonucleotide Array Probe Level Data." Biostatistics, 4(2), 249-264.
Irizarry RA, Wu Z, Jaffee HA (2006). "Comparison of Affymetrix GeneChip expression measures." Bioinformatics, 22(7), 789-794.
Li C, Wong WH (2001). "Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection." Proceedings of the National Academy of Sciences, 98(1), 31-36.
Plancade S, Rozenholc Y, Lund E (2011). "Improving background correction for Illumina
BeadArrays: the normal-gamma model."
Plancade S, Rozenholc Y, Lund E (2012). "Generalization of the normal-exponential model:
exploration of a more accurate parameterisation for the signal distribution on Illumina
BeadArrays." BMC Bioinformatics, 13(329).
Shi W, Oshlack A, Smyth GK (2010). "Optimizing the noise versus bias trade-off for Illumina
whole genome expression BeadChips." Nucleic Acids Research, 38(22), e204.
Silver JD, Ritchie ME, Smyth GK (2009). "Microarray background correction: maximum
likelihood estimation for the normal-exponential convolution model." Biostatistics, 10,
Triche TJ, Weisenberger DJ, Berg DVD, Laird PW, Siegmund KD (2013). "Low-level processing of Illumina Infinium DNA Methylation BeadArrays." Nucleic Acids Research, pp. 1-11.
Waldron L (2013). ffpe: Quality assessment and control for FFPE microarray expression
data, R package version 1.4.0 edition.
Waldron L, Ogino S, Hoshida Y, Shima K, Reed AEM, Simpson PT, Baba Y, Nosho K,
Segata N, Vargas AC, Cummings M, Lakhani SR, Kirkner GJ, Giovannucci E, Quackenbush
J, Golub TR, Fuchs CS, Parmigiani G, Huttenhower C (2012). "Expression Profiling of
Archival Tumors for Long-term Health Studies." Clinical Cancer Research.
Wu Z, Irizarry RA, Gentleman R, Martinez-Murillo F, Spencer F (2004). "A model-based
background adjustment for oligonucleotide expression arrays." Journal of the American
Statistical Association, 99(468), 909-917.
Xie Y, Wang X, Story MD (2009). "Statistical methods of background correction for Illumina
BeadArray data." Bioinformatics, 25(6), 751-757.
Zhang J (2013). "Reducing the bias of the maximum likelihood estimator of the shape parameter for the gamma distribution." Computational Statistics, 28, 1715-1724.
Zhu W, Zeng N, Wang N (2010). "Sensitivity, Specificity, Accuracy, Associated Confidence Interval and ROC Analysis with Practical SAS Implementations."
The Austrian Journal of Statistics publishes open access articles under the terms of the Creative Commons Attribution (CC BY) License.
The Creative Commons Attribution License (CC-BY) allows users to copy, distribute and transmit an article, adapt the article and make commercial use of the article. The CC BY license permits commercial and non-commercial re-use of an open access article, as long as the author is properly attributed.
Copyright on any research article published by the Austrian Journal of Statistics is retained by the author(s). Authors grant the Austrian Journal of Statistics a license to publish the article and identify itself as the original publisher. Authors also grant any third party the right to use the article freely as long as its original authors, citation details and publisher are identified.