An Empirical Bayesian Method for Differential Expression Studies Using One-Channel Microarray Data
Gene expression microarrays have become powerful tools in many areas of biological and biomedical research. These technologies allow researchers to measure the expression levels of thousands of genes in a tissue or cell sample simultaneously. One of the most common types of microarray experiment is an exploratory study that compares two samples (e.g., tumor and normal tissue) to produce a list of genes that might be differentially expressed between them. Differential expression is typically measured by computing a t-statistic or similar statistic for each gene. The genes are then ranked by the absolute value of the t-statistic, and the twenty or fifty best candidates might be studied in follow-up experiments. When sample sizes are small, the t-statistic can be problematic: variances are estimated poorly, and the "top 20" list is often dominated by the genes with the lowest variance estimates. Lönnstedt and Speed (2001) proposed an empirical Bayes method for avoiding this problem and showed that their approach has lower false positive and false negative rates than t-statistic-based methods. However, their method was designed mainly for two-channel microarrays, which produce paired data. It is not suitable for technologies such as one-channel arrays, which produce unpaired data. We propose a simplification of Lönnstedt and Speed's method and then extend it to unpaired data with two or more independent treatments. We demonstrate our method on both simulated and real data. When the number of replicates is small, gene rankings based on our statistic appear to be much more reliable than rankings based on the t-statistic.
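The t-statistic ranking described above can be sketched in a few lines of Python. This is only an illustration of the baseline procedure, not the empirical Bayes method proposed here. It assumes a pooled-variance two-sample t-statistic (one of several possible choices), and the function names `pooled_variance`, `pooled_t`, and `rank_by_abs_t` are hypothetical. The simulated data at the end are drawn with no true differential expression, so any gene that reaches the top of the list does so by chance.

```python
import math
import random
from statistics import mean, median, variance


def pooled_variance(x, y):
    """Pooled sample variance of two unpaired groups of replicates."""
    nx, ny = len(x), len(y)
    return ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)


def pooled_t(x, y):
    """Two-sample t-statistic for one gene (assumes equal variances)."""
    nx, ny = len(x), len(y)
    std_err = math.sqrt(pooled_variance(x, y) * (1.0 / nx + 1.0 / ny))
    return (mean(x) - mean(y)) / std_err


def rank_by_abs_t(group_a, group_b):
    """Return gene indices sorted by decreasing |t|, plus the t-statistics.

    group_a and group_b are lists with one list of replicates per gene.
    """
    t_stats = [pooled_t(a, b) for a, b in zip(group_a, group_b)]
    order = sorted(range(len(t_stats)),
                   key=lambda g: abs(t_stats[g]), reverse=True)
    return order, t_stats


if __name__ == "__main__":
    random.seed(0)
    n_genes, n_reps = 1000, 3  # small number of replicates per group

    # Null simulation: every gene has the same distribution in both groups.
    a = [[random.gauss(0, 1) for _ in range(n_reps)] for _ in range(n_genes)]
    b = [[random.gauss(0, 1) for _ in range(n_reps)] for _ in range(n_genes)]

    order, t_stats = rank_by_abs_t(a, b)
    pooled = [pooled_variance(x, y) for x, y in zip(a, b)]

    # Compare variance estimates of the "top 20" genes with all genes.
    top20 = [pooled[g] for g in order[:20]]
    print("median pooled variance, top 20 genes:", median(top20))
    print("median pooled variance, all genes:  ", median(pooled))
```

Comparing the two printed medians gives a rough sense of how strongly the top of the list is driven by small variance estimates when there are only three replicates per group.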