Motivation: Canonical correlation analysis (CCA) measures the association between two sets of multidimensional variables. We reasoned that CCA could provide an efficient and powerful approach for both univariate and multivariate gene-based tests of association without the need for permutation testing.
Results: Compared with a commonly used permutation-based approach, CCA (i) is faster; (ii) has an appropriate type-I error rate for normally distributed quantitative traits; (iii) provides comparable power for small to medium-sized genes (<100 kb); (iv) provides greater power when the causal variants are uncommon; and (v) provides considerably less power for larger genes (≥100 kb) when the causal variants have a broad minor allele frequency (MAF) spectrum. Application to a GWAS of leukocyte levels identified SAFB and a histone gene cluster as novel putative loci harboring multiple independent variants regulating lymphocyte and neutrophil counts.
Supplementary information: Supplementary material is available at Bioinformatics online.
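Illustrative sketch: the following is a minimal example of how a permutation-free, CCA-based gene-level statistic could be computed. It is not taken from the paper and is not necessarily the authors' implementation. It assumes that the genotype matrix `X` (n individuals by p SNPs, coded 0/1/2) and the trait matrix `Y` (n by q) both have full column rank after centering, and it swaps in one standard choice of test statistic, Wilks' lambda with Rao's F approximation. The function names `canonical_correlations` and `rao_f_statistic` are hypothetical.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Return the canonical correlations between the columns of X and Y.

    Assumption: both matrices have full column rank after centering
    (for example, SNPs in perfect LD have already been removed).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:  # a single trait is treated as an n x 1 matrix
        Y = Y[:, None]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the two column spaces; the singular values of
    # Qx^T Qy are the cosines of the principal angles between them,
    # i.e. the canonical correlations.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


def rao_f_statistic(r, n, p, q):
    """Wilks' lambda with Rao's F approximation.

    r: canonical correlations, n: sample size,
    p: number of SNPs, q: number of traits.
    Returns (F, df1, df2).
    """
    wilks = float(np.prod(1.0 - np.asarray(r) ** 2))
    denom = p ** 2 + q ** 2 - 5
    t = np.sqrt((p ** 2 * q ** 2 - 4) / denom) if denom > 0 else 1.0
    w = n - 1 - (p + q + 1) / 2.0
    df1 = p * q
    df2 = w * t - p * q / 2.0 + 1
    lam = wilks ** (1.0 / t)
    F = (1.0 - lam) / lam * df2 / df1
    return F, df1, df2
```

Under these assumptions, a p-value is the upper tail of the F distribution with `df1` and `df2` degrees of freedom, for example `scipy.stats.f.sf(F, df1, df2)`, so no permutations are required. For a single trait (q = 1) the squared canonical correlation equals the R² of a multiple regression of the trait on the SNPs, and the statistic reduces to the usual regression F-test.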