Adaption of the global test idea to proteomics data with missing values

Abstract

Motivation:

Global test procedures are frequently used in gene expression analysis to study the relationship between a functional subset of RNA transcripts and an experimental group factor. However, these procedures have rarely been used to analyze high-throughput data from other sources, such as proteome expression data. The main difficulties in transferring global test procedures from genomics to proteomics data are that functional annotations are more complicated to obtain and that some types of proteomics data contain missing values that must be handled.

Results:

We propose a simple mixed linear model in combination with a permutation procedure and missing-value imputation to conduct global tests in proteomics experiments. This new approach is motivated by protein expression data obtained by 2-D gel electrophoresis in a mouse experiment from our current research. A simulation study showed that the power and test level of the mixed model alone can be affected by missing values in the dataset. Imputation of missing values corrected for a bias in some simulation settings. Our new approach makes it possible to rank Gene Ontology (GO) terms associated with protein sets. It is also helpful when a specific protein is represented by multiple spots on a 2-D gel, because these spots can likewise be treated as a protein set. Analysis of our data points to correlations between deficiency of the protein ‘calreticulin’ and protein sets related to biological processes in the heart muscle.
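The general idea of combining imputation with a permutation-based global test can be illustrated with a minimal Python sketch. It is not the mixed linear model described above and not the interface of ‘RepeatedHighDim’. It assumes, purely for illustration, row-mean imputation and a simple sum of squared group-mean differences as the global statistic, and all function names are hypothetical.

```python
import numpy as np

def impute_row_means(x):
    """Replace NaNs in each protein (row) by that row's observed mean.

    Assumption: row-mean imputation is used only for illustration;
    each row must have at least one observed value.
    """
    x = np.array(x, dtype=float)
    row_means = np.nanmean(x, axis=1, keepdims=True)
    return np.where(np.isnan(x), row_means, x)

def global_statistic(x, groups):
    """Sum over proteins of squared differences between the two group means.

    Assumption: a stand-in statistic, not the mixed-model statistic.
    """
    diff = x[:, groups == 1].mean(axis=1) - x[:, groups == 0].mean(axis=1)
    return float(np.sum(diff ** 2))

def permutation_global_test(x, groups, n_perm=2000, seed=0):
    """Return the observed statistic and a permutation p-value for one protein set.

    x: proteins-by-samples matrix, NaN marks a missing value.
    groups: 0/1 group label per sample.
    """
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    x = impute_row_means(x)
    observed = global_statistic(x, groups)
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(groups)  # shuffle group labels across samples
        if global_statistic(x, permuted) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Applying such a function to each protein set in turn, for example the proteins annotated to one GO term or the spots belonging to one protein, yields one p-value per set, which could then be used to rank the sets.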

Availability and implementation:

Our proposed approach is included in the R package ‘RepeatedHighDim’, which already contains a global test procedure for gene expression data. The package can be retrieved from http://cran.r-project.org/.

Contact:

klaus.jung@ams.med.uni-goettingen.de
