Outlier Analysis for Gene Expression Data
-
Abstract
The rapid development of technologies that generate arrays of gene data enables a global view of the transcription levels of hundreds of thousands of genes simultaneously. Outlier detection in gene data is an important problem, but it is made difficult by the high dimensionality of the data. The sparsity of data in high-dimensional space makes every point appear to be a relatively good outlier under traditional distance-based definitions, so finding outliers in high-dimensional data is more complex than in the low-dimensional case. In this paper, some basic outlier analysis algorithms are discussed and a new genetic algorithm is presented. The algorithm searches for the best dimension projections using a revised cell-based algorithm and provides explanations for the solutions it finds. It can solve the outlier detection problem for gene expression data and for other high-dimensional data as well.
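The sparsity argument above can be illustrated with a small numerical sketch. It is not the algorithm presented in this paper. It assumes synthetic points drawn uniformly from the unit hypercube rather than real expression data, and the helper name `relative_contrast` is hypothetical. It measures how much the farthest point differs from the nearest one, relative to the nearest distance, as the number of dimensions grows.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (d_max - d_min) / d_min over the Euclidean distances from one
    random query point to n_points random points in the unit hypercube.

    Assumption: uniform synthetic data, used only to illustrate sparsity.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # Print the contrast for increasing dimensionality.
    for dims in (2, 10, 100, 1000):
        print(dims, round(relative_contrast(200, dims), 3))
```

Under these assumptions the contrast should shrink as the dimensionality rises: all points end up at nearly the same distance from the query, so a threshold on full-space distance no longer separates outliers from ordinary points. This is one way to see why searching over lower-dimensional projections can be more informative.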