Combining Gene Expression and Molecular Marker Information for Mapping Complex Trait Genes: A Simulation Study

M. Pérez-Enciso, D. Gianola, M. Tenenhaus, M. A. Toro


2003, vol. 164, pp. 1597-1606

Departments: Economics and Decision Sciences

A method for mapping complex trait genes using cDNA microarray and molecular marker data jointly is presented and illustrated via simulation. We introduce a novel approach for simulating phenotypes and genotypes conditionally on real, publicly available, microarray data. The model assumes an underlying continuous latent variable (liability) related to some measured cDNA expression levels. Partial least-squares logistic regression is used to estimate the liability under several scenarios where the level of gene interaction, the gene effect, and the number of cDNA levels affecting liability are varied. The results suggest that: (1) the usefulness of microarray data for gene mapping increases when both the number of cDNA levels in the underlying liability and the QTL effect decrease and when genes are coexpressed; (2) the correlation between estimated and true liability is large, at least under our simulation settings; (3) it is unlikely that cDNA clones identified as significant with partial least squares (or with some other technique) are the true responsible cDNAs, especially as the number of clones in the liability increases; (4) the number of putatively significant cDNA levels increases critically if cDNAs are coexpressed in a cluster (however, the proportion of true causal cDNAs within the significant ones is similar to that in a no-coexpression scenario); and (5) data reduction is needed to smooth out the variability encountered in expression levels when these are analyzed individually.
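The liability model described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. Expression levels are drawn as independent standard normals instead of being taken from real microarray data, no marker or QTL effect is simulated, and the estimator is only the first partial least-squares component (covariance weights against the binary phenotype), not a full partial least-squares logistic regression. All function names, sample sizes, and the median threshold are assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate(n=200, p=50, n_causal=5, seed=1):
    """Simulate expression levels, a latent liability, and a binary phenotype."""
    rng = random.Random(seed)
    # Assumption: synthetic stand-in for microarray data (independent N(0, 1)).
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    # Liability = sum of the first n_causal expression levels + residual noise.
    liability = [sum(row[:n_causal]) + rng.gauss(0.0, 1.0) for row in X]
    # Threshold model (assumed): affected if liability exceeds the sample median.
    cut = statistics.median(liability)
    y = [1 if value > cut else 0 for value in liability]
    return X, liability, y


def first_pls_score(X, y):
    """Return unit-norm weights and scores of a first PLS-style component."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    # Weight of each cDNA = sample covariance between its level and the phenotype.
    w = [
        sum((row[j] - col_means[j]) * (yi - y_mean) for row, yi in zip(X, y)) / (n - 1)
        for j in range(p)
    ]
    norm = math.sqrt(sum(wj * wj for wj in w))
    w = [wj / norm for wj in w]
    # Score = weighted sum of centered expression levels (the data reduction step).
    t = [sum(wj * (row[j] - col_means[j]) for j, wj in enumerate(w)) for row in X]
    return w, t


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (z - mb) for x, z in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((z - mb) ** 2 for z in b)
    return sab / math.sqrt(saa * sbb)


if __name__ == "__main__":
    X, liability, y = simulate()
    w, t = first_pls_score(X, y)
    print("correlation between score and true liability:", pearson(t, liability))
```

Because the score pools information over all cDNA levels, it can track the true liability even when individual weights are too noisy to single out the causal clones, which is in the spirit of conclusions (2), (3), and (5) above.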