Supplemental material for L. Omberg, G. H. Golub and O. Alter, "A Tensor Higher-Order Singular Value Decomposition for Integrative Analysis of DNA Microarray Data from Different Studies," Proceedings of the National Academy of Sciences (PNAS) USA 104 (47), pp. 18371–18376 (November 2007); doi: 10.1073/pnas.0709146104.
Abstract:
We describe the use of a higher-order singular value decomposition (HOSVD) in transforming a data tensor of genes × "x-settings," that is, different settings of the experimental variable x × "y-settings," which tabulates DNA microarray data from different studies, to a "core tensor" of "eigenarrays" × "x-eigengenes" × "y-eigengenes." Reformulating this multilinear HOSVD such that it decomposes the data tensor into a linear superposition of all outer products of an eigenarray, an x- and a y-eigengene, that is, rank-1 "subtensors," we define the significance of each subtensor in terms of the fraction of the overall information in the data tensor that it captures. We illustrate this HOSVD with an integration of genome-scale mRNA expression data from three yeast cell cycle time courses, two of which are under exposure to either hydrogen peroxide or menadione. We find that significant subtensors represent independent biological programs or experimental phenomena. The picture that emerges suggests that the conserved genes YKU70, MRE11, AIF1, and ZWF1, and the processes of retrotransposition, apoptosis, and the oxidative pentose phosphate pathway that these genes are involved in, may play significant, yet previously unrecognized, roles in the differential effects of hydrogen peroxide and menadione on cell cycle progression. A genome-scale correlation between DNA replication initiation and RNA transcription, which is equivalent to a recently discovered correlation and might be due to a previously unknown mechanism of regulation, is independently uncovered.
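The decomposition described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' Mathematica code: the helper names (`unfold`, `hosvd`, `subtensor_fractions`), the random toy tensor, and its dimensions are all hypothetical. The fraction is assumed here to be each squared core-tensor entry divided by the sum of all squared entries, which is one natural reading of "the fraction of the overall information in the data tensor that it captures"; consult the paper for the exact definition.

```python
import numpy as np

def unfold(tensor, mode):
    """Flatten a 3-way tensor into a matrix whose rows index the given mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor):
    """Return (core, [U, Vx, Vy]) for a genes x x-settings x y-settings tensor."""
    factors = []
    for mode in range(3):
        # Left singular vectors of each unfolding give the orthonormal basis
        # for that mode (eigenarrays, x-eigengenes, y-eigengenes).
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    # Project the data onto the three bases to obtain the core tensor.
    core = np.einsum("ijk,ia,jb,kc->abc", tensor, *factors)
    return core, factors

def subtensor_fractions(core):
    """Assumed significance measure: normalized squared core entries."""
    energy = core ** 2
    return energy / energy.sum()

# Toy data only (not the yeast dataset): 50 genes, 6 x-settings, 3 y-settings.
rng = np.random.default_rng(0)
data = rng.standard_normal((50, 6, 3))

core, (U, Vx, Vy) = hosvd(data)

# Superposition of all rank-1 subtensors, each weighted by its core entry.
reconstructed = np.einsum("abc,ia,jb,kc->ijk", core, U, Vx, Vy)

fractions = subtensor_fractions(core)
```

Each entry `core[a, b, c]` weights the outer product of eigenarray `U[:, a]`, x-eigengene `Vx[:, b]`, and y-eigengene `Vy[:, c]`; summing all such rank-1 subtensors recovers the data tensor, and `fractions` ranks them by their share of the total.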



A PDF format file, readable by Adobe Acrobat Reader.
Omberg_et_al_PNAS_2007.pdf



Supplemental figures with captions in PDF format files, readable by Adobe Acrobat Reader.



A Mathematica 5.2 code file, executable by Mathematica.
A PDF format file, readable by Adobe Acrobat Reader.



Tab-delimited text format files, readable by both Mathematica and Microsoft Excel.

Yeast mRNA Expression Dataset

Reproduced from Shapira et al. and Spellman et al.

mRNA Expression Classification of Yeast Genes

Reproduced from Gasch et al., Roberts et al., and Spellman et al.

Transcription Factors' DNA-Binding Classification of Yeast Genes

Reproduced from Harbison et al.

Replication Initiation Proteins' DNA-Binding Classification of Yeast Genes

Reproduced from Wyrick et al. and Simon et al.

Selected Eigenarrays and Superpositions of Eigenarrays