Supplemental material for O. Alter and G. H. Golub, "Integrative Analysis of Genome-Scale Data by Using Pseudoinverse Projection Predicts Novel Correlation Between DNA Replication and RNA Transcription," Proceedings of the National Academy of Sciences (PNAS) USA 101 (47), pp. 16577–16582 (November 2004); doi: 10.1073/pnas.0406767101.
We describe an integrative data-driven mathematical framework that formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. By using pseudoinverse projection, the molecular biological profiles of the data samples are least-squares-approximated as superpositions of the basis profiles. Reconstruction of the data in the basis simulates experimental observation of only the cellular states manifest in the data that correspond to those of the basis. Classification of the data samples according to their reconstruction in the basis, rather than their overall measured profiles, maps the cellular states of the data onto those of the basis and gives a global picture of the correlations and possibly also causal coordination of these two sets of states. We illustrate this framework with an integration of yeast genome-scale proteins' DNA-binding data with cell cycle mRNA expression time course data. Novel correlation between DNA replication initiation and RNA transcription during the yeast cell cycle, which might be due to a previously unknown mechanism of regulation, is predicted.
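The projection step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the Mathematica code distributed with this supplement: the function names `project_onto_basis` and `classify_by_basis`, the toy matrices, and the largest-coefficient classification rule are all hypothetical choices made for the example. Rows stand for genes, columns of `basis` for basis profiles, and columns of `data` for data samples.

```python
import numpy as np


def project_onto_basis(basis, data):
    """Least-squares-approximate each data column as a superposition of basis columns.

    basis: (genes x K) array, one basis profile per column.
    data:  (genes x N) array, one data sample per column.
    Returns the (K x N) coefficients and the (genes x N) reconstruction.
    """
    basis = np.asarray(basis, dtype=float)
    data = np.asarray(data, dtype=float)
    # The Moore-Penrose pseudoinverse gives the least-squares coefficients.
    coeffs = np.linalg.pinv(basis) @ data
    # Reconstruction keeps only the part of each sample that lies in the span of the basis.
    reconstruction = basis @ coeffs
    return coeffs, reconstruction


def classify_by_basis(coeffs):
    """Hypothetical rule: label each sample by its largest-magnitude basis coefficient."""
    return np.argmax(np.abs(coeffs), axis=0)


# Toy example (made-up numbers): 3 genes, 2 basis profiles, 2 data samples.
basis = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
data = np.array([[2.0, 1.0],
                 [0.5, 4.0],
                 [2.5, 2.0]])

coeffs, reconstruction = project_onto_basis(basis, data)
labels = classify_by_basis(coeffs)
residual = data - reconstruction  # the part of the data not represented in the basis
```

In this toy case the first sample lies entirely in the span of the basis, so its residual vanishes, while the second sample has a component orthogonal to both basis profiles that the reconstruction discards. Classifying on `coeffs` rather than on the raw columns of `data` is what the text means by classifying samples according to their reconstruction in the basis; the specific argmax rule here is only a stand-in.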

A PDF format file, readable by Adobe Acrobat Reader.

Mathematica 5.0 code files, executable by Mathematica.
PDF format files, readable by Adobe Acrobat Reader.

Tab-delimited text format files, readable by both Mathematica and Microsoft Excel.

Yeast RNA Expression Dataset for SVD Basis Computation

Reproduced from Spellman et al. and Alter et al.

Classification List of Yeast Cell Cycle-Regulated Genes

Yeast and Human RNA Expression Datasets for GSVD Basis Computation

Reproduced from Whitfield et al. and Alter et al.

Yeast Proteins' DNA-Binding Dataset for Pseudoinverse Projection Integration with the SVD and GSVD RNA Expression Bases

Reproduced from Simon et al. and Wyrick et al.

Yeast RNA Expression Datasets for Pseudoinverse Projection Integration with the SVD and GSVD RNA Expression Bases